---
license: apple-amlr
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
tags:
- rag
- compression
- retrieval
- instruction-tuned
- generation
library_name: transformers
---

# CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning

<div align="center">
<img src="clara_logo.jpg" width="300"/>
</div>

<div align="center">
<a href="https://arxiv.org/abs/2511.18659"><img src="https://img.shields.io/badge/arXiv-2511.18659-b31b1b.svg" alt="arXiv"></a>
<a href="https://arxiv.org/abs/2511.18659"><img src="https://img.shields.io/badge/Paper-PDF-red.svg" alt="Paper"></a>
<a href="https://github.com/apple/ml-clara"><img src="https://img.shields.io/badge/GitHub-Code-blue.svg" alt="GitHub"></a>
</div>


# CLaRa-7B-Instruct (Compression-16 & 128)

The **CLaRa-7B-Instruct** model is our instruction-tuned unified RAG model with built-in semantic document compression (16× and 128×).
It supports instruction-following QA directly from compressed document representations.

**Training recipe:** Instruction tuning on QA-style tasks built on top of the base semantic compression model.

**Benchmarks:** Strong instruction-following performance under 16× compression.
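
As a rough guide to what the 16× and 128× rates mean for input size, the sketch below assumes the rate is a simple ratio between passage tokens and compressed latent vectors. That ratio, the helper name `compressed_length`, and the 256-token passage are illustrative assumptions; the actual number of latent vectors is set by the checkpoint's configuration.

```python
import math


def compressed_length(num_tokens: int, rate: int) -> int:
    """Estimate how many latent vectors represent a passage.

    Assumes the compression rate is a plain token-to-latent ratio,
    rounded up so that a short passage still gets at least one vector.
    """
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    if rate <= 0:
        raise ValueError("rate must be positive")
    return math.ceil(num_tokens / rate)


# Hypothetical 256-token passage at the two rates named in this card.
for rate in (16, 128):
    print(f"{rate}x -> {compressed_length(256, rate)} latent vectors")
# 16x -> 16 latent vectors
# 128x -> 2 latent vectors
```

Under this assumption, moving from 16× to 128× shrinks the per-passage representation by a further factor of eight, which is the trade-off to weigh against answer quality.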

---

## More details and usage examples

Paper: [CLaRa: Bridging Retrieval and Generation with Continuous Latent Reasoning](https://arxiv.org/abs/2511.18659)

GitHub: https://github.com/apple/ml-clara

Video (from @Fahd Mirza): https://youtu.be/al2VoAKn8GU?si=Q8bq7QNMaTvcArwa

---

## Example Usage (Instruction-Tuned Inference)
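
Before calling the model, it can help to check that the inputs have the shape used in this section's example: a list of question strings and a parallel list holding one inner list of passage strings per question. The helper below is a minimal sketch; `check_rag_inputs` is a hypothetical name and not part of CLaRa, and the one-list-per-question convention is inferred from the example rather than confirmed by the library.

```python
from typing import Sequence


def check_rag_inputs(questions: Sequence[str], documents: Sequence[Sequence[str]]) -> None:
    """Raise if questions/documents do not match the nested layout assumed here."""
    if len(questions) != len(documents):
        raise ValueError(
            f"expected one document list per question, got "
            f"{len(questions)} questions and {len(documents)} document lists"
        )
    for i, (question, passages) in enumerate(zip(questions, documents)):
        if not isinstance(question, str) or not question.strip():
            raise ValueError(f"question {i} must be a non-empty string")
        # A bare string would otherwise be iterated character by character.
        if isinstance(passages, str):
            raise TypeError(f"documents[{i}] must be a list of strings, not a string")
        if len(passages) == 0:
            raise ValueError(f"documents[{i}] is empty")
        if not all(isinstance(p, str) for p in passages):
            raise TypeError(f"documents[{i}] must contain only strings")


# Passes silently: one question paired with one list of passages.
check_rag_inputs(
    ["Which genus of plant grows originally in Mexico and Guatemala, Phylica or Weldenia?"],
    [["Weldenia is a monotypic genus of flowering plant in the family Commelinaceae..."]],
)
```

Running a check like this on the CPU first avoids loading a 7B checkpoint only to fail on a malformed batch.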

```python
from transformers import AutoModel

# Local checkpoint directory; point this at wherever the compression-16 weights are stored.
# trust_remote_code=True is required because the model class ships with the checkpoint.
unirag = AutoModel.from_pretrained(
    "/mnt/ceph_rbd/model/CLaRa-7B-Instruct/compression-16",
    trust_remote_code=True
).to("cuda")

# In this example: one inner list of passages for the single question below.
documents = [
    [
        "Weldenia is a monotypic genus of flowering plant in the family Commelinaceae...",
        "Hagsatera is a genus of flowering plants from the orchid family...",
        "Alsobia is a genus of flowering plants in the family Gesneriaceae..."
    ]
]

questions = [
    "Which genus of plant grows originally in Mexico and Guatemala, Phylica or Weldenia?"
]

# Instruction-tuned usage
out = unirag.generate_from_text(
    questions=questions,
    documents=documents,
    max_new_tokens=64
)

print("Generated answer:", out)