docs: update readme to include GitHub url
Changed KVzap GitHub repo to: https://github.com/NVIDIA/kvpress/tree/main/kvzap
README.md
@@ -13,7 +13,7 @@ track_downloads: true
 
 
 [](https://www.apache.org/licenses/LICENSE-2.0)
-[](https://github.com/NVIDIA/kvpress/kvzap)
+[](https://github.com/NVIDIA/kvpress/tree/main/kvzap)
 [](https://huggingface.co/collections/nvidia/kvzap)
 [](https://arxiv.org/abs/2601.07891)
 