Instructions to use ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M",
	filename="Mistral-7B-Instruct-v0.3-Q4_K_M.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M

Use Docker

docker model run hf.co/ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M

LM Studio
Jan
Ollama
How to use ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M with Ollama:
```
ollama run hf.co/ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M
```

Unsloth Studio new

How to use ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M to start chatting

Docker Model Runner
How to use ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M with Docker Model Runner:
```
docker model run hf.co/ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M
```

Lemonade

How to use ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M:Q4_K_M

Run and chat with the model

lemonade run user.Mistral-7B-Instruct-v0.3-Q4_K_M-Q4_K_M

List all available models

lemonade list

Mistral-7B-Instruct-v0.3-Q4_K_M (GGUF)

This repository contains the Mistral-7B-Instruct-v0.3 model in GGUF format with Q4_K_M quantization.

Model Information

Base Model: Mistral-7B-Instruct-v0.3
Original Creator: Mistral AI
License: Apache 2.0
Format: GGUF (Quantized)
Quantization: Q4_K_M (4-bit quantization, medium quality)
Model Size: ~4.1 GB

About This Model

Mistral 7B Instruct v0.3 is a instruction-tuned large language model developed by Mistral AI. This version has been quantized to GGUF format for efficient inference with llama.cpp and compatible frameworks.

Key Features

7 billion parameters
Optimized for instruction-following tasks
Supports extended vocabulary
Apache 2.0 licensed (commercial use allowed)

Usage

This model can be used with:

llama.cpp
Ollama
LM Studio
text-generation-webui
Any GGUF-compatible inference engine

Example with llama.cpp:

./main -m Mistral-7B-Instruct-v0.3-Q4_K_M.gguf -p "Your prompt here" -n 512

Example with Python (llama-cpp-python):

from llama_cpp import Llama

llm = Llama(model_path="Mistral-7B-Instruct-v0.3-Q4_K_M.gguf")
output = llm("Q: What is the capital of France? A:", max_tokens=256)
print(output)

Quantization Details

Q4_K_M quantization provides:

Good balance between size and quality
~4-bit average quantization
Suitable for most use cases
Recommended for systems with limited VRAM

License

This model is licensed under Apache License 2.0.

You are free to:

✅ Use commercially
✅ Modify and distribute
✅ Use privately
✅ Patent use

Attribution Required: You must give appropriate credit to Mistral AI, provide a link to the license, and indicate if changes were made.

See the Apache 2.0 License for full details.

Attribution

Original model developed by Mistral AI.

Original Model: mistralai/Mistral-7B-Instruct-v0.3
GGUF Conversion: Quantized for efficient inference

Citation

@article{mistral7b,
  title={Mistral 7B},
  author={Mistral AI Team},
  year={2023}
}

Disclaimer

This model is provided "as is" without warranty of any kind. See the Apache 2.0 license for details.

Downloads last month: 2

GGUF

Model size

7B params

Architecture

llama

Hardware compatibility

4-bit

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ARAVINDS2022002/Mistral-7B-Instruct-v0.3-Q4_K_M

Base model

mistralai/Mistral-7B-v0.3

Finetuned

mistralai/Mistral-7B-Instruct-v0.3

Quantized

(248)

this model