Instructions for using IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2 with libraries and local apps.

Libraries

How to use IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2")
model = AutoModelForCausalLM.from_pretrained("IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2")

Local Apps

vLLM

How to use IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
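The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper name `build_completion_request` is hypothetical, and the response layout in the commented-out lines assumes the server follows the OpenAI completions schema, as the comment in the original snippet suggests.

```python
import json
import urllib.request

MODEL_ID = "IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2"

def build_completion_request(base_url, model, prompt,
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for /v1/completions."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_completion_request("http://localhost:8000", MODEL_ID,
                               "Once upon a time,")

# Sending requires a running server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["text"])
```

For the SGLang examples, the same helper would be called with `http://localhost:30000` as the base URL.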

Use Docker

docker model run hf.co/IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2

SGLang

How to use IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2 with Docker Model Runner:
```
docker model run hf.co/IHaBiS/Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2
```


exl2 (ExLlamaV2) quantization of Undi95/Mistral-11B-TestBench3
Calibration dataset: wikitext
Command: python convert.py -i models/Undi95_Mistral-11B-TestBench3 -o Undi95_Mistral-11B-TestBench3-temp -cf Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2 -c 0000.parquet -l 4096 -b 6 -hb 8 -ss 4096
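To make the flags in the command above easier to read, here is a minimal sketch that rebuilds the same argument list in Python. The helper name `build_convert_command` is hypothetical, and the flag descriptions in the comments are my reading of ExLlamaV2's `convert.py`; they should be checked against the version you have installed.

```python
import shlex

def build_convert_command(model_dir, work_dir, out_dir, cal_parquet,
                          length=4096, bits=6, head_bits=8, shard_size=4096):
    """Rebuild the convert.py invocation from the card as an argument list."""
    return [
        "python", "convert.py",
        "-i", model_dir,          # input: unquantized model directory
        "-o", work_dir,           # working directory for temporary files
        "-cf", out_dir,           # directory for the finished quantized model
        "-c", cal_parquet,        # calibration data (parquet file)
        "-l", str(length),        # calibration row length in tokens (assumed)
        "-b", str(bits),          # target average bits per weight
        "-hb", str(head_bits),    # bits for the output (head) layer
        "-ss", str(shard_size),   # maximum shard size (assumed to be in MB)
    ]

cmd = build_convert_command(
    "models/Undi95_Mistral-11B-TestBench3",
    "Undi95_Mistral-11B-TestBench3-temp",
    "Undi95_Mistral-11B-TestBench3-6.0bpw-h8-exl2",
    "0000.parquet",
)
print(shlex.join(cmd))
```

Read this way, the `6.0bpw-h8` suffix in the repository name appears to correspond to `-b 6` and `-hb 8`.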

The original model card follows.

slices:
  - sources:
    - model: Norquinal/Mistral-7B-claude-chat
      layer_range: [0, 24]
  - sources:
    - model: Open-Orca/Mistral-7B-OpenOrca
      layer_range: [8, 32]
merge_method: passthrough
dtype: float16
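The passthrough configuration above stacks two layer slices end to end. As a rough sketch of the arithmetic, assuming `layer_range` is half-open (the end index is excluded) and using a hypothetical helper `stacked_layer_count`:

```python
def stacked_layer_count(slices):
    """Count layers produced by concatenating half-open [start, end) ranges."""
    return sum(end - start for _, (start, end) in slices)

# Slices copied from the passthrough config in the card.
slices = [
    ("Norquinal/Mistral-7B-claude-chat", (0, 24)),
    ("Open-Orca/Mistral-7B-OpenOrca", (8, 32)),
]

total = stacked_layer_count(slices)
print(total)  # 48
```

Under that assumption each slice contributes 24 layers, for 48 in total, which is consistent with the `layer_range: [0, 48]` used in the slerp configuration further down.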

========================================================

slices:
  - sources:
      - model: Undi95/Mistral-11B-CC-Air
        layer_range: [0, 48]
      - model: "/content/drive/MyDrive/Mistral-11B-ClaudeOrca"
        layer_range: [0, 48]
merge_method: slerp
base_model: Undi95/Mistral-11B-CC-Air
parameters:
  t:
    - value: 0.5 # fallback for rest of tensors
dtype: float16
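The slerp configuration above blends the two 48-layer models with an interpolation factor `t` of 0.5. The sketch below illustrates spherical linear interpolation on plain Python lists; it is an illustration of the general formula for nonzero vectors, not the merge tool's actual implementation, and the function name `slerp` and the `eps` threshold are choices made for this example.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two nonzero, equal-length vectors."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped against rounding error.
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    if math.sin(theta) < eps:
        # Nearly (anti)parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# With t = 0.5 the result lies halfway along the arc between the inputs.
mid = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(mid)
```

With `t = 0` the function returns the first vector and with `t = 1` the second, so 0.5 weights both inputs equally.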

hf-causal-experimental (pretrained=/content/drive/MyDrive/Mistral-11B-Test), limit: None, provide_description: False, num_fewshot: 0, batch_size: 4

|    Task     |Version| Metric |Value |   |Stderr|
|-------------|------:|--------|-----:|---|-----:|
|arc_challenge|      0|acc     |0.5401|±  |0.0146|
|             |       |acc_norm|0.5589|±  |0.0145|
|arc_easy     |      0|acc     |0.8199|±  |0.0079|
|             |       |acc_norm|0.8127|±  |0.0080|
|hellaswag    |      0|acc     |0.6361|±  |0.0048|
|             |       |acc_norm|0.8202|±  |0.0038|
|piqa         |      0|acc     |0.8079|±  |0.0092|
|             |       |acc_norm|0.8199|±  |0.0090|
|truthfulqa_mc|      1|mc1     |0.3733|±  |0.0169|
|             |       |mc2     |0.5374|±  |0.0156|
|winogrande   |      0|acc     |0.7261|±  |0.0125|
