Instructions to use ValiantLabs/gpt-oss-20b-ShiningValiant3 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ValiantLabs/gpt-oss-20b-ShiningValiant3 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ValiantLabs/gpt-oss-20b-ShiningValiant3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ValiantLabs/gpt-oss-20b-ShiningValiant3")
model = AutoModelForCausalLM.from_pretrained("ValiantLabs/gpt-oss-20b-ShiningValiant3")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ValiantLabs/gpt-oss-20b-ShiningValiant3 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "ValiantLabs/gpt-oss-20b-ShiningValiant3"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ValiantLabs/gpt-oss-20b-ShiningValiant3",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/ValiantLabs/gpt-oss-20b-ShiningValiant3
- SGLang
How to use ValiantLabs/gpt-oss-20b-ShiningValiant3 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ValiantLabs/gpt-oss-20b-ShiningValiant3" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ValiantLabs/gpt-oss-20b-ShiningValiant3",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ValiantLabs/gpt-oss-20b-ShiningValiant3" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "ValiantLabs/gpt-oss-20b-ShiningValiant3",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Docker Model Runner
How to use ValiantLabs/gpt-oss-20b-ShiningValiant3 with Docker Model Runner:
docker model run hf.co/ValiantLabs/gpt-oss-20b-ShiningValiant3
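The curl calls shown for vLLM and SGLang can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes a server is already running at the given base URL and that it returns the usual OpenAI-style chat completion body. The helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of either project.

```python
import json
import urllib.request

MODEL_ID = "ValiantLabs/gpt-oss-20b-ShiningValiant3"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for an OpenAI-compatible chat completions endpoint.

    Hypothetical helper: base_url defaults to the vLLM port used above;
    pass "http://localhost:30000" for the SGLang examples.
    """
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=600):
    """Send the request and return the assistant text.

    Assumes the response follows the OpenAI chat completions shape
    (choices[0].message.content). Requires a running server.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With a server running, `send_chat_request(build_chat_request("What is the capital of France?"))` would return the reply text; building the request alone does not touch the network.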
Support our open-source dataset and model releases!
Shining Valiant 3: Qwen3-1.7B, Qwen3-4B, Qwen3-8B, Ministral-3-14B-Reasoning-2512, gpt-oss-20b
Shining Valiant 3 is a science, AI design, and general reasoning specialist built on gpt-oss-20b.
- Finetuned on our newest science reasoning data generated with DeepSeek R1 0528!
- AI to build AI: our high-difficulty AI reasoning data makes Shining Valiant 3 your friend for building with current AI tech and discovering new innovations and improvements!
- Improved general and creative reasoning to supplement problem-solving and general chat performance.
- Small model sizes allow running on local desktop and mobile, plus super-fast server inference!
Prompting Guide
Shining Valiant 3 uses the gpt-oss-20b prompt format.
Shining Valiant 3 is a reasoning finetune; a reasoning level of high is generally recommended.
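One way to request a reasoning level is sketched below. The upstream gpt-oss model card describes three levels (low, medium, high) selected through the system prompt, for example "Reasoning: high"; that this finetune's chat template treats the system message the same way is an assumption, and `build_messages` is a hypothetical helper name.

```python
ALLOWED_LEVELS = ("low", "medium", "high")


def build_messages(prompt, reasoning_level="high"):
    """Return a chat message list that asks for a given reasoning level.

    Assumption: the level is picked up from a system message of the form
    "Reasoning: <level>", as described for the base gpt-oss models.
    """
    if reasoning_level not in ALLOWED_LEVELS:
        raise ValueError(f"reasoning_level must be one of {ALLOWED_LEVELS}")
    return [
        {"role": "system", "content": f"Reasoning: {reasoning_level}"},
        {"role": "user", "content": prompt},
    ]
```

The returned list can be passed wherever a `messages` list is expected, such as a Transformers text-generation pipeline or the `messages` field of an OpenAI-compatible request.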
NOTE: This release of Shining Valiant 3 currently uses bf16 for all parameters. Consider quantized models if you're not looking to use bf16.
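To get a rough sense of what bf16 weights imply for memory, the arithmetic can be sketched as parameters times bytes per parameter. The figure of 20 billion parameters is only the nominal size taken from the model name, not an exact count stated in this card, and the estimate covers weights alone (no activations, KV cache, or framework overhead).

```python
def estimate_weight_gib(num_parameters, bytes_per_parameter):
    """Rough weight-only memory estimate in GiB (1 GiB = 2**30 bytes)."""
    return num_parameters * bytes_per_parameter / 2**30


# Assumption: nominal parameter count read off the "20b" in the model name.
NOMINAL_PARAMETERS = 20e9

# bf16 stores each parameter in 2 bytes.
bf16_gib = estimate_weight_gib(NOMINAL_PARAMETERS, 2)

# Hypothetical 4-bit quantization at 0.5 bytes per parameter; real quantized
# formats add overhead for scales and any layers left unquantized.
four_bit_gib = estimate_weight_gib(NOMINAL_PARAMETERS, 0.5)

print(f"bf16 weights:  ~{bf16_gib:.1f} GiB")
print(f"4-bit weights: ~{four_bit_gib:.1f} GiB")
```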
Example inference script, based on the one provided with gpt-oss-20b, to get started:
from transformers import pipeline
import torch

model_id = "ValiantLabs/gpt-oss-20b-ShiningValiant3"

pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Reversible Cellular Automata (RCAs) are CAs that have an inverse rule, allowing the simulation to run backward in time. Explain the theoretical significance of RCAs in the context of modeling physical laws that are time-symmetric. Describe the additional constraints that must be placed on a rule set to ensure it is reversible and discuss the challenges in constructing non-trivial reversible rules."},
]

outputs = pipe(
    messages,
    max_new_tokens=12000,
)
print(outputs[0]["generated_text"][-1])
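The final `print` in the script shows the last chat message as a whole dictionary. If only the reply text is wanted, a small helper along these lines may be convenient. It assumes the pipeline was called with a list of chat messages, so that `generated_text` holds the conversation with the new assistant message appended; `last_message_text` is a hypothetical name.

```python
def last_message_text(outputs):
    """Return the text of the last message from a chat-style pipeline result.

    Assumes outputs[0]["generated_text"] is a list of
    {"role": ..., "content": ...} dictionaries.
    """
    last_message = outputs[0]["generated_text"][-1]
    return last_message["content"]
```

In the script above this would be used as `print(last_message_text(outputs))`.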
Shining Valiant 3 is created by Valiant Labs.
Check out our HuggingFace page to see all of our models!
We care about open source. For everyone to use.

