Text Generation
Transformers
Safetensors
Chinese
English
gemma3_text
Taiwan
ROC
zhtw
chat
reasoning
chain-of-thought
Gemma-3
gemma3
SLM
conversational
text-generation-inference
Instructions to use lianghsun/gemma-3-tw-270m-thinking with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use lianghsun/gemma-3-tw-270m-thinking with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="lianghsun/gemma-3-tw-270m-thinking")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("lianghsun/gemma-3-tw-270m-thinking")
model = AutoModelForCausalLM.from_pretrained("lianghsun/gemma-3-tw-270m-thinking")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use lianghsun/gemma-3-tw-270m-thinking with vLLM:
Install from pip and serve model
```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "lianghsun/gemma-3-tw-270m-thinking"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "lianghsun/gemma-3-tw-270m-thinking",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
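Because the vLLM server above exposes an OpenAI-compatible API, it can also be called from Python rather than curl. Below is a minimal sketch, assuming the `openai` client package is installed and the server from the previous step is running on localhost:8000; pointing `base_url` at port 30000 works the same way for the SGLang server shown further down.

```python
# pip install openai
from openai import OpenAI

# A local vLLM server does not check the API key, but the client requires one.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="lianghsun/gemma-3-tw-270m-thinking",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```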
Use Docker

```shell
docker model run hf.co/lianghsun/gemma-3-tw-270m-thinking
```
- SGLang
How to use lianghsun/gemma-3-tw-270m-thinking with SGLang:
Install from pip and serve model
```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "lianghsun/gemma-3-tw-270m-thinking" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "lianghsun/gemma-3-tw-270m-thinking",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```

Use Docker images

```shell
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "lianghsun/gemma-3-tw-270m-thinking" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "lianghsun/gemma-3-tw-270m-thinking",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```

- Docker Model Runner
How to use lianghsun/gemma-3-tw-270m-thinking with Docker Model Runner:
```shell
docker model run hf.co/lianghsun/gemma-3-tw-270m-thinking
```
Model Card for gemma-3-tw-270m-thinking
gemma-3-tw-270m-thinking is the thinking (reasoning) version of gemma-3-tw-270m-it: on top of the instruction-tuned model, it adds training data containing <think>...</think> reasoning segments, so the model writes out its thought process before answering. This improves stability on multi-step reasoning and conditional-judgment tasks while keeping the small 270M footprint.

⚠️ Spec highlights: this is a 270M-parameter, text-only (single-modality) SLM. The first part of each response is a <think>...</think> reasoning segment, followed by the final answer.
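Because the model front-loads its reply with a <think>...</think> segment, downstream code will usually want to separate the reasoning from the final answer. The following is a minimal sketch of one way to do that with Transformers, assuming the generation contains a closing </think> tag as described above (the Chinese prompt and token budget are only illustrative):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "lianghsun/gemma-3-tw-270m-thinking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [{"role": "user", "content": "台灣最高的山是哪一座?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt",
).to(model.device)

# Leave enough room for both the <think> segment and the final answer.
outputs = model.generate(**inputs, max_new_tokens=512)
text = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)

# Split on the closing tag; fall back to the raw text if the tag is missing.
if "</think>" in text:
    thinking, answer = text.split("</think>", 1)
    thinking = thinking.replace("<think>", "").strip()
else:
    thinking, answer = "", text

print("Reasoning:", thinking)
print("Answer:", answer.strip())
```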
Model Details
Small language models often struggle with multi-step reasoning. This model writes the "think first, then answer" format directly into its SFT training data, so that even a 270M-scale model can mimic the response structure of large reasoning models. Reasoning depth is still bounded by model capacity, but for everyday conditional judgments, simple rule-based reasoning, and structured output it already provides a noticeable improvement.
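As an illustration of that format (a made-up example, not an excerpt from the actual training set), a single SFT example in chat-message form might look like this, with the assistant turn opening with its reasoning inside <think> tags:

```python
# Hypothetical SFT record (illustrative only, not from the real dataset).
example = {
    "messages": [
        {"role": "user", "content": "小明有 3 顆蘋果,又買了 5 顆,吃掉 2 顆,還剩幾顆?"},
        {
            "role": "assistant",
            "content": (
                "<think>一開始 3 顆,買了 5 顆後共 3 + 5 = 8 顆,"
                "吃掉 2 顆後剩 8 - 2 = 6 顆。</think>"
                "小明還剩 6 顆蘋果。"
            ),
        },
    ]
}
```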
Key Features

- 270M-scale reasoning: keeps the small-model footprint while the <think> training format yields interpretable reasoning steps.
- Deployable on-device: suited to scenarios that need visible reasoning steps but cannot send data to the cloud.
- Fine-tunable downstream: a base for small reasoning chatbots, teaching assistants, and similar applications (a fine-tuning sketch follows this list).
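For the downstream fine-tuning use case in the last bullet, one possible starting point is TRL's SFTTrainer. This is only a minimal sketch under assumptions: the JSONL file name, output directory, and default hyperparameters are placeholders, and the data is expected to follow the chat-message format illustrated above.

```python
# pip install trl datasets
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file of {"messages": [...]} records whose assistant
# turns begin with a <think>...</think> segment.
dataset = load_dataset("json", data_files="tw_thinking_sft.jsonl", split="train")

trainer = SFTTrainer(
    model="lianghsun/gemma-3-tw-270m-thinking",
    train_dataset=dataset,
    args=SFTConfig(output_dir="gemma-3-tw-270m-thinking-ft"),
)
trainer.train()
```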
Model Description
- Developed by: Liang Hsun Huang
- Funded by: APMIC
- Base model: lianghsun/gemma-3-tw-270m-it
- Model type: Gemma3ForCausalLM (Transformers)
- Language(s) (NLP): Traditional Chinese, English
- License: gemma (Google usage license)
- Finetuned from model: lianghsun/gemma-3-tw-270m-it
Model Sources
- Repository: lianghsun/gemma-3-tw-270m-thinking
Citation
```bibtex
@misc{gemma_3_tw_270m_thinking,
  title        = {gemma-3-tw-270m-thinking: A Reasoning-style Lightweight Traditional Chinese Model for Taiwan},
  author       = {Huang, Liang Hsun},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/lianghsun/gemma-3-tw-270m-thinking}}
}
```
Acknowledgements

- Special thanks to APMIC for providing compute support.
Model Card Authors
Model Card Contact