Instructions for using Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")
model = AutoModelForCausalLM.from_pretrained("Alibaba-NLP/Tongyi-DeepResearch-30B-A3B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
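
The slice in the final line drops the prompt: for a decoder-only model, `generate` returns the prompt tokens followed by the newly generated ones, so decoding the full sequence would echo the question back. A minimal sketch of that slicing with plain lists, using made-up token ids (the helper name `strip_prompt` is hypothetical and not part of Transformers):

```python
def strip_prompt(output_ids, prompt_length):
    """Return only the tokens that were generated after the prompt."""
    return output_ids[prompt_length:]


# Made-up ids standing in for a tokenized prompt and a model continuation.
prompt_ids = [101, 7592, 2088]
output_ids = prompt_ids + [2023, 2003, 102]  # prompt followed by new tokens

new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)
```

In the snippet above, `inputs["input_ids"].shape[-1]` plays the role of `len(prompt_ids)`.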

Local Apps

vLLM

How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
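
The same request can be sent from Python. The sketch below uses only the standard library and assumes the server started above is listening on `localhost:8000`; the helper names `build_chat_request` and `send_chat_request` are illustrative, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B"


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request, timeout=60):
    """Send the request and return the decoded JSON body (needs a running server)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Assumes the vLLM server from the commands above is reachable at this address.
request = build_chat_request(
    "http://localhost:8000", MODEL_ID, "What is the capital of France?"
)
# Uncomment once the server is up:
# reply = send_chat_request(request)
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.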

Use Docker

docker model run hf.co/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B

SGLang

How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
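
The curl calls above return a JSON body. Assuming the server follows the OpenAI chat completions response schema, the assistant's text sits under `choices[0].message.content`. A minimal sketch of pulling it out follows; the helper name `extract_reply` is hypothetical, and `sample_response` is a hand-written illustration of the shape, not real model output.

```python
def extract_reply(response):
    """Return the assistant text from an OpenAI-style chat completion body."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written example of the response shape; not actual output from the model.
sample_response = {
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ],
}

print(extract_reply(sample_response))
```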

Docker Model Runner
How to use Alibaba-NLP/Tongyi-DeepResearch-30B-A3B with Docker Model Runner:
```
docker model run hf.co/Alibaba-NLP/Tongyi-DeepResearch-30B-A3B
```

Improve model card: Associate with WebWeaver paper and add Quick Start

by nielsr HF Staff - opened Sep 18, 2025

base: refs/heads/main ← from: refs/pr/5

+135 -7

initial commit (1b24d8d2)

Upload folder using huggingface_hub (f991b90a)

Update README.md (a92f778c)

Delete mergekit_config.yml (b958b177)

Update README.md (991ea16d)

Upload vocab.json (e65d87ae)

Update README.md (cbb31948)

Update README.md (70fd758d)

Adding `transformers` as the library tag for better visibility. (#1) (99d23cdd)

nielsr

Sep 18, 2025

This PR updates the model card for the Tongyi-DeepResearch-30B-A3B model to reflect its association with the paper "WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research".

Specifically, it:

- Links directly to the associated WebWeaver paper.
- Adds explicit links to the project blog and GitHub repository for easy access.
- Incorporates a detailed "Quick Start" section with environment setup, installation, and inference instructions, directly sourced from the GitHub repository.
- Updates the "Model Download" section with a clear table of download links.
- Includes relevant benchmark results and a list of related papers from the Deep Research Agent Family.
- Adds comprehensive "News", "Misc", "Talent Recruitment", and "Contact Information" sections from the GitHub README.
- Adds a BibTeX citation for the WebWeaver paper alongside the existing model citation.

These changes enhance the model card's completeness, usability, and discoverability.

Improve model card: Associate with WebWeaver paper and add Quick Start (b7380a5b)

Cannot merge

This branch has merge conflicts in the following files:

README.md
