SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 8 days ago • 49 • 4
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95 • 7
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28, 2025 • 118 • 4
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14, 2025 • 76 • 5
SageAttention2++: A More Efficient Implementation of SageAttention2 Paper • 2505.21136 • Published May 27, 2025 • 45 • 3
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published May 16, 2025 • 75 • 8
SAGE: A Framework of Precise Retrieval for RAG Paper • 2503.01713 • Published Mar 3, 2025 • 7 • 2
Identifying Sensitive Weights via Post-quantization Integral Paper • 2503.01901 • Published Feb 28, 2025 • 8 • 2