SLA2: Sparse-Linear Attention with Learnable Routing and QAT Paper • 2602.12675 • Published 8 days ago • 49 • 4
TurboDiffusion: Accelerating Video Diffusion Models by 100-200 Times Paper • 2512.16093 • Published Dec 18, 2025 • 95 • 7
SLA: Beyond Sparsity in Diffusion Transformers via Fine-Tunable Sparse-Linear Attention Paper • 2509.24006 • Published Sep 28, 2025 • 118 • 4
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14, 2025 • 76 • 5
SageAttention2++: A More Efficient Implementation of SageAttention2 Paper • 2505.21136 • Published May 27, 2025 • 45 • 3
SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training Paper • 2505.11594 • Published May 16, 2025 • 75 • 8
SAGE: A Framework of Precise Retrieval for RAG Paper • 2503.01713 • Published Mar 3, 2025 • 7 • 2
Identifying Sensitive Weights via Post-quantization Integral Paper • 2503.01901 • Published Feb 28, 2025 • 8 • 2