ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation Paper • 2601.21420 • Published 5 days ago • 40
KaVa: Latent Reasoning via Compressed KV-Cache Distillation Paper • 2510.02312 • Published Oct 2, 2025 • 2
Provence: efficient and robust context pruning for retrieval-augmented generation Article • Jan 28, 2025 • 25
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates Paper • 2502.06772 • Published Feb 10, 2025 • 21
FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation Paper • 2502.01068 • Published Feb 3, 2025 • 18
mHuBERT-147 models Collection Compact yet powerful multilingual speech representation models based on the HuBERT architecture. • 3 items • Updated Jun 4, 2024 • 8
LLM Self-Correction with DeCRIM: Decompose, Critique, and Refine for Enhanced Following of Instructions with Multiple Constraints Paper • 2410.06458 • Published Oct 9, 2024 • 8
Multilingual DistilWhisper Collection Multilingual DistilWhisper improves ASR performance in target languages by adding lightweight CLSR modules on top of whisper-small. • 3 items • Updated Mar 18, 2024 • 6
DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts Paper • 2311.01070 • Published Nov 2, 2023 • 3
ZeroBERTo: Leveraging Zero-Shot Text Classification by Topic Modeling Paper • 2201.01337 • Published Jan 4, 2022 • 2