Creative Writing Datasets Collection High-quality creative writing and storytelling data. • 31 items • Updated 1 day ago • 3
Instruction & Reasoning Collection Datasets for instruction following, code, and reasoning. • 11 items • Updated 2 days ago • 5
SpargeAttention2: Trainable Sparse Attention via Hybrid Top-k+Top-p Masking and Distillation Fine-Tuning Paper • 2602.13515 • Published 9 days ago • 39
jina-embeddings-v5-text Collection Our 5th-gen embeddings: two lightweight multilingual models with SOTA performance in retrieval, matching, clustering, and classification. • 23 items • Updated 4 days ago • 28
DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories Paper • 2602.10809 • Published 11 days ago • 51
SQuTR: A Robustness Benchmark for Spoken Query to Text Retrieval under Acoustic Noise Paper • 2602.12783 • Published 10 days ago • 140
The Devil Behind Moltbook: Anthropic Safety is Always Vanishing in Self-Evolving AI Societies Paper • 2602.09877 • Published 12 days ago • 190
Article OpenEnv in Practice: Evaluating Tool-Using Agents in Real-World Environments +3 11 days ago • 27
Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters Paper • 2602.10604 • Published 12 days ago • 182
OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 18 days ago • 321
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published 14 days ago • 261
Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 19 days ago • 75