Collections

Collections including paper arxiv:2502.08606

Self-Rewarding Language Models

Paper • 2401.10020 • Published Jan 18, 2024 • 155
Orion-14B: Open-source Multilingual Large Language Models

Paper • 2401.12246 • Published Jan 20, 2024 • 14
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 59
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 47

Slamming: Training a Speech Language Model on One GPU in a Day

Paper • 2502.15814 • Published Feb 19, 2025 • 69
Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17, 2025 • 40
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading

Paper • 2502.12574 • Published Feb 18, 2025 • 13
Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14, 2025 • 128

Redundancy Principles for MLLMs Benchmarks

Paper • 2501.13953 • Published Jan 20, 2025 • 29
Autonomy-of-Experts Models

Paper • 2501.13074 • Published Jan 22, 2025 • 44
Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47
Large Language Diffusion Models

Paper • 2502.09992 • Published Feb 14, 2025 • 128

Evolving Deeper LLM Thinking

Paper • 2501.09891 • Published Jan 17, 2025 • 115
PaSa: An LLM Agent for Comprehensive Academic Paper Search

Paper • 2501.10120 • Published Jan 17, 2025 • 55
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

Paper • 2501.09775 • Published Jan 16, 2025 • 32
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

Paper • 2501.10132 • Published Jan 17, 2025 • 22

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published Feb 13, 2025 • 37
Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU

Paper • 2502.08910 • Published Feb 13, 2025 • 150

Knowledge Distillation

Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47
MiniPLM: Knowledge Distillation for Pre-Training Language Models

Paper • 2410.17215 • Published Oct 22, 2024 • 16

Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning

Paper • 2502.14768 • Published Feb 20, 2025 • 47
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning

Paper • 2502.12853 • Published Feb 18, 2025 • 29
Diverse Inference and Verification for Advanced Reasoning

Paper • 2502.09955 • Published Feb 14, 2025 • 18
Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47

Papers Storm 🌪️

A curated collection of research papers referenced in the Panoram'IA program, offering a comprehensive resource for further exploration.

GenEx: Generating an Explorable World

Paper • 2412.09624 • Published Dec 12, 2024 • 98
Generative World Explorer

Paper • 2411.11844 • Published Nov 18, 2024 • 77
Video Depth without Video Models

Paper • 2411.19189 • Published Nov 28, 2024 • 39
Mobile Video Diffusion

Paper • 2412.07583 • Published Dec 10, 2024 • 20

Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47

Theory, Conceptualization, Paradigms

Distillation Scaling Laws

Paper • 2502.08606 • Published Feb 12, 2025 • 47
I-Con: A Unifying Framework for Representation Learning

Paper • 2504.16929 • Published Apr 23, 2025 • 31
Chain-of-Model Learning for Language Model

Paper • 2505.11820 • Published May 17, 2025 • 121
Nested Learning: The Illusion of Deep Learning Architectures

Paper • 2512.24695 • Published Dec 31, 2025 • 46
