-
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Paper • 2310.16818 • Published • 33 -
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 51 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 58 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 68
Collections
Discover the best community collections!
Collections including paper arxiv:2511.22570
-
meta-llama/Llama-Guard-3-8B
Text Generation • 8B • Updated • 41.3k • • 256 -
Jailbroken: How Does LLM Safety Training Fail?
Paper • 2307.02483 • Published • 14 -
Constitutional AI: Harmlessness from AI Feedback
Paper • 2212.08073 • Published • 4 -
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
Paper • 2312.06674 • Published • 8
-
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 221 -
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
Paper • 2511.23319 • Published • 22 -
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Paper • 2511.22176 • Published • 4 -
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
Paper • 2511.22265 • Published • 1
-
Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
Paper • 2510.20150 • Published • 4 -
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
Paper • 2511.06221 • Published • 132 -
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
Paper • 2508.10433 • Published • 144 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 96
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 5.71k • 1.24k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 358 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
Small Language Models are the Future of Agentic AI
Paper • 2506.02153 • Published • 23 -
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
Paper • 2512.02556 • Published • 244 -
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
Paper • 2511.22570 • Published • 86 -
DeepSeek-OCR: Contexts Optical Compression
Paper • 2510.18234 • Published • 86
-
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Paper • 2511.16334 • Published • 92 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 101 -
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
Paper • 2509.04475 • Published • 3 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 96
-
Language Models Model Language
Paper • 2510.12766 • Published • 25 -
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 538 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 501 -
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 221
-
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Paper • 2310.16818 • Published • 33 -
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
Paper • 2401.02954 • Published • 51 -
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Paper • 2401.06066 • Published • 58 -
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
Paper • 2401.14196 • Published • 68
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 5.71k • 1.24k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 358 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
meta-llama/Llama-Guard-3-8B
Text Generation • 8B • Updated • 41.3k • • 256 -
Jailbroken: How Does LLM Safety Training Fail?
Paper • 2307.02483 • Published • 14 -
Constitutional AI: Harmlessness from AI Feedback
Paper • 2212.08073 • Published • 4 -
Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations
Paper • 2312.06674 • Published • 8
-
Small Language Models are the Future of Agentic AI
Paper • 2506.02153 • Published • 23 -
DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models
Paper • 2512.02556 • Published • 244 -
DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
Paper • 2511.22570 • Published • 86 -
DeepSeek-OCR: Contexts Optical Compression
Paper • 2510.18234 • Published • 86
-
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 221 -
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models
Paper • 2511.23319 • Published • 22 -
Focused Chain-of-Thought: Efficient LLM Reasoning via Structured Input Information
Paper • 2511.22176 • Published • 4 -
FedRE: A Representation Entanglement Framework for Model-Heterogeneous Federated Learning
Paper • 2511.22265 • Published • 1
-
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Paper • 2511.16334 • Published • 92 -
Parallel-R1: Towards Parallel Thinking via Reinforcement Learning
Paper • 2509.07980 • Published • 101 -
ParaThinker: Native Parallel Thinking as a New Paradigm to Scale LLM Test-time Compute
Paper • 2509.04475 • Published • 3 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 96
-
Rank-GRPO: Training LLM-based Conversational Recommender Systems with Reinforcement Learning
Paper • 2510.20150 • Published • 4 -
Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B
Paper • 2511.06221 • Published • 132 -
We-Math 2.0: A Versatile MathBook System for Incentivizing Visual Mathematical Reasoning
Paper • 2508.10433 • Published • 144 -
Stabilizing Reinforcement Learning with LLMs: Formulation and Practices
Paper • 2512.01374 • Published • 96
-
Language Models Model Language
Paper • 2510.12766 • Published • 25 -
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain
Paper • 2509.26507 • Published • 538 -
Less is More: Recursive Reasoning with Tiny Networks
Paper • 2510.04871 • Published • 501 -
Scaling Latent Reasoning via Looped Language Models
Paper • 2510.25741 • Published • 221