-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
Collections
Discover the best community collections!
Collections including paper arxiv:2503.12937
-
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27 -
Learning from Failures in Multi-Attempt Reinforcement Learning
Paper • 2503.04808 • Published • 18 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30
-
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263 -
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Paper • 2507.07996 • Published • 34 -
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Paper • 2507.13158 • Published • 23
-
Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs
Paper • 2503.12303 • Published • 7 -
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Paper • 2503.12797 • Published • 32 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
Paper • 2503.13399 • Published • 22
-
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Paper • 2503.12605 • Published • 35 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Paper • 2503.12271 • Published • 9 -
Video-T1: Test-Time Scaling for Video Generation
Paper • 2503.18942 • Published • 90
-
Personalize Anything for Free with Diffusion Transformer
Paper • 2503.12590 • Published • 44 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
Exploring the Vulnerabilities of Federated Learning: A Deep Dive into Gradient Inversion Attacks
Paper • 2503.11514 • Published • 18 -
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
Paper • 2502.19328 • Published • 23
-
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
Paper • 2502.14768 • Published • 47 -
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Paper • 2502.12853 • Published • 29 -
Diverse Inference and Verification for Advanced Reasoning
Paper • 2502.09955 • Published • 18 -
Distillation Scaling Laws
Paper • 2502.08606 • Published • 47
-
EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters
Paper • 2402.04252 • Published • 29 -
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models
Paper • 2402.03749 • Published • 14 -
ScreenAI: A Vision-Language Model for UI and Infographics Understanding
Paper • 2402.04615 • Published • 44 -
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss
Paper • 2402.05008 • Published • 23
-
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 263 -
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Paper • 2507.07996 • Published • 34 -
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Paper • 2507.13158 • Published • 23
-
Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs
Paper • 2503.12303 • Published • 7 -
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Paper • 2503.12797 • Published • 32 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research
Paper • 2503.13399 • Published • 22
-
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey
Paper • 2503.12605 • Published • 35 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
Reflect-DiT: Inference-Time Scaling for Text-to-Image Diffusion Transformers via In-Context Reflection
Paper • 2503.12271 • Published • 9 -
Video-T1: Test-Time Scaling for Video Generation
Paper • 2503.18942 • Published • 90
-
Personalize Anything for Free with Diffusion Transformer
Paper • 2503.12590 • Published • 44 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
Exploring the Vulnerabilities of Federated Learning: A Deep Dive into Gradient Inversion Attacks
Paper • 2503.11514 • Published • 18 -
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems
Paper • 2502.19328 • Published • 23
-
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
Paper • 2503.05592 • Published • 27 -
Learning from Failures in Multi-Attempt Reinforcement Learning
Paper • 2503.04808 • Published • 18 -
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30
-
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
Paper • 2502.14768 • Published • 47 -
S^2R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
Paper • 2502.12853 • Published • 29 -
Diverse Inference and Verification for Advanced Reasoning
Paper • 2502.09955 • Published • 18 -
Distillation Scaling Laws
Paper • 2502.08606 • Published • 47