LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment Paper • 2310.01852 • Published Oct 3, 2023 • 2
VisNumBench: Evaluating Number Sense of Multimodal Large Language Models Paper • 2503.14939 • Published Mar 19, 2025 • 5
ReEx-SQL: Reasoning with Execution-Aware Reinforcement Learning for Text-to-SQL Paper • 2505.12768 • Published May 19, 2025 • 5
Evaluating Clinical Competencies of Large Language Models with a General Practice Benchmark Paper • 2503.17599 • Published Mar 22, 2025
FaVChat: Hierarchical Prompt-Query Guided Facial Video Understanding with Data-Efficient GRPO Paper • 2503.09158 • Published Mar 12, 2025 • 1
Enhancing Geometric Perception in VLMs via Translator-Guided Reinforcement Learning Paper • 2602.22703 • Published 28 days ago
Cognitive Mismatch in Multimodal Large Language Models for Discrete Symbol Understanding Paper • 2603.18472 • Published 7 days ago • 19
MIGA: Mutual Information-Guided Attack on Denoising Models for Semantic Manipulation Paper • 2503.06966 • Published Mar 10, 2025
Youtu-LLM: Unlocking the Native Agentic Potential for Lightweight Large Language Models Paper • 2512.24618 • Published Dec 31, 2025 • 152
LlamaSeg: Image Segmentation via Autoregressive Mask Generation Paper • 2505.19422 • Published May 26, 2025 • 3
Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training Paper • 2405.15319 • Published May 24, 2024 • 28