Improving Fake News Detection by Using an Entity-enhanced Framework to Fuse Diverse Multimodal Clues Paper • 2108.10509 • Published Aug 24, 2021
Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models Paper • 2308.13437 • Published Aug 25, 2023
CODIS: Benchmarking Context-Dependent Visual Comprehension for Multimodal Large Language Models Paper • 2402.13607 • Published Feb 21, 2024
Video Generation Models Are Good Latent Reward Models Paper • 2511.21541 • Published Nov 26, 2025