Towards Scalable Pre-training of Visual Tokenizers for Generation • Paper • 2512.13687 • Published 18 days ago • 98
VCU-Bridge: Hierarchical Visual Connotation Understanding via Semantic Bridging • Paper • 2511.18121 • Published Nov 22, 2025 • 1
microsoft/Phi-4-multimodal-instruct • Automatic Speech Recognition • 6B • Updated 23 days ago • 248k • 1.55k