Bohan Zhai PRO

Borise

AI & ML interests

LLM, Audio, NLP, 3D vision, vision language

Recent Activity

upvoted a paper 8 days ago

VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

Paper • 2603.23481 • Published 11 days ago • 6

updated a dataset 4 months ago

Borise/CaptionQA

Viewer • Updated Dec 13, 2025 • 657 • 2.09k • 5

New activity in Borise/CaptionQA 4 months ago

Update README.md

#3 opened 4 months ago by shijiay

liked a dataset 4 months ago

Borise/CaptionQA

Viewer • Updated Dec 13, 2025 • 657 • 2.09k • 5

commented a paper 4 months ago

CaptionQA: Is Your Caption as Useful as the Image Itself?

Paper • 2511.21025 • Published Nov 26, 2025 • 28

upvoted an article 4 months ago

Article

📌 Rethinking Multimodality from an Industry Perspective: Captioning Is Far More Important Than You Think

Nov 29, 2025

published an article 4 months ago

Article

📌 Rethinking Multimodality from an Industry Perspective: Captioning Is Far More Important Than You Think

Nov 29, 2025

authored 5 papers 4 months ago

HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption

Paper • 2310.01779 • Published Oct 3, 2023 • 4

CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models

Paper • 2311.11567 • Published Nov 20, 2023 • 8

upvoted a paper 4 months ago

CaptionQA: Is Your Caption as Useful as the Image Itself?

Paper • 2511.21025 • Published Nov 26, 2025 • 28

published a dataset 4 months ago

Borise/CaptionQA

Viewer • Updated Dec 13, 2025 • 657 • 2.09k • 5

updated a model 10 months ago

Borise/llava_qwen2_dit_stage2_14B

15B • Updated May 30, 2025 • 2

published a model 10 months ago

Borise/llava_qwen2_dit_stage2_14B

15B • Updated May 30, 2025 • 2

updated a model 10 months ago

Borise/llava_qwen2_dino224_stage2_14B

15B • Updated May 30, 2025 • 1

published a model 10 months ago

Borise/llava_qwen2_dino224_stage2_14B

15B • Updated May 30, 2025 • 1

updated a model 10 months ago

Borise/llava_qwen2_clip336_stage2_14B

15B • Updated May 30, 2025 • 2

published a model 10 months ago

Borise/llava_qwen2_clip336_stage2_14B

15B • Updated May 30, 2025 • 2
