Visual Question Answering
Transformers
Safetensors
English
videollama2_qwen2
text-generation
Audio-visual Question Answering
Audio Question Answering
multimodal large language model
Instructions to use lym0302/VideoLLaMA2.1-7B-AV-QA with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use lym0302/VideoLLaMA2.1-7B-AV-QA with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("visual-question-answering", model="lym0302/VideoLLaMA2.1-7B-AV-QA")

# Load model directly
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("lym0302/VideoLLaMA2.1-7B-AV-QA", dtype="auto")
- Notebooks
- Google Colab
- Kaggle
- Xet hash: 024202e5388a6323e1ecc7d4743e8897a935ef617ad4aa345d3d32a8aa9569d6
- Size of remote file: 31.2 MB
- SHA256: 63f46c2dabfef0675a69e80fcef9461fc7d6063db1eaf6179315ea8be3feeb9b
Xet efficiently stores large files inside Git, intelligently splitting files into unique chunks and accelerating uploads and downloads.
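The SHA256 value listed above can be used to check that a downloaded copy of the file is intact. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of_file` and `matches_expected` are hypothetical, and the local file path is left to the caller because this page does not name the file.

```python
import hashlib

# SHA256 copied from the file details above.
EXPECTED_SHA256 = "63f46c2dabfef0675a69e80fcef9461fc7d6063db1eaf6179315ea8be3feeb9b"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file at `path` hashes to the expected digest."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected(local_path)` on the downloaded file should return `True` if the copy matches the digest shown on this page. A `False` result would suggest an incomplete or corrupted download, or that a different file was hashed.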