Instructions to use MIT/ast-finetuned-audioset-16-16-0.442 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use MIT/ast-finetuned-audioset-16-16-0.442 with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("audio-classification", model="MIT/ast-finetuned-audioset-16-16-0.442")

# Load model directly
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

extractor = AutoFeatureExtractor.from_pretrained("MIT/ast-finetuned-audioset-16-16-0.442")
model = AutoModelForAudioClassification.from_pretrained("MIT/ast-finetuned-audioset-16-16-0.442")
```
- Notebooks
- Google Colab
- Kaggle
Audio Spectrogram Transformer (fine-tuned on AudioSet)
Audio Spectrogram Transformer (AST) model fine-tuned on AudioSet. It was introduced in the paper AST: Audio Spectrogram Transformer by Gong et al. and first released in this repository.
Disclaimer: The team releasing Audio Spectrogram Transformer did not write a model card for this model, so this model card has been written by the Hugging Face team.
Model description
The Audio Spectrogram Transformer is equivalent to ViT, but applied to audio. Audio is first turned into an image (a spectrogram), after which a Vision Transformer is applied. The paper reports state-of-the-art results on several audio classification benchmarks.
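To make the "spectrogram as image" idea concrete, here is a minimal sketch of how a spectrogram could be cut into ViT-style patches. It is not the model's actual code. The input shape (1024 frames by 128 mel bins) and the 16x16 patches with stride 16 are assumptions, the latter read from the "16-16" in the checkpoint name. `patchify` is a hypothetical helper, and a random array stands in for a real spectrogram.

```python
import numpy as np


def patchify(spec, patch=16, stride=16):
    """Cut a 2-D (time, frequency) array into flattened square patches."""
    n_t = (spec.shape[0] - patch) // stride + 1  # patch positions along time
    n_f = (spec.shape[1] - patch) // stride + 1  # patch positions along frequency
    patches = [
        spec[t * stride : t * stride + patch, f * stride : f * stride + patch].reshape(-1)
        for t in range(n_t)
        for f in range(n_f)
    ]
    return np.stack(patches)


# Placeholder spectrogram (assumed shape: 1024 frames x 128 mel bins).
spec = np.random.default_rng(0).standard_normal((1024, 128))
tokens = patchify(spec)
print(tokens.shape)  # (512, 256): 64 x 8 patches, each 16 * 16 values
```

In a ViT-style model, each flattened patch would then go through a learned projection and receive a position embedding before entering the Transformer. Those steps are omitted here.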
Usage
You can use the raw model to classify audio into the AudioSet classes. See the documentation for more info.
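As a rough sketch of what classification with the raw model might look like, the code below runs a waveform through the feature extractor and model, then ranks the labels. It assumes a mono 1-D float waveform sampled at 16 kHz. Scoring each class with a sigmoid is also an assumption, made because AudioSet clips can carry several labels. `top_k` and `classify` are hypothetical helper names, not part of Transformers.

```python
import math

MODEL_ID = "MIT/ast-finetuned-audioset-16-16-0.442"


def top_k(logits, id2label, k=5):
    """Return the k highest-scoring (label, score) pairs from raw logits."""
    # Sigmoid per class: an assumption, treating the labels as independent.
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(id2label[i], scores[i]) for i in ranked[:k]]


def classify(waveform, sampling_rate=16000, k=5):
    """Classify a mono waveform (1-D float array, assumed to be 16 kHz)."""
    # Heavy imports are deferred so that top_k stays usable on its own.
    import torch
    from transformers import AutoFeatureExtractor, AutoModelForAudioClassification

    extractor = AutoFeatureExtractor.from_pretrained(MODEL_ID)
    model = AutoModelForAudioClassification.from_pretrained(MODEL_ID)

    # The extractor turns the waveform into the spectrogram features the model expects.
    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0]
    return top_k(logits.tolist(), model.config.id2label, k)
```

Calling `classify(waveform)` on a NumPy array loaded with an audio library of your choice should return a list of `(label, score)` pairs. Audio recorded at another rate would need resampling first.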