Improve model card for Vision-SR1

#1, opened by nielsr (HF Staff)

This PR significantly improves the model card for Vision-SR1 by:

  • Adding pipeline_tag: image-text-to-text to the metadata, so that the model is listed under the corresponding filter at https://huggingface.co/models?pipeline_tag=image-text-to-text.
  • Including library_name: transformers in the metadata, based on explicit evidence in config.json and the GitHub README, enabling the automated "Use in Transformers" widget.
  • Incorporating the full paper abstract to provide a comprehensive overview of the model's methodology and capabilities.
  • Linking to the official Hugging Face paper page: Self-Rewarding Vision-Language Model via Reasoning Decomposition.
  • Providing a direct link to the official GitHub repository for code and further details: https://github.com/zli12321/Vision-SR1.
  • Populating the main content with a detailed description adapted from the GitHub README, including information about the model, datasets, and related Hugging Face artifacts.
  • Including the recommended citation for the codebase.
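
For reference, the two metadata additions listed above would normally sit in the YAML front matter at the top of the model card's README.md. The following is a minimal sketch of that block, not the exact diff in this PR: the helper name `render_front_matter` is hypothetical, and it assumes the card carries only these two flat key/value fields.

```python
# Sketch only: renders the two metadata fields named in this PR as a
# YAML front-matter block. The helper name is hypothetical and the
# rendering handles flat string values only (no lists or nesting).

def render_front_matter(metadata: dict[str, str]) -> str:
    """Return a front-matter block delimited by '---' lines."""
    lines = ["---"]
    for key, value in metadata.items():
        lines.append(f"{key}: {value}")
    lines.append("---")
    return "\n".join(lines) + "\n"


# Values taken from the PR description; any other fields already in the
# real model card are not shown here.
metadata = {
    "pipeline_tag": "image-text-to-text",
    "library_name": "transformers",
}

front_matter = render_front_matter(metadata)
print(front_matter)
```

The rendered block would go above the first line of the card's prose, and any fields already present in the existing front matter would need to be preserved alongside these two.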

Please review and merge this PR if these changes align with your expectations.

