Multilingual for Translation Corpus
  Helsinki-NLP/opus_books Updated Mar 29, 2024 • 1.25M • 14.7k • 85
models
  rasa/LaBSE Feature Extraction • Updated May 20, 2021 • 7.71k • 22
  nomic-ai/nomic-embed-text-v1.5 Sentence Similarity • 0.1B • Updated Jul 21, 2025 • 3.26M • 749
  NovaSearch/stella_en_1.5B_v5 Sentence Similarity • 2B • Updated Jul 28, 2025 • 40k • 258
  llmware/llama-3.2-1b-gguf 1B • Updated Feb 8, 2025 • 27 • 1
Vietnamese
  ngtoanrob/vien-translation Translation • Updated Feb 24, 2023 • 122 • 1
  ngtoanrob/envi-translation Updated Apr 1, 2023 • 4 • 1
  gozu888/Envit5-tuned Translation • 0.3B • Updated Jun 28, 2023 • 19 • 3
  IWSLT/mt_eng_vietnamese Updated Jan 18, 2024 • 354 • 29
Wish list
  HuggingFaceH4/ultrachat_200k Updated Oct 16, 2024 • 515k • 26.5k • 633
  bookcorpus/bookcorpus Updated May 3, 2024 • 5.76k • 339
  sentence-transformers/wikipedia-en-sentences Updated Apr 25, 2024 • 7.87M • 177 • 7
  sentence-transformers/paq Updated May 1, 2024 • 64.4M • 453 • 2
LLMs
  TheBloke/Llama-2-13B-chat-GGML Text Generation • Updated Sep 27, 2023 • 112 • 696
  TheBloke/Llama-2-7B-32K-Instruct-GGML Updated Sep 27, 2023 • 8 • 8
  openchat/openchat-3.6-8b-20240522 Text Generation • 8B • Updated May 28, 2024 • 7.89k • 156
corpuses
  Skylion007/openwebtext Updated 10 days ago • 8.01M • 43.8k • 475
  humarin/chatgpt-paraphrases Updated Apr 5, 2023 • 419k • 245 • 59
  stanford-oval/ccnews Updated Aug 31, 2024 • 893M • 6.2k • 32
  stanford-oval/wikipedia Updated Apr 29, 2025 • 345M • 5.37k • 12