pplx-embed Collection Diffusion-Pretrained Dense and Contextual Embeddings • 7 items • Updated Feb 26 • 95
Article ATE-2: State-of-the-Art Armenian Text Embeddings and the ArmBench-TextEmbed Benchmark 10 days ago • 8
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 17 days ago • 64
zELO: ELO-inspired Training Method for Rerankers and Embedding Models Paper • 2509.12541 • Published Sep 16, 2025 • 9
Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math Paper • 2602.06291 • Published Feb 6 • 23
Embedding Model Datasets Collection A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 70 items • Updated Dec 10, 2025 • 164
NanoBEIR datasets Collection These datasets are compatible with the (Sparse)NanoBEIREvaluator in Sentence Transformers v5.2+, and with the CrossEncoderNanoBEIREvaluator if a bm25 column is present • 16 items • Updated 27 days ago • 14
Llama Nemoretriever Colembed: Top-Performing Text-Image Retrieval Model Paper • 2507.05513 • Published Jul 7, 2025 • 1
KoViDoRe Benchmark (BEIR) v2 Collection Korean Vision Document Retrieval Benchmark • 4 items • Updated 27 days ago • 5
Article Nano-BEIR: A Multilingual Information Retrieval Benchmark with Quality-Enhanced Queries Dec 22, 2025 • 9
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model Paper • 2501.01028 • Published Jan 2, 2025 • 19