Heterogeneous information retrieval benchmark across 18 datasets
5 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.
| # | Model | Org | Submitted | Paper / code | ndcg-10 |
|---|---|---|---|---|---|
| 01 | ModernBERT (large) | — | Dec 2024 | Smarter, Better, Faster, Longer: A Modern Bidirectional … · code | 44 |

| # | Model | Org | Submitted | Paper / code | ndcg@10 |
|---|---|---|---|---|---|
| 01 | NV-Embed-v2 · Open | NVIDIA | Sep 2024 | NV-Embed: Improved Techniques for Training LLMs as Gener… | 62.65 |
| 02 | GTE-Qwen2-7B-instruct · Open | Alibaba | Jun 2024 | arxiv | 60.25 |
| 03 | E5-Mistral-7B-instruct · Open | Microsoft | Jan 2024 | Improving Text Embeddings with Large Language Models | 56.90 |
| 04 | ColBERTv2 · Open | Stanford | Jul 2022 | ColBERTv2: Effective and Efficient Retrieval via Lightwe… | 49.40 |
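
For readers unfamiliar with the metric in the last column, here is a minimal sketch of how nDCG@10 is commonly computed for a single query. It assumes linear gain and a log2 rank discount; the page does not say which evaluator or gain function this benchmark uses, so the function names and conventions below are illustrative and not the leaderboard's own scoring code.

```python
import math

def dcg_at_k(relevances, k=10):
    """Discounted cumulative gain over the first k results (linear gain, assumed)."""
    # rank is 0-based, so the top result is discounted by log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, all_relevances, k=10):
    """DCG of the system ranking divided by the DCG of the ideal ranking.

    ranked_relevances: relevance labels of the retrieved documents, in ranked order.
    all_relevances: relevance labels of every judged document for the query.
    """
    ideal = dcg_at_k(sorted(all_relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant documents judged for this query
    return dcg_at_k(ranked_relevances, k) / ideal

# Hypothetical query: two relevant documents, retrieved at ranks 1 and 3.
score = 100 * ndcg_at_k([1, 0, 1], [1, 1])
```

The scores in the tables appear to be on a 0 to 100 scale, which is why the usage line multiplies by 100. A benchmark-level number would typically be the mean of per-query scores within each dataset, then averaged across datasets; that aggregation is also an assumption here.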
Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.
Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.