The best reranker is the one that wins your second-stage retrieval test.

Use the live MTEB reranking leaderboard to find current SOTA candidates, then test them on your own corpus. Public MTEB reranking is useful for shortlisting, but it should not be treated as a guarantee that one model is the best reranker for every RAG, search, or question-answering system.

See the MTEB overview (/benchmarks/mteb) or open the live MTEB leaderboard.

Reranking vs embeddings

Embeddings and rerankers solve adjacent parts of the retrieval stack. Confusing them leads to slow systems or weak relevance.

Axis	Embeddings	Reranking
Primary job	Create vectors for broad candidate retrieval.	Re-score a small candidate set for final ordering.
Input shape	Usually one text at a time: query or document.	Query-document pairs, often scored one pair at a time or in batches.
Where it fits	Stage 1 retrieval, clustering, deduplication, semantic search.	Stage 2 ranking for RAG, search, QA, and recommendation candidates.
Main tradeoff	Fast and scalable, but may miss subtle relevance ordering.	More precise ordering, but higher latency and compute per candidate.
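
The input-shape difference in the table can be sketched with toy stand-ins. Nothing here is a real model: `embed`, `cosine`, and `pair_score` are hypothetical helpers, with bag-of-words counts standing in for an embedding model and bigram overlap standing in for a reranker. The sketch only shows that an embedding sees one text at a time, while a reranker sees the query and the document together.

```python
import math
from collections import Counter


def embed(text):
    # Toy embedding (assumption): bag-of-words counts for ONE text.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pair_score(query, doc):
    # Toy reranker (assumption): fraction of query bigrams found in the doc.
    # It needs the query and the document at the same time.
    q = query.lower().split()
    d = doc.lower().split()
    q_bigrams = set(zip(q, q[1:]))
    d_bigrams = set(zip(d, d[1:]))
    if not q_bigrams:
        return 0.0
    return len(q_bigrams & d_bigrams) / len(q_bigrams)


docs = ["dog bites man", "man bites dog"]
query = "man bites dog"

# Embeddings: document vectors can be computed once, without knowing the query.
doc_vectors = [embed(d) for d in docs]
embedding_scores = [cosine(embed(query), v) for v in doc_vectors]

# Reranking: every (query, document) pair is scored at query time.
pair_scores = [pair_score(query, d) for d in docs]

print(embedding_scores)  # both documents tie: same words, different order
print(pair_scores)       # the pair scorer separates them
```

In this toy case the two documents get identical embedding scores because they contain the same words, and only the pair scorer tells them apart. Real embedding models are far less crude than bag-of-words, but the division of labor is the same.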

Decision guide

Use this before asking which model is SOTA. The right reranker depends on the retrieval architecture.

Situation	Choose	Why
You need the best possible answer order	Use a strong reranker after first-stage embedding retrieval.	Rerankers compare the query with each candidate passage directly, so they usually improve final ordering when the candidate set already contains the relevant passages.
You need low-latency search over millions of documents	Use embeddings first, then rerank only the top 20-200 candidates.	Embedding search is built for fast approximate nearest-neighbor lookup; reranking every document is normally too expensive.
You serve regulated or private workloads	Prefer an auditable open-weight reranker or a private deployment path.	MTEB does not answer data-governance, retention, logging, or deployment-control questions.
You handle long technical documents	Check context length, chunking behavior, and domain evaluation before trusting a leaderboard rank.	A high reranking score can hide practical limits around long passages, tables, citations, or code-heavy text.
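
The "embeddings first, then rerank only the top candidates" row can be sketched as a two-stage function. This is a minimal illustration under stated assumptions: `embed`, `cosine`, `pair_score`, and `retrieve_then_rerank` are hypothetical names, the scoring functions are toys, and stage 1 does a brute-force scan where a real system would typically query an approximate nearest-neighbor index.

```python
import heapq
import math
from collections import Counter


def embed(text):
    # Toy embedding (assumption): bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pair_score(query, doc):
    # Toy reranker (assumption): fraction of query bigrams found in the doc.
    q, d = query.lower().split(), doc.lower().split()
    q_bigrams = set(zip(q, q[1:]))
    if not q_bigrams:
        return 0.0
    return len(q_bigrams & set(zip(d, d[1:]))) / len(q_bigrams)


def retrieve_then_rerank(query, docs, first_stage_k=100, final_k=5):
    q_vec = embed(query)
    # Stage 1: cheap similarity over the whole corpus, keep the top k.
    candidates = heapq.nlargest(
        first_stage_k, docs, key=lambda d: cosine(q_vec, embed(d))
    )
    # Stage 2: the expensive pair scorer runs on the candidates only.
    scored = sorted(
        ((pair_score(query, d), d) for d in candidates),
        key=lambda item: item[0],
        reverse=True,
    )
    return [d for _, d in scored[:final_k]], len(candidates)


corpus = [
    "dog bites man in park",
    "man bites dog in park",
    "stock markets fall sharply",
    "weather forecast for the weekend",
]
top, pairs_scored = retrieve_then_rerank(
    "man bites dog", corpus, first_stage_k=2, final_k=1
)
print(top, pairs_scored)
```

The number of pair-scoring calls is bounded by `first_stage_k`, not by corpus size, which is why the candidate count is the main latency knob in this kind of setup.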

The MTEB reranking caveat

MTEB reranking scores are benchmark evidence, not production truth. They compress many datasets into public comparisons, but they cannot know your corpus, your query distribution, your chunking, your languages, or your latency ceiling.

Because public leaderboards change, this page avoids precise current scores. For exact ranks, use the live MTEB leaderboard and the broader CodeSOTA MTEB overview.

Practical shortlist rules

Treat the MTEB reranking leaderboard as a shortlist generator, not a final procurement answer.
Validate on your own query logs, negative examples, document lengths, languages, and latency budget.
Compare a pure embedding baseline against embedding plus reranking; the lift matters more than the absolute public rank.
Prefer qualitative robustness over tiny leaderboard gaps when scores are close or when exact current numbers are not verified.
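
One way to measure the lift from the third rule is to score both pipelines on the same labeled queries with a rank metric such as mean reciprocal rank (MRR). The sketch below assumes you already have ranked document IDs per query from each pipeline. The query IDs, document IDs, and relevance labels are invented for illustration, and `reciprocal_rank` and `mean_reciprocal_rank` are hypothetical helper names.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    # 1 / rank of the first relevant document, or 0.0 if none is ranked.
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs, qrels):
    # Average reciprocal rank over all labeled queries.
    return sum(reciprocal_rank(runs[q], qrels[q]) for q in qrels) / len(qrels)


# Hypothetical relevance labels: query id -> set of relevant doc ids.
qrels = {"q1": {"d2"}, "q2": {"d7"}, "q3": {"d4"}}

# Hypothetical rankings from the two pipelines on the same candidates.
embedding_only = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d7", "d8", "d9"],
    "q3": ["d5", "d6", "d4"],
}
with_reranker = {
    "q1": ["d2", "d1", "d3"],
    "q2": ["d7", "d9", "d8"],
    "q3": ["d5", "d4", "d6"],
}

baseline = mean_reciprocal_rank(embedding_only, qrels)
reranked = mean_reciprocal_rank(with_reranker, qrels)
lift = reranked - baseline

print(f"baseline MRR={baseline:.3f} reranked MRR={reranked:.3f} lift={lift:.3f}")
```

MRR is only one possible choice. The same comparison works with any rank metric, as long as both pipelines are scored on the same queries and labels, and the lift is weighed against the added latency.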
