Semantic Textual Similarity (2017, en)

STS Benchmark

Semantic textual similarity with human-annotated sentence pairs

Current State of the Art

GTE-Qwen2-7B-instruct

Alibaba

88.4

spearman

spearman Progress Over Time

Showing 3 breakthroughs from Jan 2022 to Jun 2024

[Chart: spearman (y-axis, 82.2 to 89.0) over Date (x-axis, Jan 2022 to Jun 2024)]

Key Milestones

Jan 2022
all-MiniLM-L6-v2

all-MiniLM-L6-v2 Spearman on STS Benchmark test. From official model card.

82.8
Jan 2024
E5-Mistral-7B-instruct

E5-Mistral-7B Spearman on STS Benchmark. From MTEB STS sub-task results.

84.7
+2.3%
Jun 2024
GTE-Qwen2-7B-instruct (Current SOTA)

GTE-Qwen2-7B Spearman on STS Benchmark test split. MTEB STS sub-task average.

88.4
+4.4%
Total Improvement
6.8%
Time Span
2y 6m
Breakthroughs
3
Current SOTA
88.4
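
The percentage deltas in the milestone list appear to be relative changes rather than point differences (84.7 is 1.9 points above 82.8, which is about 2.3% of 82.8). The sketch below reproduces the listed figures under that assumption; the `relative_gain` helper and the variable names are illustrative and not part of the leaderboard.

```python
# Milestone scores copied from the Key Milestones list.
milestones = [
    ("Jan 2022", "all-MiniLM-L6-v2", 82.8),
    ("Jan 2024", "E5-Mistral-7B-instruct", 84.7),
    ("Jun 2024", "GTE-Qwen2-7B-instruct", 88.4),
]


def relative_gain(old, new):
    """Percent change from old to new, relative to old (assumed definition)."""
    return (new - old) / old * 100


# Gain of each milestone over the one before it.
step_gains = [
    round(relative_gain(prev[2], cur[2]), 1)
    for prev, cur in zip(milestones, milestones[1:])
]

# Gain from the first milestone to the current best, relative and in points.
total_gain = round(relative_gain(milestones[0][2], milestones[-1][2]), 1)
point_gain = round(milestones[-1][2] - milestones[0][2], 1)

print(step_gains, total_gain, point_gain)
```

Under this reading, Total Improvement is a relative figure, while the same gap expressed in absolute spearman points is `point_gain`.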

Top Models Performance Comparison

Top 3 models ranked by spearman

Rank  Model                   spearman  % of best
1     GTE-Qwen2-7B-instruct   88.4      100.0%
2     E5-Mistral-7B-instruct  84.7      95.8%
3     all-MiniLM-L6-v2        82.8      93.7%
Best Score
88.4
Top Model
GTE-Qwen2-7B-inst...
Models Compared
3
Score Range
5.6
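
The "% of best" and Score Range figures are consistent with simple arithmetic on the three scores: each score divided by the top score, and the top score minus the lowest. A minimal sketch, assuming those definitions (the dictionary and variable names are illustrative):

```python
# Scores copied from the comparison; names below are illustrative only.
scores = {
    "GTE-Qwen2-7B-instruct": 88.4,
    "E5-Mistral-7B-instruct": 84.7,
    "all-MiniLM-L6-v2": 82.8,
}

best = max(scores.values())

# Each model's score as a percentage of the best score.
pct_of_best = {
    model: round(score / best * 100, 1) for model, score in scores.items()
}

# Absolute gap, in spearman points, between the best and the lowest score.
score_range = round(best - min(scores.values()), 1)

for model, pct in pct_of_best.items():
    print(f"{model}: {pct}% of best")
print("score range:", score_range)
```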

spearman (Primary)

#  Model                                 Organization           Score  Date
1  GTE-Qwen2-7B-instruct (Open Source)   Alibaba                88.4   Jun 2024
2  E5-Mistral-7B-instruct (Open Source)  Microsoft              84.7   Jan 2024
3  all-MiniLM-L6-v2 (Open Source)        Sentence-Transformers  82.8   Jan 2022
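
The spearman score reported here is a rank correlation between a model's predicted similarity for each sentence pair and the human-annotated score, conventionally multiplied by 100. The sketch below is a dependency-free illustration of that calculation, not the evaluation code used by any of the listed models. The gold and predicted values are made-up toy numbers (gold on the 0 to 5 scale used by STS Benchmark annotations), and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def spearman(predicted, gold):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(gold))


# Toy data (hypothetical): human scores and model cosine similarities.
gold = [0.0, 1.5, 3.0, 4.2, 5.0]
predicted = [0.10, 0.35, 0.30, 0.80, 0.95]

print(round(spearman(predicted, gold) * 100, 1))
```

Because only the ordering matters, a model can score well without its raw similarities matching the 0 to 5 scale; it only needs to rank the pairs in roughly the same order as the annotators.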

Related Papers (1)

Improving Text Embeddings with Large Language Models
Jan 2024
Models: E5-Mistral-7B-instruct