Semantic textual similarity with human-annotated sentence pairs
STS Benchmark results are reported as Spearman rank correlation between a model's predicted similarity scores and the human annotations. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
| Rank | Model | Trust | Score | Year | Links |
|---|---|---|---|---|---|
| 01 | GTE-Qwen2-7B-instruct | verified | 88.4 | 2024 | Source ↗ |
| 02 | E5-Mistral-7B-instruct | verified | 84.7 | 2024 | Source ↗ |
| 03 | all-MiniLM-L6-v2 | verified | 82.8 | 2022 | Source ↗ |
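
For readers unfamiliar with the metric, the following is a minimal sketch of how a Spearman score can be computed from gold annotations and model predictions. It is not Codesota's or any paper's evaluation code. The `gold` and `predicted` values are made-up illustrations (STS Benchmark annotations use a 0 to 5 scale; the predictions are assumed here to be cosine similarities), the helper names are hypothetical, and multiplying by 100 is an assumption about how the scores in the table are scaled.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative data only, not taken from STS Benchmark.
gold = [0.0, 1.2, 2.5, 3.8, 5.0]           # human similarity ratings (0 to 5)
predicted = [0.10, 0.35, 0.30, 0.80, 0.95]  # assumed cosine similarities

score = spearman(gold, predicted)
print(round(100 * score, 1))  # scaled by 100 (assumed leaderboard convention)
```

Because only the ordering of the predictions matters, a model is not penalised for producing similarities on a different scale from the annotations. In this toy example one pair is ranked out of order, which lowers the correlation from 1.0 to 0.9.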