Canonical multiple-choice reading comprehension benchmark built from English exams for Chinese middle and high school students. Contains ~28K passages and ~100K questions. Evaluated as accuracy over the combined RACE-M (middle school) and RACE-H (high school) subsets.
Accuracy is the reported evaluation metric for RACE. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.
Higher is better
| Rank | Model | Trust | Score | Year | Links |
|---|---|---|---|---|---|
| 01 | Megatron-BERT | paper | 90.9 | 2026 | Source ↗ |
| 02 | ALBERT (Ensemble) | paper | 89.4 | 2026 | Source ↗ |
| 03 | ALBERT ensemble | unverified | 89.4 | 2019 | Paper ↗ · Code ↗ |
| 04 | RoBERTa | unverified | 83.2 | 2019 | Paper ↗ · Code ↗ |
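
The combined RACE-M + RACE-H accuracy described above can be sketched as follows. This is a minimal illustration, assuming "combined" means pooling every question from both subsets (so the larger subset weighs more) rather than averaging the two subset accuracies. The function names and the toy predictions are hypothetical and are not taken from any official evaluation script.

```python
from typing import Dict, Sequence, Tuple


def count_correct(predictions: Sequence[str], answers: Sequence[str]) -> Tuple[int, int]:
    """Return (number correct, number of questions) for one subset."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct, len(answers)


def combined_accuracy(subsets: Dict[str, Tuple[Sequence[str], Sequence[str]]]) -> float:
    """Accuracy in percent over all questions pooled across subsets.

    Assumption: the combined score is total correct / total questions,
    not the mean of the per-subset accuracies.
    """
    total_correct = 0
    total_questions = 0
    for predictions, answers in subsets.values():
        correct, n = count_correct(predictions, answers)
        total_correct += correct
        total_questions += n
    if total_questions == 0:
        raise ValueError("no questions to evaluate")
    return 100.0 * total_correct / total_questions


# Toy data (hypothetical): option letters predicted vs. gold answers.
toy = {
    "RACE-M": (["A", "B", "C", "D"], ["A", "B", "C", "A"]),                      # 3 of 4 correct
    "RACE-H": (["A", "B", "C", "D", "A", "B"], ["A", "C", "C", "D", "B", "B"]),  # 4 of 6 correct
}

print(combined_accuracy(toy))  # 7 of 10 questions correct overall
```

With the toy data, pooling gives 7 correct out of 10 questions, whereas averaging the two subset accuracies (75% and about 66.7%) would give a different number. This matters when comparing scores from sources that may not aggregate the same way.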