
Livebench.

The Livebench dataset is a time-series dataset related to language modeling. It is built from the LiveBench GitHub repository and from the files used by the live version of the website, so the data stays up to date. Each record includes a question ID, a category (always "language"), and a release date. The dataset also contains counts grouped by date range and by label range (e.g., 0.00 - 10.00, 10.00 - 20.00).
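The page does not say how the label-range counts are produced or what quantity is binned. As a minimal sketch, assuming fixed 10-point bins over numeric values, the counting could look like the following. The function names and the sample values are hypothetical and do not come from the dataset.

```python
from collections import Counter


def label_range(value: float, width: float = 10.0) -> str:
    """Return the half-open bin label 'lo - hi' that contains value."""
    lo = (value // width) * width
    return f"{lo:.2f} - {lo + width:.2f}"


def count_by_range(values, width: float = 10.0) -> Counter:
    """Count how many values fall into each fixed-width label range."""
    return Counter(label_range(v, width) for v in values)


# Hypothetical values, invented for illustration only.
sample = [3.5, 9.99, 10.0, 17.2, 25.0]
counts = count_by_range(sample)
for label, n in sorted(counts.items()):
    print(label, n)
```

Bins here are half-open, so a value of exactly 10.0 falls into "10.00 - 20.00" rather than "0.00 - 10.00". Whether the real dataset treats boundaries the same way is an assumption.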

§ 01 · SOTA history

Year over year.

Not enough data to show trend.
§ 02 · Leaderboard

Results by metric.

Only 1 model on this benchmark
Help build the community leaderboard — submit your model results.

Accuracy

Accuracy is the reported evaluation metric for Livebench. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better

Trust tiers for Accuracy: verified, paper, vendor, community, unverified

Rank | Model | Trust | Score | Year | Source
01 | Qwen2.5-Plus (dataset: Livebench; task: 5) | paper | 54.6 | N/A | Source ↗
§ 04 · Submit a result

Add to the leaderboard.
