
WikiTableQuestions.

Question answering over Wikipedia tables requiring compositional reasoning

§ 01 · SOTA history

Year over year.

§ 02 · Leaderboard

Results by metric.

Only 3 models on this benchmark

Accuracy

Accuracy is the reported evaluation metric for WikiTableQuestions. Codesota tracks published model scores on this metric so readers can compare state-of-the-art results across sources and model families.

Higher is better
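As an illustration of how a score of this kind can be computed, here is a minimal sketch. It assumes accuracy means the percentage of questions whose predicted answer set matches the reference answer set after light normalization. The function names `normalize` and `accuracy` are hypothetical, and the normalization is a simplification rather than the benchmark's official evaluator, so numbers produced this way may not be directly comparable with the published scores.

```python
def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace (a simplification, not the official normalizer)."""
    return " ".join(answer.lower().split())


def accuracy(predictions: list[list[str]], references: list[list[str]]) -> float:
    """Return the percentage of questions whose predicted answer set equals the reference set."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("cannot compute accuracy over an empty set")
    correct = sum(
        {normalize(p) for p in pred} == {normalize(r) for r in ref}
        for pred, ref in zip(predictions, references)
    )
    return 100.0 * correct / len(references)


if __name__ == "__main__":
    # Hypothetical answers for three questions; each answer is a list of strings.
    preds = [["2004"], ["Italy", "France"], ["7"]]
    refs = [["2004"], ["france", "italy"], ["8"]]
    print(f"{accuracy(preds, refs):.1f}")
```

Answers are compared as sets so that multi-value answers match regardless of order. A stricter or looser matching rule (for example, numeric tolerance or date parsing) would change the resulting score.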

Trust tiers for Accuracy: verified, paper, vendor, community, unverified

| Rank | Model             | Trust    | Score | Year | Link   | Notes |
|------|-------------------|----------|-------|------|--------|-------|
| 01   | GPT-4             | verified | 75.3  | 2024 | Source | GPT-4 test accuracy on WikiTableQuestions. Best prior result cited in ReAcTable (VLDB 2024). |
| 02   | Claude 3.5 Sonnet | verified | 73    | 2025 | Source | Claude 3.5 Sonnet accuracy on WikiTableQuestions. Reported in Accurate TableQA paper (arXiv 2601.03137). |
| 03   | TAPAS-large       | verified | 48.7  | 2020 | Paper  | TAPAS-large accuracy on WikiTableQuestions test set. Transfer from WikiSQL. Original TAPAS paper, Table 5. |
§ 04 · Submit a result

Add to the leaderboard.
