How many models are tracked on tabfact?

Codesota tracks 21 models on tabfact across 2 metrics (test and val).

When was the tabfact leaderboard last updated?

The tabfact leaderboard on Codesota includes results through 2026, with the earliest tracked result from 2019.

Optical Character Recognition · benchmark dataset · 2020 · EN

tabfact.

Name: tabfact Benchmark Results
Creator: Codesota
Published: 2019-01-01
License: https://creativecommons.org/licenses/by/4.0/

Dataset from Papers With Code

§ 01 · Leaderboard

Best published scores.

30 results indexed across 2 metrics. Shaded row marks current SOTA; ties broken by submission date.

Primary: accuracy · higher is better
All metrics: test, val

test

22 rows

#	Model	Org	Submitted	Paper / code	test
01	TabTracer	—	Feb 2026	arxiv	94.86
02	TableMaster	—	Jan 2025	arxiv	94.52
03	ARTEMIS-DA	—	Dec 2024	ARTEMIS-DA: An Advanced Reasoning and Transformation Eng…	93.10
04	Dater	—	Jan 2023	Large Language Models are Versatile Decomposers: Decompo… · code	93
05	STaR-8B	—	Nov 2025	arxiv	92.05
06	TableMaster	—	Jan 2025	arxiv-gpt4o-mini	90.12
07	PASTA	—	Nov 2022	PASTA: Table-Operations Aware Fact Verification via Sent… · code	89.30
08	T-REX (Phi-4)	—	Aug 2025	arxiv	89
09	PoTable	—	Dec 2024	arxiv	88.93
10	Chain-of-Table	—	Jan 2024	Chain-of-Table: Evolving Tables in the Reasoning Chain f… · code	86.61
11	Binder	—	Oct 2022	Binding Language Models in Symbolic Languages · code	86
12	Tab-PoT	—	Jun 2024	Efficient Prompting for LLM-based Generative Internet of…	85.77
13	ReasTAP-Large	—	Oct 2022	ReasTAP: Injecting Table Reasoning Skills During Pre-tra… · code	84.90
14	TAPEX-Large	—	Jul 2021	TAPEX: Table Pre-training via Learning a Neural SQL Exec… · code	84.20
15	RePanda	—	Mar 2025	arxiv	84.09
16	T5-3b(UnifiedSKG)	—	Jan 2022	UnifiedSKG: Unifying and Multi-Tasking Structured Knowle… · code	83.68
17	Salience-aware TAPAS	—	Sep 2021	Table-based Fact Verification with Salience-aware Learni… · code	82.10
18	TAPAS-Large classifier with Counterfactual + Synthetic pre-training	—	Oct 2020	Understanding tables with intermediate pre-training · code	81
19	TabSQLify (col+row)	—	Apr 2024	TabSQLify: Enhancing Reasoning Capabilities of LLMs Thro… · code	79.50
20	NormTab (Targeted) + SQL	—	Jun 2024	NormTab: Improving Symbolic Reasoning in LLMs Through Ta… · code	68.90
21	Table-BERT-Horizontal-T+F-Template	—	Sep 2019	TabFact: A Large-scale Dataset for Table-based Fact Veri… · code	65.12
22	BERT classifier w/o Table	—	Sep 2019	TabFact: A Large-scale Dataset for Table-based Fact Veri… · code	50.50

val

8 rows

#	Model	Org	Submitted	Paper / code	val
01	PASTA	—	Nov 2022	PASTA: Table-Operations Aware Fact Verification via Sent… · code	89.20
02	TAPEX-Large	—	Jul 2021	TAPEX: Table Pre-training via Learning a Neural SQL Exec… · code	84.60
03	ReasTAP-Large	—	Oct 2022	ReasTAP: Injecting Table Reasoning Skills During Pre-tra… · code	84.60
04	T5-3b(UnifiedSKG)	—	Jan 2022	UnifiedSKG: Unifying and Multi-Tasking Structured Knowle… · code	83.97
05	Salience-aware TAPAS	—	Sep 2021	Table-based Fact Verification with Salience-aware Learni… · code	82.70
06	TAPAS-Large classifier with Counterfactual + Synthetic pre-training	—	Oct 2020	Understanding tables with intermediate pre-training · code	81
07	Table-BERT-Horizontal-T+F-Template	—	Sep 2019	TabFact: A Large-scale Dataset for Table-based Fact Veri… · code	66.10
08	BERT classifier w/o Table	—	Sep 2019	TabFact: A Large-scale Dataset for Table-based Fact Veri… · code	50.90

Fig 2 · Rows sorted by score within each metric. Shaded row marks SOTA. Dates reflect model or paper release where available, otherwise the date Codesota accessed the source.
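The ordering rule described above (sort by score, break ties by submission date) can be sketched in a few lines. This is an illustration only, not Codesota's actual code: the rows are four entries copied from the val table, the day of month is an assumption because the page gives only month and year, and the "earlier submission wins a tie" direction is inferred from TAPEX-Large (Jul 2021) being listed above ReasTAP-Large (Oct 2022) at 84.60.

```python
from datetime import date

# (model, submitted, score) copied from the val table above.
# The day is set to 1 as a placeholder: the page lists only month and year.
VAL_ROWS = [
    ("TAPEX-Large", date(2021, 7, 1), 84.60),
    ("PASTA", date(2022, 11, 1), 89.20),
    ("ReasTAP-Large", date(2022, 10, 1), 84.60),
    ("T5-3b(UnifiedSKG)", date(2022, 1, 1), 83.97),
]


def rank(rows):
    """Sort by score descending; on equal scores, earlier submission first (assumed)."""
    return sorted(rows, key=lambda row: (-row[2], row[1]))


ranked = rank(VAL_ROWS)
sota = ranked[0]  # the row the page would shade

for position, (model, submitted, score) in enumerate(ranked, start=1):
    marker = "*" if position == 1 else " "
    print(f"{position:02d}{marker} {model:<20} {submitted:%b %Y} {score:.2f}")
```

Negating the score in the sort key keeps the whole ordering in a single `sorted` call, so the tie-break on date applies only when two scores are exactly equal.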

§ 04 · Literature

14 papers tied to this benchmark.

Every paper below corresponds to at least one row in the leaderboard above. Click through for the arXiv preprint and, when available, the reference implementation.

ARTEMIS-DA: An Advanced Reasoning and Transformation Engine for Multi-Step Insight Synthesis in Data Analytics
Dec 2024 · ARTEMIS-DA
arXiv

NormTab: Improving Symbolic Reasoning in LLMs Through Tabular Data Normalization
Jun 2024 · NormTab (Targeted) + SQL
arXiv · Code

Efficient Prompting for LLM-based Generative Internet of Things
Jun 2024 · Tab-PoT
arXiv

TabSQLify: Enhancing Reasoning Capabilities of LLMs Through Table Decomposition
Apr 2024 · TabSQLify (col+row)
arXiv · Code

Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding
Jan 2024 · Chain-of-Table
arXiv · Code

Large Language Models are Versatile Decomposers: Decompose Evidence and Questions for Table-based Reasoning
Jan 2023 · Dater
arXiv · Code

PASTA: Table-Operations Aware Fact Verification via Sentence-Table Cloze Pre-training
Nov 2022 · PASTA
arXiv · Code

ReasTAP: Injecting Table Reasoning Skills During Pre-training via Synthetic Reasoning Examples
Oct 2022 · ReasTAP-Large
arXiv · Code

Binding Language Models in Symbolic Languages
Oct 2022 · Binder
arXiv · Code

UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models
Jan 2022 · T5-3b(UnifiedSKG)
arXiv · Code

Table-based Fact Verification with Salience-aware Learning
Sep 2021 · Salience-aware TAPAS
arXiv · Code

TAPEX: Table Pre-training via Learning a Neural SQL Executor
Jul 2021 · TAPEX-Large
arXiv · Code

Understanding tables with intermediate pre-training
Oct 2020 · TAPAS-Large classifier with Counterfactual + Synthetic pre-training
arXiv · Code

TabFact: A Large-scale Dataset for Table-based Fact Verification
Sep 2019 · Table-BERT-Horizontal-T+F-Template, BERT classifier w/o Table
arXiv · Code

§ 06 · Contribute

Have a score that beats this table?

Submit a checkpoint and a reproduction script. We will run it, publish the score, and — if it takes the top — annotate the step on the progress chart with your name.

What a submission needs

01 · A public checkpoint or API endpoint
02 · A reproduction script with frozen commit + seed
03 · Declared evaluation environment (Python, deps)
04 · One row per metric declared by this dataset
05 · A contact so we can follow up on discrepancies
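Before submitting, the checklist above can be self-checked with a small script. The sketch below is hypothetical throughout: this page does not document a manifest format, so the field names (`checkpoint`, `repro_commit`, `seed`, `environment`, `scores`, `contact`) are invented for illustration, and "frozen commit" is assumed to mean a full 40-character Git hash. The example values, including the score, are placeholders rather than real results.

```python
import re

# Hypothetical manifest keys, one or more per requirement in the list above.
REQUIRED_FIELDS = ("checkpoint", "repro_commit", "seed", "environment", "scores", "contact")
# Metrics this dataset declares ("All metrics: test, val" on this page).
DATASET_METRICS = ("test", "val")


def missing_items(manifest):
    """Return a list of problems; an empty list means the checklist passes."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in manifest]
    commit = manifest.get("repro_commit", "")
    # Assumption: a frozen commit is a full 40-character lowercase hex hash.
    if commit and not re.fullmatch(r"[0-9a-f]{40}", commit):
        problems.append("repro_commit is not a full 40-character commit hash")
    scores = manifest.get("scores", {})
    problems += [f"no score for metric: {metric}" for metric in DATASET_METRICS if metric not in scores]
    return problems


# Placeholder values only; none of these refer to a real submission.
example = {
    "checkpoint": "https://example.org/model.ckpt",
    "repro_commit": "0123456789abcdef0123456789abcdef01234567",
    "seed": 0,
    "environment": {"python": "3.11", "deps": ["torch"]},
    "scores": {"test": 0.0},
    "contact": "author@example.org",
}
print(missing_items(example))
```

Here the example omits a val score, so the check reports that one metric row is missing, which corresponds to item 04 in the list.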
