Codesota · OCR · Benchmarks · terminal-bench-2

terminal-bench-2.

OCR benchmark

§ 01 · accuracy

accuracy.

Accuracy in percent. Higher is better.

| # | Model | Score | Source | Notes |
|---|-------|-------|--------|-------|
| 1 | Codex / GPT-5.5 | 82.0 | terminal-bench-official | Rank 1 on terminal-bench@2.0. Agent: Codex. Model: GPT-5.5. Date: 2026-04-23. Official leaderboard reports 82.0% +/- 2.2. |
| 2 | ForgeCode / GPT-5.4 | 81.8 | terminal-bench-official | Rank 2 on terminal-bench@2.0. Agent org: ForgeCode. Model org: OpenAI. Date: 2026-03-12. Official leaderboard reports 81.8% +/- 2.0. |
| 3 | TongAgents / Gemini 3.1 Pro | 80.2 | terminal-bench-official | Rank 3 on terminal-bench@2.0. Agent org: BIGAI. Model org: Google. Date: 2026-03-13. Official leaderboard reports 80.2% +/- 2.6. |
| 4 | ForgeCode / Claude Opus 4.6 | 79.8 | terminal-bench-official | Rank 4 on terminal-bench@2.0. Agent org: ForgeCode. Model org: Anthropic. Date: 2026-03-12. Official leaderboard reports 79.8% +/- 1.6. |
| 5 | SageAgent / GPT-5.3-Codex | 78.4 | terminal-bench-official | Rank 5 on terminal-bench@2.0. Agent org: OpenSage. Model org: OpenAI. Date: 2026-03-13. Official leaderboard reports 78.4% +/- 2.2. |
| 6 | ForgeCode / Gemini 3.1 Pro | 78.4 | terminal-bench-official | Rank 6 on terminal-bench@2.0. Agent org: ForgeCode. Model org: Google. Date: 2026-03-02. Official leaderboard reports 78.4% +/- 1.8. |
| 7 | Droid / GPT-5.3-Codex | 77.3 | terminal-bench-official | Rank 7 on terminal-bench@2.0. Agent org: Factory. Model org: OpenAI. Date: 2026-02-24. Official leaderboard reports 77.3% +/- 2.2. |
| 8 | Capy / Claude Opus 4.6 | 75.3 | terminal-bench-official | Rank 8 on terminal-bench@2.0. Agent org: Capy. Model org: Anthropic. Date: 2026-03-12. Official leaderboard reports 75.3% +/- 2.4. |
| 9 | Simple Codex / GPT-5.3-Codex | 75.1 | terminal-bench-official | Rank 9 on terminal-bench@2.0. Agent org: OpenAI. Model org: OpenAI. Date: 2026-02-06. Official leaderboard reports 75.1% +/- 2.4. |
| 10 | Terminus-KIRA / Gemini 3.1 Pro | 74.8 | terminal-bench-official | Rank 10 on terminal-bench@2.0. Agent org: KRAFTON AI. Model org: Google. Date: 2026-02-23. Official leaderboard reports 74.8% +/- 2.6. |
| 11 | Terminus-KIRA / Claude Opus 4.6 | 74.7 | terminal-bench-official | Rank 11 on terminal-bench@2.0. Agent org: KRAFTON AI. Model org: Anthropic. Date: 2026-02-22. Official leaderboard reports 74.7% +/- 2.6. |
| 12 | Mux / GPT-5.3-Codex | 74.6 | terminal-bench-official | Rank 12 on terminal-bench@2.0. Agent org: Coder. Model org: OpenAI. Date: 2026-03-06. Official leaderboard reports 74.6% +/- 2.5. |
| 13 | MAYA-V2 / Claude 4.6 Opus | 72.1 | terminal-bench-official | Rank 13 on terminal-bench@2.0. Agent org: ADYA. Model org: Anthropic. Date: 2026-03-12. Official leaderboard reports 72.1% +/- 2.2. |
| 14 | TongAgents / Claude Opus 4.6 | 71.9 | terminal-bench-official | Rank 14 on terminal-bench@2.0. Agent org: Bigai. Model org: Anthropic. Date: 2026-02-22. Official leaderboard reports 71.9% +/- 2.7. |
| 15 | Junie CLI / Multiple | 71.0 | terminal-bench-official | Rank 15 on terminal-bench@2.0. Agent org: JetBrains. Model org: Multiple. Date: 2026-03-07. Official leaderboard reports 71.0% +/- 2.9. |
| 16 | CodeBrain-1 / GPT-5.3-Codex | 70.3 | terminal-bench-official | Rank 16 on terminal-bench@2.0. Agent org: Feeling AI. Model org: OpenAI. Date: 2026-02-10. Official leaderboard reports 70.3% +/- 2.6. |
| 17 | Droid / Claude Opus 4.6 | 69.9 | terminal-bench-official | Rank 17 on terminal-bench@2.0. Agent org: Factory. Model org: Anthropic. Date: 2026-02-05. Official leaderboard reports 69.9% +/- 2.5. |
| 18 | Ante / Gemini 3 Pro | 69.4 | terminal-bench-official | Rank 18 on terminal-bench@2.0. Agent org: Antigma Labs. Model org: Google. Date: 2026-01-06. Official leaderboard reports 69.4% +/- 2.1. |
| 19 | IndusAGI Coding Agent / GPT-5.3-Codex | 69.1 | terminal-bench-official | Rank 19 on terminal-bench@2.0. Agent org: Varun Israni (SoloVpx). Model org: OpenAI. Date: 2026-03-18. Official leaderboard reports 69.1% +/- 2.3. |
| 20 | Crux / Claude Opus 4.6 | 66.9 | terminal-bench-official | Rank 20 on terminal-bench@2.0. Agent org: Roam. Model org: Anthropic. Date: 2026-02-23. Official leaderboard reports 66.9% +/- N/A. |
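
Several neighbouring entries are separated by less than their reported "+/-" values, so rank order alone may overstate the differences. The sketch below is one rough way to check this: it treats each "+/-" value as a symmetric margin around the score and tests whether two ranges intersect. This page does not say how the margin is computed (standard error, confidence interval, or something else), so that reading is an assumption. The names `Entry`, `intervals_overlap`, and `overlapping_with_leader` are hypothetical helpers and not part of any terminal-bench tooling.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    name: str
    score: float   # reported accuracy, in percent
    margin: float  # reported "+/-" value, in percent (meaning assumed, see text)


# First five rows, copied from the leaderboard above.
ENTRIES = [
    Entry("Codex / GPT-5.5", 82.0, 2.2),
    Entry("ForgeCode / GPT-5.4", 81.8, 2.0),
    Entry("TongAgents / Gemini 3.1 Pro", 80.2, 2.6),
    Entry("ForgeCode / Claude Opus 4.6", 79.8, 1.6),
    Entry("SageAgent / GPT-5.3-Codex", 78.4, 2.2),
]


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True if [score - margin, score + margin] of a and b intersect."""
    return abs(a.score - b.score) <= a.margin + b.margin


def overlapping_with_leader(entries: List[Entry]) -> List[str]:
    """Names of entries whose range intersects the top scorer's range."""
    leader = max(entries, key=lambda e: e.score)
    return [
        e.name
        for e in entries
        if e is not leader and intervals_overlap(leader, e)
    ]


if __name__ == "__main__":
    for name in overlapping_with_leader(ENTRIES):
        print(name)
```

Under this reading, all four entries below the leader in this subset have ranges that intersect the leader's, while an entry such as MAYA-V2 / Claude 4.6 Opus (72.1 +/- 2.2) does not. Overlapping ranges are only a coarse heuristic and do not amount to a significance test.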