Codesota · OCR · Benchmarks · swe-bench-verified

swe-bench-verified.

OCR benchmark

§ 01 · resolve-rate

resolve-rate.

Percentage of tasks resolved. Higher is better.
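As a rough illustration of the metric, a resolve rate is usually read as the percentage of benchmark task instances that a model's patch resolves. The sketch below assumes that definition. The function name `resolve_rate` and the counts in the example call are hypothetical and are not taken from this leaderboard.

```python
def resolve_rate(resolved: int, total: int) -> float:
    """Return the percentage of task instances resolved, rounded to one decimal."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= resolved <= total:
        raise ValueError("resolved must be between 0 and total")
    return round(100 * resolved / total, 1)


# Hypothetical run: 405 of 500 instances resolved (illustrative counts only).
print(resolve_rate(405, 500))
```

Rounding to one decimal matches the precision of the scores listed in the table.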

 #  Model                 Score (%)  Source  Notes
 1  Claude Opus 4.7       87.6       vendor  Claude Code harness · Anthropic primary announcement
 2  Claude Opus 4.5       80.9       src     Non-API entry from src
 3  Claude Opus 4.6       80.8       src     Non-API entry from src
 4  Gemini 3.1 Pro        80.6       src     Non-API entry from src
 5  MiniMax M2.5          80.2       src     Non-API entry from src
 6  GPT-5.2 Thinking      80         src     Non-API entry from src
 7  Claude Sonnet 4.6     79.6       src     Non-API entry from src
 8  Gemini 3 Flash        78         src     Non-API entry from src
 9  Claude Sonnet 4.5     77.2       src     Non-API entry from src
10  Kimi K2.5             76.8       src     Non-API entry from src
11  GPT-5.1               76.3       src     Non-API entry from src
12  Gemini 3 Pro          76.2       src     Non-API entry from src
13  GPT-5                 74.9       src     Non-API entry from src
14  MiniMax M2.1          74         src     Non-API entry from src
15  Claude Haiku 4.5      73.3       src     Non-API entry from src
16  Claude Sonnet 4       72.7       src     Non-API entry from src
17  Claude Opus 4         72.5       src     Non-API entry from src
18  Devstral 2            72.2       src     Non-API entry from src
19  Qwen3-Coder-480B      69.6       src     Non-API entry from src
20  MiniMax M2            69.4       src     Non-API entry from src
21  o3                    69.1       src     Non-API entry from src
22  o4-mini               68.1       src     Non-API entry from src
23  DeepSeek V3.1         66         src     Non-API entry from src
24  Kimi K2               65.8       src     Non-API entry from src
25  Grok 3                63.8       src     Non-API entry from src
26  Gemini 2.5 Pro        63.8       src     Non-API entry from src
27  Claude 3.7 Sonnet     63.7       src     Non-API entry from src
28  Gemini 2.5 Flash      60.4       src     Non-API entry from src
29  DeepSeek R1-0528      57.6       src     Non-API entry from src
30  o3-mini               55.8       src     Non-API entry from src
31  GPT-4.1               54.6       src     Non-API entry from src
32  Claude 3.5 Sonnet     50.8       src     Non-API entry from src
33  DeepSeek-R1           49.2       src     Non-API entry from src
34  o1                    48.9       src     Non-API entry from src
35  Devstral Small 2505   46.8       src     Non-API entry from src
36  DeepSeek V3           42         src     Non-API entry from src
37  GPT-4o                41.2       src     Non-API entry from src
38  Claude 3.5 Haiku      40.6       src     Non-API entry from src
39  DeepSeek V2.5         37         src     Non-API entry from src