demon-bench.

OCR benchmark

§ 01 · multi-image-reasoning

multi-image-reasoning.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (Vicuna-13B) | 53.65 | codesota-api |
| 2 | Cheetah (Vicuna-7B) | 50.28 | codesota-api |
| 3 | Cheetah (LLaMA2-7B) | 48.68 | codesota-api |
| 4 | InstructBLIP | 48.55 | codesota-api |
| 5 | LLaMA-Adapter V2 | 44.03 | codesota-api |
| 6 | Otter | 43.85 | codesota-api |
| 7 | MiniGPT-4 | 43.5 | codesota-api |
| 8 | mPLUG-Owl | 42.5 | codesota-api |
| 9 | OpenFlamingo | 41.63 | codesota-api |
| 10 | LLaVA | 41.53 | codesota-api |
| 11 | BLIP-2 | 39.65 | codesota-api |

§ 02 · grounded-qa

grounded-qa.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (Vicuna-13B) | 52.93 | codesota-api |
| 2 | Cheetah (LLaMA2-7B) | 51 | codesota-api |
| 3 | Cheetah (Vicuna-7B) | 48.6 | codesota-api |
| 4 | InstructBLIP | 47.4 | codesota-api |
| 5 | LLaMA-Adapter V2 | 44.8 | codesota-api |
| 6 | Otter | 41.67 | codesota-api |
| 7 | BLIP-2 | 39.23 | codesota-api |
| 8 | LLaVA | 36.2 | codesota-api |
| 9 | mPLUG-Owl | 33.27 | codesota-api |
| 10 | OpenFlamingo | 32 | codesota-api |
| 11 | MiniGPT-4 | 30.27 | codesota-api |

§ 03 · knowledge-images-qa

knowledge-images-qa.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (Vicuna-13B) | 49.33 | codesota-api |
| 2 | Cheetah (Vicuna-7B) | 44.93 | codesota-api |
| 3 | Cheetah (LLaMA2-7B) | 44.93 | codesota-api |
| 4 | InstructBLIP | 44.4 | codesota-api |
| 5 | BLIP-2 | 33.53 | codesota-api |
| 6 | mPLUG-Owl | 32.47 | codesota-api |
| 7 | LLaMA-Adapter V2 | 32 | codesota-api |
| 8 | OpenFlamingo | 30.6 | codesota-api |
| 9 | LLaVA | 28.33 | codesota-api |
| 10 | Otter | 27.73 | codesota-api |
| 11 | MiniGPT-4 | 26.4 | codesota-api |

§ 04 · multimodal-dialogue

multimodal-dialogue.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (LLaMA2-7B) | 42.7 | codesota-api |
| 2 | Cheetah (Vicuna-13B) | 38.14 | codesota-api |
| 3 | Cheetah (Vicuna-7B) | 37.5 | codesota-api |
| 4 | InstructBLIP | 33.58 | codesota-api |
| 5 | BLIP-2 | 26.12 | codesota-api |
| 6 | OpenFlamingo | 16.88 | codesota-api |
| 7 | Otter | 15.37 | codesota-api |
| 8 | LLaMA-Adapter V2 | 14.22 | codesota-api |
| 9 | MiniGPT-4 | 13.69 | codesota-api |
| 10 | mPLUG-Owl | 12.67 | codesota-api |
| 11 | LLaVA | 7.79 | codesota-api |

§ 05 · accuracy

accuracy.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (Vicuna-13B) | 39.28 | codesota-api |
| 2 | Cheetah (LLaMA2-7B) | 37.22 | codesota-api |
| 3 | Cheetah (Vicuna-7B) | 36.37 | codesota-api |
| 4 | InstructBLIP | 33 | codesota-api |
| 5 | BLIP-2 | 26.92 | codesota-api |
| 6 | LLaMA-Adapter V2 | 26.3 | codesota-api |
| 7 | OpenFlamingo | 25.83 | codesota-api |
| 8 | Otter | 24.51 | codesota-api |
| 9 | mPLUG-Owl | 23.13 | codesota-api |
| 10 | MiniGPT-4 | 22.21 | codesota-api |
| 11 | LLaVA | 21.24 | codesota-api |

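The page does not say how the accuracy score is derived. For the models checked, the listed values match the unweighted mean of the seven task scores on this page (§ 01 to § 04 and § 06 to § 08), rounded to two decimals. The sketch below reproduces that check for three models. It is an observation from the published numbers, not a documented definition, and the names `TASK_SCORES`, `LISTED_ACCURACY`, and `mean_score` are illustrative.

```python
from statistics import mean

# Per-task scores copied from the seven task tables on this page, in the order:
# multi-image-reasoning, grounded-qa, knowledge-images-qa, multimodal-dialogue,
# visual-inference, relation-cloze, storytelling.
TASK_SCORES = {
    "Cheetah (Vicuna-13B)": [53.65, 52.93, 49.33, 38.14, 27.15, 27.15, 26.59],
    "InstructBLIP": [48.55, 47.4, 44.4, 33.58, 11.49, 21.2, 24.41],
    "LLaVA": [41.53, 36.2, 28.33, 7.79, 8.27, 15.85, 10.7],
}

# Values listed in the accuracy table for the same models.
LISTED_ACCURACY = {
    "Cheetah (Vicuna-13B)": 39.28,
    "InstructBLIP": 33.0,
    "LLaVA": 21.24,
}


def mean_score(scores):
    """Unweighted mean of the task scores, rounded to two decimals.

    Assumption: equal task weights. The page does not document the formula.
    """
    return round(mean(scores), 2)


for model, scores in TASK_SCORES.items():
    print(f"{model}: computed {mean_score(scores)}, listed {LISTED_ACCURACY[model]}")
```

If the computed and listed values diverge for other models, the equal-weight assumption does not hold for them and the accuracy column should be read as an independent metric.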
§ 06 · visual-inference

visual-inference.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (Vicuna-13B) | 27.15 | codesota-api |
| 2 | Cheetah (Vicuna-7B) | 25.9 | codesota-api |
| 3 | Cheetah (LLaMA2-7B) | 25.5 | codesota-api |
| 4 | OpenFlamingo | 13.85 | codesota-api |
| 5 | LLaMA-Adapter V2 | 13.51 | codesota-api |
| 6 | InstructBLIP | 11.49 | codesota-api |
| 7 | Otter | 11.39 | codesota-api |
| 8 | BLIP-2 | 10.67 | codesota-api |
| 9 | LLaVA | 8.27 | codesota-api |
| 10 | MiniGPT-4 | 7.95 | codesota-api |
| 11 | mPLUG-Owl | 5.40 | codesota-api |

§ 07 · relation-cloze

relation-cloze.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (Vicuna-13B) | 27.15 | codesota-api |
| 2 | Cheetah (LLaMA2-7B) | 22.95 | codesota-api |
| 3 | Cheetah (Vicuna-7B) | 22.15 | codesota-api |
| 4 | OpenFlamingo | 21.65 | codesota-api |
| 5 | InstructBLIP | 21.2 | codesota-api |
| 6 | LLaMA-Adapter V2 | 18 | codesota-api |
| 7 | BLIP-2 | 17.94 | codesota-api |
| 8 | MiniGPT-4 | 16.6 | codesota-api |
| 9 | mPLUG-Owl | 16.25 | codesota-api |
| 10 | Otter | 16 | codesota-api |
| 11 | LLaVA | 15.85 | codesota-api |

§ 08 · storytelling

storytelling.

Higher is better

Fetched from CodeSOTA API on 2026-04-20

| # | Model | Score | Source |
|---|---|---|---|
| 1 | Cheetah (Vicuna-13B) | 26.59 | codesota-api |
| 2 | Cheetah (Vicuna-7B) | 25.2 | codesota-api |
| 3 | Cheetah (LLaMA2-7B) | 24.76 | codesota-api |
| 4 | InstructBLIP | 24.41 | codesota-api |
| 5 | OpenFlamingo | 24.22 | codesota-api |
| 6 | BLIP-2 | 21.31 | codesota-api |
| 7 | mPLUG-Owl | 19.33 | codesota-api |
| 8 | LLaMA-Adapter V2 | 17.57 | codesota-api |
| 9 | MiniGPT-4 | 17.07 | codesota-api |
| 10 | Otter | 15.57 | codesota-api |
| 11 | LLaVA | 10.7 | codesota-api |
