Optical Character Recognition | 2024 | ar

KITAB Arabic OCR Benchmark

The benchmark comprises 8,809 Arabic text samples across 9 domains and tests Arabic script recognition.

Samples: 8,809
Metrics: CER (character error rate)
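For reference, CER is conventionally the character-level edit distance between the OCR output and the ground-truth text, divided by the length of the ground truth. The sketch below assumes that standard definition. The function names `edit_distance` and `cer` are illustrative, and the benchmark's own preprocessing (for example, how it treats Arabic diacritics or whitespace) is not stated on this page and may differ.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in Unicode code points."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference characters into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # reference character missing from hypothesis
                curr[j - 1] + 1,     # extra character in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


# One dropped letter out of four reference characters.
print(cer("كتاب", "كتب"))  # 0.25
```

Because the distance is computed over code points, a diacritic stored as a separate combining character counts as its own character. Under this definition CER can also exceed 1.0 when the hypothesis is much longer than the reference.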
Current State of the Art

Gemini 2.0 Flash (Google): 0.130 CER

Top Models Performance Comparison

Top 8 models ranked by CER (lower is better)

Rank  Model             CER    % of best
1     Gemini 2.0 Flash  0.130  100.0%
2     AIN 7B            0.200  65.0%
3     GPT-4o            0.310  41.9%
4     GPT-4o Mini       0.430  30.2%
5     Azure OCR         0.520  25.0%
6     Tesseract         0.540  24.1%
7     EasyOCR           0.580  22.4%
8     PaddleOCR         0.790  16.5%
Best Score: 0.130
Top Model: Gemini 2.0 Flash
Models Compared: 8
Score Range: 0.660
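
The "% of best" figures and the Score Range appear to follow directly from the CER values. Since lower is better, each percentage is consistent with the best score divided by the model's score, and the range with the gap between the worst and best scores. The sketch below assumes those two formulas, which are inferred from the published numbers rather than stated by the source. The variable names are illustrative.

```python
# CER values as listed on the leaderboard (lower is better).
scores = {
    "Gemini 2.0 Flash": 0.130,
    "AIN 7B": 0.200,
    "GPT-4o": 0.310,
    "GPT-4o Mini": 0.430,
    "Azure OCR": 0.520,
    "Tesseract": 0.540,
    "EasyOCR": 0.580,
    "PaddleOCR": 0.790,
}

best = min(scores.values())
worst = max(scores.values())

# Assumed formula: percentage of best = 100 * best / score.
pct_of_best = {model: round(100 * best / s, 1) for model, s in scores.items()}

# Assumed formula: score range = worst - best.
score_range = round(worst - best, 3)

for model, pct in pct_of_best.items():
    print(f"{model:<18}{scores[model]:.3f}  {pct:5.1f}%")
print(f"Score range: {score_range:.3f}")
```

Under these assumptions the computed values reproduce the listed percentages (for example, 65.0% for AIN 7B and 16.5% for PaddleOCR) and the 0.660 range.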

CER (primary metric)

#  Model                      Organization          Score  Date
1  Gemini 2.0 Flash (API)     Google                0.130  Dec 2025
2  AIN 7B (Open Source)       Research              0.200  Dec 2025
3  GPT-4o (API)               OpenAI                0.310  Dec 2025
4  GPT-4o Mini                OpenAI                0.430  Dec 2025
5  Azure OCR                  Microsoft             0.520  Dec 2025
6  Tesseract (Open Source)    Google (Open Source)  0.540  Dec 2025
7  EasyOCR (Open Source)      JaidedAI              0.580  Dec 2025
8  PaddleOCR (Open Source)    Baidu                 0.790  Dec 2025


KITAB-Bench Benchmark - Optical Character Recognition | CodeSOTA