Codesota · OCR · Benchmarks · audiocaps
audiocaps.
OCR benchmark
§ 01 · fad
fad.
Lower is better (FAD, Fréchet Audio Distance, is a distance from reference audio, so smaller scores rank first).

#   Model                    Score   Source
★   AudioLDM 2-AC-Large      1.42    codesota-api
2   TANGO                    1.73    codesota-api
3   AudioLDM 2-Full          1.78    codesota-api
4   AudioLDM 2-Full-Large    1.86    codesota-api
5   AudioLDM                 4.48    codesota-api

All scores fetched from CodeSOTA API on 2026-04-20.
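The ordering of a leaderboard like this depends on the metric's direction. The following is a minimal sketch, not CodeSOTA's own code, that ranks the five listed scores with the direction passed as a parameter. It assumes "fad" means Fréchet Audio Distance, for which smaller values are conventionally better; the names `Entry`, `ENTRIES`, and `rank` are hypothetical and introduced only for illustration.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    model: str
    fad: float


# Scores copied from the leaderboard (source: codesota-api, fetched 2026-04-20).
ENTRIES = [
    Entry("AudioLDM", 4.48),
    Entry("AudioLDM 2-Full-Large", 1.86),
    Entry("AudioLDM 2-Full", 1.78),
    Entry("TANGO", 1.73),
    Entry("AudioLDM 2-AC-Large", 1.42),
]


def rank(entries: List[Entry], higher_is_better: bool = False) -> List[Entry]:
    """Return entries sorted best-first for the given metric direction.

    The default (lower is better) is an assumption based on reading
    "fad" as Fréchet Audio Distance.
    """
    return sorted(entries, key=lambda e: e.fad, reverse=higher_is_better)


if __name__ == "__main__":
    for position, entry in enumerate(rank(ENTRIES), start=1):
        print(f"{position}. {entry.model}: {entry.fad:.2f}")
```

Keeping the direction as an explicit argument makes it easy to see how the top-ranked model changes if a metric is mislabeled as higher-is-better.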