Computer Vision
Building systems that understand images and video? Find benchmarks for recognition, detection, segmentation, and document analysis tasks.
Tasks in Computer Vision
Scene Text Detection
Detecting text regions in natural scene images.
Document OCR
Converting scanned documents and images into machine-readable text.
Handwriting Recognition
Recognizing handwritten text from images.
Document Understanding
Extracting semantic information and structure from documents, also known as visual document understanding (VDU).
Document Parsing
Converting documents such as PDFs into structured formats such as Markdown or HTML.
General OCR Capabilities
Comprehensive benchmarks covering multiple aspects of OCR performance.
Polish OCR
OCR for the Polish language, including historical documents, Gothic fonts, and diacritic recognition.
Image Classification
Categorizing images into predefined classes (e.g., ImageNet, CIFAR).
Object Detection
Locating and classifying objects in images (e.g., COCO, Pascal VOC).
Semantic Segmentation
Pixel-level classification of images (e.g., Cityscapes, ADE20K).
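Because semantic segmentation is pixel-level classification, benchmarks for it commonly score a prediction by mean intersection-over-union (mIoU) across classes. The sketch below illustrates that idea for a single pair of label maps. The function name `mean_iou` and the toy inputs are hypothetical, and real benchmark protocols typically add details not shown here, such as ignore labels and accumulation over the whole dataset.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two equally sized label maps.

    pred and target are 2D lists of integer class ids. Classes absent from
    both maps are skipped so they do not distort the average.
    (Illustrative helper, not taken from any benchmark's official toolkit.)
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel agrees: counts toward both intersection and union.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel disagrees: enlarges the union of both classes involved.
                union[p] += 1
                union[t] += 1
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 label maps with two classes (0 = background, 1 = object).
pred = [[0, 0, 1],
        [0, 1, 1]]
target = [[0, 0, 1],
          [0, 0, 1]]

score = mean_iou(pred, target, num_classes=2)
print(f"mIoU: {score:.4f}")
```

In this toy case class 0 has IoU 3/4 and class 1 has IoU 2/3, so the mean is their average. Skipping classes with an empty union is one common convention; other protocols handle absent classes differently.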
Explore Other Areas
Natural Language Processing
Processing and understanding text? Evaluate your models on language understanding, generation, translation, and information extraction benchmarks.
Reasoning
Testing whether your model can reason logically? Benchmark math problem solving, commonsense understanding, and multi-step reasoning capabilities.
Computer Code
Developing AI coding assistants? Test code generation, completion, translation, bug detection, and repair capabilities.
Speech
Working with voice and audio? Evaluate speech-to-text accuracy, voice synthesis quality, and speaker identification performance.