Codesota · Models
1,357 models indexed · 896 match filter

Every model, measured.

Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.

Vendor: Areas overview | speakleash · 253 | OpenAI · 85 | Google · 71 | Qwen · 52 | Alibaba · 47 | Anthropic · 44 | Microsoft · 35 | Meta · 30 | Mistral · 30 | DeepSeek · 28 | google · 19 | meta-llama · 19 | mistralai · 19 | Meta AI · 15 | CYFRAGOVPL · 14 | Zhipu AI · 13 | NVIDIA · 10 | SpeakLeash · 10 | internlm · 10 | xAI · 10 | ByteDance · 9 | Baidu · 8 | PLLuM · 8 | ibm-granite · 8 | microsoft · 8 | Amazon · 7 | Google DeepMind · 7 | MiniMax · 7 | Mistral AI · 7 | Remek · 7 | Shanghai AI Lab · 7 | allenai · 7 | utter-project · 7 | CohereForAI · 6 | Microsoft Research · 6 | Salesforce · 6 | 01-ai · 5 | Alibaba Cloud · 5 | Cohere · 5 | Moonshot AI · 5 | NousResearch · 5 | THUML · 5 | deepseek-ai · 5 | DeepMind · 4 | Facebook AI · 4 | IBM · 4 | Meituan · 4 | Stanford · 4 | THUDM · 4 | UC San Diego · 4 | VikParuchuri · 4 | gguf-iq · 4 | nvidia · 4 | openchat · 4 | tiiuae · 4 | Allen AI · 3 | BAAI · 3 | Du et al. · 3 | ForgeCode · 3 | Fudan University · 3 | IDEA Research · 3 | Liao et al. · 3 | Moonshot.AI · 3 | Nam Tuan Ly / NII · 3 | OPI-PG · 3 | OpenDataLab · 3 | ViCoS Lab Ljubljana · 3 | Xiaomi · 3 | Zhao et al. · 3 | gguf · 3 | gguf11bv30 · 3 | gguf7bv30 · 3 | upstage · 3 | + 247 smaller vendors (291 models)
§ 01 · Computer Vision models

896 models in Computer Vision · page 11 of 18.

| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 501 | OrigamiNet-24 | Unknown | Unknown | Unknown | | 1 | 2 |
| 502 | PyLaia (all transcriptions + agreement-based split) | Unknown | Unknown | Unknown | | 1 | 2 |
| 503 | PyLaia (human transcriptions + agreement-based split) | Unknown | Unknown | Unknown | | 1 | 2 |
| 504 | PyLaia (rover consensus + agreement-based split) | Unknown | Unknown | Unknown | | 1 | 2 |
| 505 | Qwen2.5-VL 32B | Alibaba | | Vision-Language Model | | 2 | 2 |
| 506 | Qwen3-VL-4B | Alibaba Qwen | 4B | Vision-Language Model (4B params) | | 2 | 2 |
| 507 | ReasTAP-Large | Unknown | Unknown | Unknown | | 1 | 2 |
| 508 | SANA | | | | | 1 | 2 |
| 509 | SIGA_S | Unknown | Unknown | Unknown | | 2 | 2 |
| 510 | SLANet | Unknown | Unknown | Unknown | | 1 | 2 |
| 511 | SSD300 (VGG-16) | Google / UNC | ~24M | Single-shot multibox detector with VGG-16 backbone, 300x300 input | | 1 | 2 |
| 512 | Salience-aware TAPAS | Unknown | Unknown | Unknown | | 1 | 2 |
| 513 | SwinTextSpotter v2 | Academic | | Swin Transformer, improved detection-recognition synergy | | 1 | 2 |
| 514 | T5-3b(UnifiedSKG) | Unknown | Unknown | Unknown | | 1 | 2 |
| 515 | TABLET | Anonymous (arXiv 2025) | Unknown | Dual Transformer encoders; encoder-only architecture; row/column splitting as sequence labeling | | 1 | 2 |
| 516 | TAPAS-Large classifier with Counterfactual + Synthetic pre-training | Unknown | Unknown | Unknown | | 1 | 2 |
| 517 | TAPEX-Large | Unknown | Unknown | Unknown | | 1 | 2 |
| 518 | TPSNet | Unknown | Unknown | Unknown | | 1 | 2 |
| 519 | TRUST | Unknown | Unknown | Unknown | | 1 | 2 |
| 520 | TabStruct-Net | Unknown | Unknown | Unknown | | 1 | 2 |
| 521 | Table NLM | Unknown | Unknown | Unknown | | 1 | 2 |
| 522 | Table-BERT-Horizontal-T+F-Template | Unknown | Unknown | Unknown | | 1 | 2 |
| 523 | UniTable Large | Georgia Tech (Peng et al.) | Unknown | ViT encoder + autoregressive decoder; self-supervised pretraining on unannotated tabular images | | 1 | 2 |
| 524 | VAI-OCR | Unknown | Unknown | Unknown | | 1 | 2 |
| 525 | ViT-B/16 | Google | 86M | Vision Transformer | | 2 | 2 |
| 526 | ViTDet-H (MAE) | Meta AI | Unknown | Plain ViT-H backbone with simple feature pyramid, Cascade Mask RCNN head | | 1 | 2 |
| 527 | VideoPrism-g | | | | | 2 | 2 |
| 528 | biCVM+ | Unknown | Unknown | Unknown | | 2 | 2 |
| 529 | claude-3.5-sonnet | Unknown | Unknown | Unknown | | 1 | 2 |
| 530 | dots.mocr | | | | | 2 | 2 |
| 531 | gpt-4o-2024 | Unknown | Unknown | Unknown | | 1 | 2 |
| 532 | minicpm-v-4.5-8b | Unknown | Unknown | Unknown | | 1 | 2 |
| 533 | mistral-ocr-2512 | Unknown | Unknown | Unknown | | 2 | 2 |
| 534 | olmOCR v0.3.0 | Allen AI | | OCR Pipeline | | 1 | 2 |
| 535 | sail-vl2-8b | Unknown | Unknown | Unknown | | 1 | 2 |
| 536 | Self-Attention + CTC + language model | Unknown | Unknown | Unknown | | 1 | 1 |
| 537 | 3DGP | Unknown | Unknown | Unknown | | 1 | 1 |
| 538 | ABCNet v2 | TPAMI 2021 | | | | 1 | 1 |
| 539 | AIMv2-3B | Apple | 2.7B | Vision Transformer (Autoregressive Pre-trained) | | 1 | 1 |
| 540 | AIN 7B | Research | | Vision-Language Model | | 1 | 1 |
| 541 | ARTEMIS-DA | Unknown | Unknown | Unknown | | 1 | 1 |
| 542 | AWS Textract | Amazon Web Services | Unknown | Managed OCR + layout + table extraction service | | 1 | 1 |
| 543 | Abdallah | Unknown | Unknown | Unknown | | 1 | 1 |
| 544 | AlexNet | U. Toronto | | | | 1 | 1 |
| 545 | AlexNet + spatial pyramidal pooling + image resizing | Unknown | Unknown | Unknown | | 1 | 1 |
| 546 | Anthropic Haiku 4.5 | Anthropic | Unknown | Vision-language model (thinking enabled) | | 1 | 1 |
| 547 | ArabicNougat | community | | | | 1 | 1 |
| 548 | ArtDet-v2 | Sogou OCR team | Unknown | Scene text detector | | 1 | 1 |
| 549 | AttentionOCR_Inception-resnet-v2_Location | Unknown | Unknown | Unknown | | 1 | 1 |
| 550 | Azure Document Intelligence | Microsoft | Unknown | Managed layout + OCR extraction service | | 1 | 1 |
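
The page arithmetic in this listing (896 models, page 11 of 18, rows 501 to 550) can be checked with a short sketch. It assumes a page size of 50 rows, which the page does not state; the value is inferred from the row numbers shown here. The function names are hypothetical and are not part of any Codesota API.

```python
import math

# Assumption: 50 rows per page, inferred from rows 501-550 appearing on page 11.
PAGE_SIZE = 50


def page_count(total_rows: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed to list total_rows rows."""
    return math.ceil(total_rows / page_size)


def page_bounds(page: int, total_rows: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return the 1-based (first, last) row numbers shown on a 1-based page."""
    if not 1 <= page <= page_count(total_rows, page_size):
        raise ValueError(f"page {page} is out of range")
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total_rows)  # the final page may be short
    return first, last


print(page_count(896))       # 18
print(page_bounds(11, 896))  # (501, 550)
print(page_bounds(18, 896))  # (851, 896)
```

Under that assumption, the final page would hold 46 rows rather than 50.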