Codesota · Models · 1,357 models indexed · 896 match filter
Every model, measured.
Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.
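The inclusion rule above can be sketched as a simple filter. This is a minimal illustration, not the site's actual implementation: the `Model` class, the `rankable` function, and the sample names and scores are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Model:
    # Hypothetical record: a model name plus its benchmark scores.
    name: str
    scores: Dict[str, float] = field(default_factory=dict)  # benchmark -> score


def rankable(models: List[Model]) -> List[Model]:
    """Keep only models with at least one recorded benchmark score."""
    return [m for m in models if m.scores]


# Made-up sample data for illustration only.
models = [
    Model("model-a", {"benchmark-x": 0.91}),
    Model("model-b"),  # no recorded score, so it is left out of the index
    Model("model-c", {"benchmark-x": 0.88, "benchmark-y": 0.40}),
]

listed = rankable(models)
print([m.name for m in listed])
```

A model with an empty score mapping is dropped before ranking, which matches the statement that an unscored model cannot be ranked.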
Vendor: Areas overview, speakleash · 253, OpenAI · 85, Google · 71, Qwen · 52, Alibaba · 47, Anthropic · 44, Microsoft · 35, Meta · 30, Mistral · 30, DeepSeek · 28, google · 19, meta-llama · 19, mistralai · 19, Meta AI · 15, CYFRAGOVPL · 14, Zhipu AI · 13, NVIDIA · 10, SpeakLeash · 10, internlm · 10, xAI · 10, ByteDance · 9, Baidu · 8, PLLuM · 8, ibm-granite · 8, microsoft · 8, Amazon · 7, Google DeepMind · 7, MiniMax · 7, Mistral AI · 7, Remek · 7, Shanghai AI Lab · 7, allenai · 7, utter-project · 7, CohereForAI · 6, Microsoft Research · 6, Salesforce · 6, 01-ai · 5, Alibaba Cloud · 5, Cohere · 5, Moonshot AI · 5, NousResearch · 5, THUML · 5, deepseek-ai · 5, DeepMind · 4, Facebook AI · 4, IBM · 4, Meituan · 4, Stanford · 4, THUDM · 4, UC San Diego · 4, VikParuchuri · 4, gguf-iq · 4, nvidia · 4, openchat · 4, tiiuae · 4, Allen AI · 3, BAAI · 3, Du et al. · 3, ForgeCode · 3, Fudan University · 3, IDEA Research · 3, Liao et al. · 3, Moonshot.AI · 3, Nam Tuan Ly / NII · 3, OPI-PG · 3, OpenDataLab · 3, ViCoS Lab Ljubljana · 3, Xiaomi · 3, Zhao et al. · 3, gguf · 3, gguf11bv30 · 3, gguf7bv30 · 3, upstage · 3, + 247 smaller vendors (291 models)
§ 01 · Computer Vision models
896 models in Computer Vision · page 10 of 18.
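The page arithmetic can be checked with a short sketch. The page size of 50 is an assumption inferred from this page listing rows 451 to 500; `page_count` and `page_rows` are hypothetical helper names, not part of the site.

```python
import math

PAGE_SIZE = 50  # assumption: inferred from rows 451-500 appearing on page 10


def page_count(total: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed to list `total` rows."""
    return math.ceil(total / page_size)


def page_rows(page: int, total: int, page_size: int = PAGE_SIZE) -> tuple:
    """First and last 1-based row numbers shown on a given 1-based page."""
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total)
    return first, last


print(page_count(896))     # 18
print(page_rows(10, 896))  # (451, 500)
print(page_rows(18, 896))  # (851, 896)
```

Under that assumption, 896 models fill 18 pages, and the last page holds the remaining 46 rows.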
| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 451 | ABINet++ | Unknown | Unknown | Unknown | 1 | 2 | |
| 452 | AON | Unknown | Unknown | Unknown | 2 | 2 | |
| 453 | BERT classifier w/o Table | Unknown | Unknown | Unknown | 1 | 2 | |
| 454 | Bert-to-Bert | — | — | — | 1 | 2 | |
| 455 | Bi+ | Unknown | Unknown | Unknown | 2 | 2 | |
| 456 | CA-FCN | Unknown | Unknown | Unknown | 2 | 2 | |
| 457 | CHAR | Unknown | Unknown | Unknown | 2 | 2 | |
| 458 | CNKI | Unknown | Unknown | Unknown | 1 | 2 | |
| 459 | CodeBERT | Microsoft | Unknown | BERT | 2 | 2 | |
| 460 | CodeTrans-TF-Large | Unknown | Unknown | Unknown | 2 | 2 | |
| 461 | ConvNeXt V2 Huge | Meta | 650M | CNN | 2 | 2 | |
| 462 | DEER | Unknown | Unknown | Unknown | 1 | 2 | |
| 463 | DINO (ViT-B/8) | — | — | — | 2 | 2 | |
| 464 | DNTextSpotter (ResNet-50) | Academic | — | Denoising Transformer, ResNet-50 backbone | 1 | 2 | |
| 465 | DNTextSpotter (ViTAEv2-S) | Academic | — | Denoising Transformer, ViTAEv2-S backbone | 1 | 2 | |
| 466 | DRetHTR-base | Unknown | Unknown | Decoder-only Retentive Network (RetNet) | 1 | 2 | |
| 467 | Decouple Attention Network | Unknown | Unknown | Unknown | 1 | 2 | |
| 468 | DeepSeek-OCR-2 | — | — | — | 2 | 2 | |
| 469 | DeiT-B Distilled | Meta | 86M | Vision Transformer | 2 | 2 | |
| 470 | EVA-02-L | BAAI / PKU | — | ViT-L with EVA-CLIP pre-training (O365 + ImageNet-21K) | 1 | 2 | |
| 471 | EfficientNet-B7 | — | 66M | CNN | 2 | 2 | |
| 472 | FA+RL | — | — | — | 1 | 2 | |
| 473 | Falcon-OCR | — | — | — | 2 | 2 | |
| 474 | Faster R-CNN (VGG-16) | Microsoft Research | ~137M | Two-stage detector: RPN + Fast R-CNN with VGG-16 backbone | 1 | 2 | |
| 475 | Field-gating Seq2seq + dual attention | Unknown | Unknown | Unknown | 1 | 2 | |
| 476 | Field-gating Seq2seq + dual attention + beam search | Unknown | Unknown | Unknown | 1 | 2 | |
| 477 | Florence-2-Large | Microsoft | — | — | 1 | 2 | |
| 478 | GLIPv2-H (fine-tuned) | — | — | — | 2 | 2 | |
| 479 | GridFormer | Anonymous (arXiv 2023) | Unknown | Grid vertex and edge prediction via Transformer; handles unconstrained table structures | 1 | 2 | |
| 480 | HEADoC-Base | — | 27.7M | Transformer | 2 | 2 | |
| 481 | HTR-JAND | Unknown | 1.5M | CNN (FullGatedConv2d + SE blocks) + Combined Attention + Knowledge Distillation | 1 | 2 | |
| 482 | IAST | Academic | — | Reading-Order Estimation + Dynamic Sampling | 1 | 2 | |
| 483 | Infinity-Parser 7B | Unknown | 7B | Vision-Language Model | 1 | 2 | |
| 484 | InternViT-6B (InternVL) | OpenGVLab | 6B | InternViT vision encoder (InternVL family) | 2 | 2 | |
| 485 | KOSMOS-2.5 | Microsoft | — | — | 1 | 2 | |
| 486 | LGPMA | Unknown | Unknown | Unknown | 1 | 2 | |
| 487 | LRANet++ | Academic | — | Low-Rank Approximation Network | 1 | 2 | |
| 488 | LSGSpotter | Academic | — | Arbitrary Reading Order, Local Semantics Guidance | 1 | 2 | |
| 489 | LSTM-reg (single model) | Unknown | Unknown | Unknown | 2 | 2 | |
| 490 | LeJEPA ViT-L (304M) | — | — | — | 2 | 2 | |
| 491 | Leaky LP Cell | Unknown | Unknown | Unknown | 1 | 2 | |
| 492 | Marker 1.10.0 | VikParuchuri | — | PDF Parser | 1 | 2 | |
| 493 | MaskTextSpotter v2 | Unknown | Unknown | Unknown | 1 | 2 | |
| 494 | MetaWriter | Unknown | Unknown | Meta-learned prompt tuning on HTR backbone (CVPR 2025) | 1 | 2 | |
| 495 | MinerU2.5 | — | — | — | 2 | 2 | |
| 496 | MonkeyOCR-3B | MonkeyOCR | — | — | 1 | 2 | |
| 497 | Multi-Task Learning Model | Unknown | Unknown | Unknown | 1 | 2 | |
| 498 | NCP+BTA | — | — | — | 1 | 2 | |
| 499 | NRTR+TPS++ | Unknown | Unknown | Unknown | 2 | 2 | |
| 500 | OrigamiNet-18 | Unknown | Unknown | Unknown | 1 | 2 | |