Codesota · Models · 1,357 models indexed · 896 match filter
Every model, measured.
Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.
Vendor: Areas overview; speakleash · 253; OpenAI · 85; Google · 71; Qwen · 52; Alibaba · 47; Anthropic · 44; Microsoft · 35; Meta · 30; Mistral · 30; DeepSeek · 28; google · 19; meta-llama · 19; mistralai · 19; Meta AI · 15; CYFRAGOVPL · 14; Zhipu AI · 13; NVIDIA · 10; SpeakLeash · 10; internlm · 10; xAI · 10; ByteDance · 9; Baidu · 8; PLLuM · 8; ibm-granite · 8; microsoft · 8; Amazon · 7; Google DeepMind · 7; MiniMax · 7; Mistral AI · 7; Remek · 7; Shanghai AI Lab · 7; allenai · 7; utter-project · 7; CohereForAI · 6; Microsoft Research · 6; Salesforce · 6; 01-ai · 5; Alibaba Cloud · 5; Cohere · 5; Moonshot AI · 5; NousResearch · 5; THUML · 5; deepseek-ai · 5; DeepMind · 4; Facebook AI · 4; IBM · 4; Meituan · 4; Stanford · 4; THUDM · 4; UC San Diego · 4; VikParuchuri · 4; gguf-iq · 4; nvidia · 4; openchat · 4; tiiuae · 4; Allen AI · 3; BAAI · 3; Du et al. · 3; ForgeCode · 3; Fudan University · 3; IDEA Research · 3; Liao et al. · 3; Moonshot.AI · 3; Nam Tuan Ly / NII · 3; OPI-PG · 3; OpenDataLab · 3; ViCoS Lab Ljubljana · 3; Xiaomi · 3; Zhao et al. · 3; gguf · 3; gguf11bv30 · 3; gguf7bv30 · 3; upstage · 3; + 247 smaller vendors (291 models)
§ 01 · Computer Vision models
896 models in Computer Vision · page 3 of 18.
| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 101 | Mistral OCR 3 | Mistral | — | Vision-Language Model | 1 | 2 | 6 |
| 102 | REL-RWMD k-NN | Unknown | Unknown | Unknown | 1 | 6 | 6 |
| 103 | SBD | Unknown | Unknown | Unknown | 1 | 2 | 6 |
| 104 | TRDLU | Unknown | Unknown | Unknown | 1 | 1 | 6 |
| 105 | TextBoxes++_MS | Unknown | Unknown | Unknown | 1 | 2 | 6 |
| 106 | TextMamba | Zhao et al. | Unknown | Mamba (SSM) + CNN | 1 | 2 | 6 |
| 107 | Transformer | Unknown | Unknown | Unknown | 1 | 6 | 6 |
| 108 | VSR | Unknown | Unknown | Unknown | 1 | 1 | 6 |
| 109 | BiLSTM (Europarl) | Unknown | Unknown | Unknown | 1 | 5 | 5 |
| 110 | EasyOCR | JaidedAI | — | Deep Learning OCR | 1 | 3 | 5 |
| 111 | GPT-2-Medium (prefix-tuning) | OpenAI | 355M | Transformer | 1 | 1 | 5 |
| 112 | Infinity-Parser2-Pro | — | — | — | 1 | 5 | 5 |
| 113 | PaddleOCR-VL | Baidu | 0.9B-7B | Vision-Language Model | 1 | 2 | 5 |
| 114 | AIMv2 ViT-3B/14 + Llama 3.0 8B | — | — | — | 1 | 4 | 4 |
| 115 | Bottom-Up Sum | Unknown | Unknown | Unknown | 1 | 1 | 4 |
| 116 | FAST-T-448 | Unknown | Unknown | Unknown | 1 | 1 | 4 |
| 117 | Yet Another Text Recognizer | Unknown | Unknown | Unknown | 1 | 4 | 4 |
| 118 | APE-Large | Tsinghua / MEGVII | Unknown | Aligned vision encoder + region-text alignment with EVA-02 ViT-L backbone | 1 | 1 | 3 |
| 119 | ASNMF-SRP | Zhong and Gao | — | — | 1 | 1 | 3 |
| 120 | CharNet H-88 (single-scale) | Unknown | Unknown | Unknown | 1 | 1 | 3 |
| 121 | DBNet++ (ResNet-50) (1024) | Liao et al. | Unknown | ResNet-50 + Differentiable Binarization + Adaptive Scale Fusion | 1 | 1 | 3 |
| 122 | DeepSolo (with pre-training) | ViTAE-Transformer | Unknown | DETR-like Transformer decoder with explicit points | 1 | 1 | 3 |
| 123 | EoMT (ViT-L) | — | — | — | 1 | 3 | 3 |
| 124 | InternImage-H | Shanghai AI Lab | — | Deformable Convolution | 1 | 3 | 3 |
| 125 | KB-to-Language Generation Model | Unknown | Unknown | Unknown | 1 | 1 | 3 |
| 126 | MinerU 2.5 | OpenDataLab | — | Document extraction pipeline | 1 | 2 | 3 |
| 127 | Pixel-level RC | Unknown | Unknown | Unknown | 1 | 3 | 3 |
| 128 | RapidOCR | Unknown | Unknown | Unknown | 1 | 1 | 3 |
| 129 | Re0 | Unknown | Unknown | Unknown | 1 | 1 | 3 |
| 130 | SumHiS | SumHiS Authors | — | Extractive summarization exploiting hidden document structure | 1 | 1 | 3 |
| 131 | V-JEPA 2 ViT-g (1B, 384px) | — | — | — | 1 | 3 | 3 |
| 132 | BEiT-L+ | — | — | — | 1 | 2 | 2 |
| 133 | BioGPT | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 134 | BioLinkBERT (large) | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 135 | CodeT5-base | Salesforce | — | T5 encoder-decoder pretrained on code | 1 | 2 | 2 |
| 136 | DTrOCR | Unknown | Unknown | Unknown | 1 | 2 | 2 |
| 137 | Gemini 2.0 Flash | — | — | Multimodal LLM | 1 | 2 | 2 |
| 138 | HEADoC-Large | — | 90.58M | Transformer | 1 | 2 | 2 |
| 139 | I2L-STRIPS | Unknown | Unknown | Unknown | 1 | 2 | 2 |
| 140 | LBDM | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 141 | MAGNET | Unknown | Unknown | Unknown | 1 | 2 | 2 |
| 142 | MBD | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 143 | PASTA | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 144 | SEMv3 | IFLYTEK / USTC (Zhang et al.) | Unknown | Keypoint Offset Regression (KOR) module; split-and-merge paradigm for table separation line detection | 1 | 1 | 2 |
| 145 | SPECTER | Unknown | Unknown | Unknown | 1 | 2 | 2 |
| 146 | SciNCL | Unknown | Unknown | Unknown | 1 | 2 | 2 |
| 147 | Selective Search | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 148 | Span | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 149 | Start, Follow, Read | Unknown | Unknown | Unknown | 1 | 1 | 2 |
| 150 | StrucTexTv2 (small) | Unknown | Unknown | Unknown | 1 | 2 | 2 |
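
The listing rule and the paging shown above ("page 3 of 18", rows 101 to 150 of 896) can be sketched as follows. This is a minimal illustration, not the site's implementation: the page size of 50 is inferred from the row numbers on this page, and the names `PAGE_SIZE`, `has_score`, `page_count`, and `page_bounds`, as well as the `results` dictionary key, are hypothetical.

```python
import math

# Assumption: 50 rows per page, inferred from rows 101-150 appearing on page 3.
PAGE_SIZE = 50


def has_score(model: dict) -> bool:
    """Listing rule: a model appears only if it has at least one recorded result.

    The "results" key is a hypothetical field name for the Results column.
    """
    return model.get("results", 0) > 0


def page_count(total: int, page_size: int = PAGE_SIZE) -> int:
    """Number of pages needed to show `total` rows."""
    return math.ceil(total / page_size)


def page_bounds(page: int, total: int, page_size: int = PAGE_SIZE) -> tuple[int, int]:
    """Return the 1-based (first, last) row numbers shown on `page`."""
    if not 1 <= page <= page_count(total, page_size):
        raise ValueError("page out of range")
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total)  # the final page may be shorter
    return first, last
```

Under these assumptions, `page_count(896)` gives 18 and `page_bounds(3, 896)` gives `(101, 150)`, matching the header and row numbers of this page; the last page would cover rows 851 to 896.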