Codesota · Models · 1,357 models indexed · 896 match filter
Every model, measured.
Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.
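
The inclusion rule above (a model needs at least one recorded benchmark score to be ranked) can be sketched as a simple filter. This is a minimal illustration, not the site's implementation: the `Model` class, its `scores` field, and the placeholder names `model-a` and `model-b` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Model:
    # Hypothetical record shape: benchmark name -> recorded score.
    name: str
    scores: Dict[str, float] = field(default_factory=dict)


def rankable(models: List[Model]) -> List[Model]:
    """Keep only models that have at least one recorded benchmark score."""
    return [m for m in models if len(m.scores) > 0]


# Placeholder data for illustration only.
catalog = [
    Model("model-a", {"benchmark-x": 0.91}),
    Model("model-b"),  # no recorded score, so it cannot be ranked
]

print([m.name for m in rankable(catalog)])  # ['model-a']
```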
Vendor: Areas overview, speakleash · 253, OpenAI · 85, Google · 71, Qwen · 52, Alibaba · 47, Anthropic · 44, Microsoft · 35, Meta · 30, Mistral · 30, DeepSeek · 28, google · 19, meta-llama · 19, mistralai · 19, Meta AI · 15, CYFRAGOVPL · 14, Zhipu AI · 13, NVIDIA · 10, SpeakLeash · 10, internlm · 10, xAI · 10, ByteDance · 9, Baidu · 8, PLLuM · 8, ibm-granite · 8, microsoft · 8, Amazon · 7, Google DeepMind · 7, MiniMax · 7, Mistral AI · 7, Remek · 7, Shanghai AI Lab · 7, allenai · 7, utter-project · 7, CohereForAI · 6, Microsoft Research · 6, Salesforce · 6, 01-ai · 5, Alibaba Cloud · 5, Cohere · 5, Moonshot AI · 5, NousResearch · 5, THUML · 5, deepseek-ai · 5, DeepMind · 4, Facebook AI · 4, IBM · 4, Meituan · 4, Stanford · 4, THUDM · 4, UC San Diego · 4, VikParuchuri · 4, gguf-iq · 4, nvidia · 4, openchat · 4, tiiuae · 4, Allen AI · 3, BAAI · 3, Du et al. · 3, ForgeCode · 3, Fudan University · 3, IDEA Research · 3, Liao et al. · 3, Moonshot.AI · 3, Nam Tuan Ly / NII · 3, OPI-PG · 3, OpenDataLab · 3, ViCoS Lab Ljubljana · 3, Xiaomi · 3, Zhao et al. · 3, gguf · 3, gguf11bv30 · 3, gguf7bv30 · 3, upstage · 3, + 247 smaller vendors (291 models)
§ 01 · Computer Vision models
896 models in Computer Vision · page 6 of 18.
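
The page count follows from the totals: the rows on this page are numbered 251 to 300, which suggests 50 models per page, and 896 models at 50 per page gives 18 pages. A minimal sketch of that arithmetic, assuming a fixed page size of 50 (inferred from the row numbers, not stated by the site) and a hypothetical helper named `page_bounds`:

```python
import math


def page_bounds(total: int, page: int, page_size: int):
    """Return (page_count, first_row, last_row) for a 1-indexed page."""
    pages = math.ceil(total / page_size)
    if not 1 <= page <= pages:
        raise ValueError(f"page must be between 1 and {pages}")
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total)  # the final page may be short
    return pages, first, last


print(page_bounds(896, 6, 50))   # (18, 251, 300)
print(page_bounds(896, 18, 50))  # (18, 851, 896)
```
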
| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 251 | Massively Multilingual Sentence Embeddings | Unknown | Unknown | Unknown | 7 | 7 | |
| 252 | MultiCCA + CNN | Unknown | Unknown | Unknown | 7 | 7 | |
| 253 | PARSeq | Research | Unknown | Scene Text Recognition with Permuted Autoregressive Sequence Models | 6 | 7 | |
| 254 | SRFormer (ResNet-50) | Unknown | Unknown | Unknown | 3 | 7 | |
| 255 | VideoLLaMA3 7B | — | — | — | 7 | 7 | |
| 256 | pre-train w/ code only | Unknown | Unknown | Unknown | 7 | 7 | |
| 257 | seq2seq | Unknown | Unknown | Unknown | 7 | 7 | |
| 258 | CDistNet (Ours) | Unknown | Unknown | Unknown | 6 | 6 | |
| 259 | CRNN | Unknown | Unknown | Unknown | 5 | 6 | |
| 260 | CharNet H-88 | Unknown | Unknown | Unknown | 2 | 6 | |
| 261 | CharNet H-88 (multi-scale) | Unknown | Unknown | Unknown | 2 | 6 | |
| 262 | DPAN | Unknown | Unknown | Unknown | 6 | 6 | |
| 263 | DiffusionSTR | Unknown | Unknown | Unknown | 6 | 6 | |
| 264 | EK-Net | Zhu et al. | Unknown | ResNet-18 + Expand Kernel Distance | 2 | 6 | |
| 265 | FOTS MS | Unknown | Unknown | Unknown | 2 | 6 | |
| 266 | FTSN + MNMS | Unknown | Unknown | Unknown | 2 | 6 | |
| 267 | GLAM | Unknown | Unknown | Unknown | 1 | 6 | |
| 268 | GNNets | Unknown | Unknown | Unknown | 2 | 6 | |
| 269 | HTR-ConvText | DAIR-Group | 65.9M | CNN+Transformer hybrid (ConvText block) | 3 | 6 | |
| 270 | HTR-VT | Unknown | Unknown | Unknown | 3 | 6 | |
| 271 | InternVL3-78B | Shanghai AI Lab | 78B | Vision-Language Model | 5 | 6 | |
| 272 | LayoutLMv3-B | Unknown | Unknown | Unknown | 1 | 6 | |
| 273 | PAN-640 | Unknown | Unknown | Unknown | 2 | 6 | |
| 274 | PixelLink+VGG16 2s | Unknown | Unknown | Unknown | 2 | 6 | |
| 275 | ResNext-101-32×8d | Unknown | Unknown | Unknown | 1 | 6 | |
| 276 | S-GTR | Unknown | Unknown | Unknown | 6 | 6 | |
| 277 | SLPR | Unknown | Unknown | Unknown | 2 | 6 | |
| 278 | TextBPN++ (ResNet-50+DCN) | Zhang et al. | Unknown | ResNet-50 with Deformable Convolution + Boundary Transformer | 2 | 6 | |
| 279 | TrOCR-base 334M | Unknown | Unknown | Unknown | 6 | 6 | |
| 280 | TrOCR-large 558M | Unknown | Unknown | Unknown | 6 | 6 | |
| 281 | UDoc | Unknown | Unknown | Unknown | 1 | 6 | |
| 282 | VAN | Unknown | Unknown | Unknown | 3 | 6 | |
| 283 | VideoLLaMA3 2B | — | — | — | 6 | 6 | |
| 284 | WordSup (VGG16-synth-icdar) | Unknown | Unknown | Unknown | 2 | 6 | |
| 285 | ABINet-LV | Fang et al. | Unknown | ResNet + Bidirectional Language Model (LV) | 5 | 5 | |
| 286 | BART-base (STSM) | Meta | 139M | Transformer | 1 | 5 | |
| 287 | CodeBERT (RTD) | Unknown | Unknown | Unknown | 5 | 5 | |
| 288 | DPText-DETR (ResNet-50) | Unknown | Unknown | Unknown | 2 | 5 | |
| 289 | FLAN-T5-base (STSM) | | 250M | Transformer | 1 | 5 | |
| 290 | FactJointGT | Unknown | Unknown | Unknown | 1 | 5 | |
| 291 | GLASS | Unknown | Unknown | Unknown | 2 | 5 | |
| 292 | GPT-2-Medium (fine-tuning) | OpenAI | 355M | Transformer | 1 | 5 | |
| 293 | HTLM (prefix-tuning) | Unknown | Unknown | Transformer | 1 | 5 | |
| 294 | JointGT Baseline | Unknown | Unknown | Unknown | 1 | 5 | |
| 295 | MaskTextSpotter v3 | Unknown | Unknown | Unknown | 2 | 5 | |
| 296 | MiniCPM-Llama3-V 2.5 | — | — | — | 5 | 5 | |
| 297 | MiniCPM-V 4.6-Thinking (16x) | — | — | — | 5 | 5 | |
| 298 | Qwen2.5-VL 72B | Alibaba | 72B | Vision-Language Model | 5 | 5 | |
| 299 | SIGA_T | Unknown | Unknown | Unknown | 5 | 5 | |
| 300 | SPTS v2 | Unknown | Unknown | Unknown | 2 | 5 | |
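
The rows on this page appear to be ordered by benchmark count, descending, with ties broken by model name in code-point order (uppercase letters sort before lowercase, so "VideoLLaMA3 7B" precedes "seq2seq"). That ordering is inferred from the listing and is not documented by the site. A minimal sketch that reproduces it for five rows taken from the table:

```python
# (model name, benchmark count) pairs copied from the table, in shuffled order.
rows = [
    ("seq2seq", 7),
    ("CRNN", 6),
    ("VideoLLaMA3 7B", 7),
    ("CharNet H-88", 6),
    ("CDistNet (Ours)", 6),
]

# Sort by benchmark count descending, then by name using Python's default
# string comparison, which compares Unicode code points.
ordered = sorted(rows, key=lambda r: (-r[1], r[0]))

for name, benchmarks in ordered:
    print(benchmarks, name)
```

The result matches the relative order of rows 255, 257, 258, 259, and 260 in the table.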