Codesota · Models · 1,357 models indexed · 896 match filter

Every model, measured.

Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.
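
The inclusion rule above can be sketched in a few lines. This is only an illustration, not the site's actual code: the `Model` class, its `scores` field, the `rankable` helper, and the sample entries are all hypothetical names made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    # Hypothetical record: a model name plus its recorded benchmark scores.
    name: str
    scores: dict[str, float] = field(default_factory=dict)  # benchmark -> score


def rankable(models: list[Model]) -> list[Model]:
    """Keep only models that have at least one recorded benchmark score."""
    return [m for m in models if m.scores]


# Made-up sample data, for illustration only.
models = [
    Model("model-a", {"benchmark-x": 0.91}),
    Model("model-b"),  # no recorded score, so it cannot be ranked
    Model("model-c", {"benchmark-x": 0.87, "benchmark-y": 0.42}),
]

indexed = rankable(models)
print([m.name for m in indexed])
```

An empty `scores` mapping is falsy in Python, so the filter drops exactly the models that have nothing recorded.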

Vendor: Areas overview | speakleash · 253 | OpenAI · 85 | Google · 71 | Qwen · 52 | Alibaba · 47 | Anthropic · 44 | Microsoft · 35 | Meta · 30 | Mistral · 30 | DeepSeek · 28 | google · 19 | meta-llama · 19 | mistralai · 19 | Meta AI · 15 | CYFRAGOVPL · 14 | Zhipu AI · 13 | NVIDIA · 10 | SpeakLeash · 10 | internlm · 10 | xAI · 10 | ByteDance · 9 | Baidu · 8 | PLLuM · 8 | ibm-granite · 8 | microsoft · 8 | Amazon · 7 | Google DeepMind · 7 | MiniMax · 7 | Mistral AI · 7 | Remek · 7 | Shanghai AI Lab · 7 | allenai · 7 | utter-project · 7 | CohereForAI · 6 | Microsoft Research · 6 | Salesforce · 6 | 01-ai · 5 | Alibaba Cloud · 5 | Cohere · 5 | Moonshot AI · 5 | NousResearch · 5 | THUML · 5 | deepseek-ai · 5 | DeepMind · 4 | Facebook AI · 4 | IBM · 4 | Meituan · 4 | Stanford · 4 | THUDM · 4 | UC San Diego · 4 | VikParuchuri · 4 | gguf-iq · 4 | nvidia · 4 | openchat · 4 | tiiuae · 4 | Allen AI · 3 | BAAI · 3 | Du et al. · 3 | ForgeCode · 3 | Fudan University · 3 | IDEA Research · 3 | Liao et al. · 3 | Moonshot.AI · 3 | Nam Tuan Ly / NII · 3 | OPI-PG · 3 | OpenDataLab · 3 | ViCoS Lab Ljubljana · 3 | Xiaomi · 3 | Zhao et al. · 3 | gguf · 3 | gguf11bv30 · 3 | gguf7bv30 · 3 | upstage · 3 | + 247 smaller vendors (291 models)
§ 01 · Computer Vision models

896 models in Computer Vision · page 1 of 18.

| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 001 | GPT-4o | OpenAI | Undisclosed | Multimodal LLM | 15 | 45 | 57 |
| 002 | HTLM (fine-tuning) | Unknown | Unknown | Unknown | 11 | 5 | 20 |
| 003 | fglihai | Unknown | Unknown | Unknown | 11 | 2 | 12 |
| 004 | GPT-2-Large (fine-tuning) | Unknown | Unknown | Unknown | 7 | 5 | 20 |
| 005 | ELSC | Unknown | Unknown | Unknown | 7 | 7 | 7 |
| 006 | Hybrid DLA (Shehzadi et al.) | DFKI / TU Kaiserslautern | Unknown | Transformer object detector with query encoding + hybrid one-to-one/one-to-many matching | 6 | 1 | 6 |
| 007 | StackMix+Blots | Unknown | Unknown | Unknown | 6 | 6 | 6 |
| 008 | TextFuseNet (ResNeXt-101) | Unknown | Unknown | Unknown | 5 | 6 | 16 |
| 009 | USYD NLP_CS29-2 | Unknown | Unknown | Unknown | 5 | 1 | 6 |
| 010 | XLMft UDA | Unknown | Unknown | Unknown | 5 | 5 | 5 |
| 011 | CRAFT | Unknown | Unknown | Unknown | 4 | 7 | 21 |
| 012 | CLIP4STR-L (DataComp-1B) | Unknown | Unknown | Unknown | 4 | 9 | 9 |
| 013 | DINOv3 (7B) | | | | 4 | 8 | 8 |
| 014 | DTrOCR 105M | Unknown | Unknown | Unknown | 4 | 8 | 8 |
| 015 | ApproxRepSet | Unknown | Unknown | Unknown | 4 | 6 | 6 |
| 016 | CCD-ViT-Small | Unknown | Unknown | Unknown | 4 | 4 | 5 |
| 017 | T5B Baseline | Unknown | Unknown | Unknown | 4 | 1 | 5 |
| 018 | GFCN | Unknown | Unknown | Unknown | 4 | 2 | 4 |
| 019 | Claude Sonnet 4 | Anthropic | | Multimodal LLM | 3 | 15 | 21 |
| 020 | Gemini 1.5 Pro | Google | | Multimodal LLM | 3 | 17 | 21 |
| 021 | Gemini 2.5 Pro | Google | | Multimodal LLM | 3 | 15 | 16 |
| 022 | Qianfan-OCR | Baidu Qianfan | 4B | End-to-end VLM (4B params) | 3 | 4 | 16 |
| 023 | LightOnOCR-2-1B | LightOn | 1B | Vision-Language Model (1B params) | 3 | 1 | 9 |
| 024 | EDD | Unknown | Unknown | Unknown | 3 | 3 | 7 |
| 025 | VGT | Unknown | Unknown | Unknown | 3 | 2 | 7 |
| 026 | BRIO | Yale NLP | Unknown | BART-large with contrastive learning objective | 3 | 2 | 6 |
| 027 | BigBird-Pegasus | Unknown | Unknown | Unknown | 3 | 2 | 6 |
| 028 | Habitat-Web | Unknown | Unknown | Unknown | 3 | 2 | 6 |
| 029 | UNITS | Unknown | Unknown | Unknown | 3 | 2 | 5 |
| 030 | Optimized Text CNN | Unknown | Unknown | Unknown | 3 | 2 | 4 |
| 031 | AKHCRNet | Unknown | Unknown | Unknown | 3 | 1 | 3 |
| 032 | BPDO | Zheng et al. | Unknown | ResNet-50 + FPN + DCN + Text-Aware Module + Dynamic Optimization Module | 3 | 1 | 3 |
| 033 | CPN (Complementary Proposal Network) | Longhuang Wu et al. | Unknown | Deformable Morphology Semantic Network + Balanced Region Proposal Network + Interleaved Feature Attention | 3 | 1 | 3 |
| 034 | CodeTrans-MT-Base | Unknown | Unknown | Unknown | 3 | 3 | 3 |
| 035 | ContourNet [69] | Unknown | Unknown | Unknown | 3 | 1 | 3 |
| 036 | DAT-SEG | Wan et al. (Baidu) | Unknown | Interactive attention transformer with segmentation head for multi-granularity text detection | 3 | 1 | 3 |
| 037 | HierarchicalEncoder + NR + IR | Unknown | Unknown | Unknown | 3 | 1 | 3 |
| 038 | PCGAN-CHAR | Unknown | Unknown | Unknown | 3 | 3 | 3 |
| 039 | Segment Anything Model (SAM) | Unknown | | | 3 | 3 | 3 |
| 040 | Claude 3.5 Sonnet | Anthropic | Undisclosed | Multimodal LLM | 2 | 27 | 32 |
| 041 | Qwen3.5-397B-A17B | Alibaba | | | 2 | 14 | 20 |
| 042 | Faster R-CNN | Microsoft Research | Unknown | Unknown | 2 | 4 | 19 |
| 043 | Qwen2-VL 72B | Alibaba | | Vision-Language Model | 2 | 12 | 18 |
| 044 | CLIP4STR-L | Unknown | Unknown | Unknown | 2 | 10 | 10 |
| 045 | DAN | Unknown | Unknown | Unknown | 2 | 7 | 10 |
| 046 | Chandra v0.1.0 | datalab-to | 9B | Vision-Language OCR Model | 2 | 1 | 9 |
| 047 | Ovis2.5-9B | | | | 2 | 8 | 9 |
| 048 | DETR | Meta AI / FAIR | Unknown | Unknown | 2 | 2 | 8 |
| 049 | FAST-T-512 | Unknown | Unknown | Unknown | 2 | 2 | 8 |
| 050 | DeepSolo (ViTAEv2-S, TextOCR) | Unknown | Unknown | Unknown | 2 | 3 | 7 |