Codesota · Models · 1,357 models indexed · 88 match filter
Every model, measured.
Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.
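The inclusion rule above can be sketched in a few lines. This is a minimal illustration, not the site's actual implementation: the `Model` class, the `scores` field, and the ordering rule (more benchmarks first, then name by code point) are assumptions inferred from how the table appears to be sorted, and the model names and scores are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    # Hypothetical record: a model name plus benchmark name -> score.
    name: str
    scores: dict[str, float] = field(default_factory=dict)


def rankable(models: list[Model]) -> list[Model]:
    """Keep only models with at least one recorded benchmark score."""
    return [m for m in models if m.scores]


def index_order(models: list[Model]) -> list[Model]:
    """Assumed ordering: benchmark count descending, then name ascending."""
    return sorted(rankable(models), key=lambda m: (-len(m.scores), m.name))


# Made-up example data, for illustration only.
demo = [
    Model("model-b", {"bench-x": 0.5}),
    Model("model-a", {"bench-x": 0.7, "bench-y": 0.6}),
    Model("model-c"),  # no recorded score, so it is excluded
    Model("Model-d", {"bench-y": 0.4}),
]

print([m.name for m in index_order(demo)])
# prints ['model-a', 'Model-d', 'model-b']
```

Python compares strings by code point, so uppercase names sort before lowercase ones within the same benchmark count.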
Vendor: Areas overview; speakleash · 253, OpenAI · 85, Google · 71, Qwen · 52, Alibaba · 47, Anthropic · 44, Microsoft · 35, Meta · 30, Mistral · 30, DeepSeek · 28, google · 19, meta-llama · 19, mistralai · 19, Meta AI · 15, CYFRAGOVPL · 14, Zhipu AI · 13, NVIDIA · 10, SpeakLeash · 10, internlm · 10, xAI · 10, ByteDance · 9, Baidu · 8, PLLuM · 8, ibm-granite · 8, microsoft · 8, Amazon · 7, Google DeepMind · 7, MiniMax · 7, Mistral AI · 7, Remek · 7, Shanghai AI Lab · 7, allenai · 7, utter-project · 7, CohereForAI · 6, Microsoft Research · 6, Salesforce · 6, 01-ai · 5, Alibaba Cloud · 5, Cohere · 5, Moonshot AI · 5, NousResearch · 5, THUML · 5, deepseek-ai · 5, DeepMind · 4, Facebook AI · 4, IBM · 4, Meituan · 4, Stanford · 4, THUDM · 4, UC San Diego · 4, VikParuchuri · 4, gguf-iq · 4, nvidia · 4, openchat · 4, tiiuae · 4, Allen AI · 3, BAAI · 3, Du et al. · 3, ForgeCode · 3, Fudan University · 3, IDEA Research · 3, Liao et al. · 3, Moonshot.AI · 3, Nam Tuan Ly / NII · 3, OPI-PG · 3, OpenDataLab · 3, ViCoS Lab Ljubljana · 3, Xiaomi · 3, Zhao et al. · 3, gguf · 3, gguf11bv30 · 3, gguf7bv30 · 3, upstage · 3, + 247 smaller vendors (291 models)
§ 01 · Multimodal models
88 models in Multimodal · page 2 of 2.
| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 051 | MiniCPM-V 4.6-Thinking (16x) | — | — | — | 5 | 5 | |
| 052 | Qwen2.5-VL 72B | Alibaba | 72B | Vision-Language Model | 5 | 5 | |
| 053 | ZAYA1-VL-8B | — | — | — | 5 | 5 | |
| 054 | GPT-4V | Unknown | Unknown | Transformer | 4 | 4 | |
| 055 | GPT-5.1 | OpenAI | — | — | 4 | 4 | |
| 056 | GPT-5.2 | OpenAI | — | — | 4 | 4 | |
| 057 | Gemma 4 31B | — | — | — | 4 | 4 | |
| 058 | Llama 3-V (405B) | — | — | — | 4 | 4 | |
| 059 | ALIGN | — | — | — | 3 | 3 | |
| 060 | AltCLIP | — | — | — | 3 | 3 | |
| 061 | BAGEL (7B MoT) | — | — | — | 3 | 3 | |
| 062 | LLaVA-1.5 | UW-Madison / Microsoft | Unknown | CLIP ViT-L + MLP projector + Vicuna-13B | 3 | 3 | |
| 063 | MiniMax-VL-01 | — | — | — | 3 | 3 | |
| 064 | Qwen3-Omni-30B-A3B-Base-202507 | — | — | — | 3 | 3 | |
| 065 | Qwen3-Omni-Flash-Thinking | — | — | — | 3 | 3 | |
| 066 | qwen2.5-vl-7b | Unknown | Unknown | Unknown | 3 | 3 | |
| 067 | Flamingo (32-shot) | — | — | — | 2 | 2 | |
| 068 | GLIPv2-H (fine-tuned) | — | — | — | 2 | 2 | |
| 069 | GPT-5.1 Instant | OpenAI | — | — | 2 | 2 | |
| 070 | GPT-5.1 Thinking | OpenAI | — | — | 2 | 2 | |
| 071 | Llama 3.2 Vision 90B | Meta | Unknown | Llama 3.1 + cross-attention vision adapter | 2 | 2 | |
| 072 | LongVU | — | — | — | 2 | 2 | |
| 073 | Qwen3.5-122B-A10B | Alibaba Cloud | — | — | 2 | 2 | |
| 074 | Qwen3.5-27B | Alibaba Cloud | — | — | 2 | 2 | |
| 075 | Qwen3.5-397B-A17B | Alibaba | — | — | 2 | 2 | |
| 076 | AsymFLUX.2 klein | — | — | — | 1 | 1 | |
| 077 | BAGEL (7B MoT) with LLM rewriter | — | — | — | 1 | 1 | |
| 078 | BLIP CapFilt-L | — | — | — | 1 | 1 | |
| 079 | BLIP-2 ViT-g FlanT5 XXL | — | — | — | 1 | 1 | |
| 080 | BLIP-2 ViT-g OPT 6.7B | — | — | — | 1 | 1 | |
| 081 | Chameleon-MultiTask | — | — | — | 1 | 1 | |
| 082 | CoCa | — | Unknown | Image encoder + cross-attention + causal decoder | 1 | 1 | |
| 083 | Emu3.5 (34B, AR) | — | — | — | 1 | 1 | |
| 084 | Grok-1.5V | — | — | — | 1 | 1 | |
| 085 | Lumina-DiMOO | — | — | — | 1 | 1 | |
| 086 | Qwen2.5-VL-3B | — | — | — | 1 | 1 | |
| 087 | SiLVR | — | — | — | 1 | 1 | |
| 088 | Spectral Progressive Diffusion (PixelGen, TF) | — | — | — | 1 | 1 | |