Codesota · Models · 1,368 models indexed · 88 match filter

Every model, measured.

Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.

Vendor (models indexed): speakleash (253) · OpenAI (85) · Google (71) · Qwen (60) · Alibaba (47) · Anthropic (44) · Microsoft (35) · Meta (30) · Mistral (30) · DeepSeek (28) · google (19) · meta-llama (19) · mistralai (19) · Meta AI (15) · CYFRAGOVPL (14) · Zhipu AI (13) · NVIDIA (10) · SpeakLeash (10) · internlm (10) · xAI (10) · ByteDance (9) · Baidu (8) · PLLuM (8) · ibm-granite (8) · microsoft (8) · Amazon (7) · Google DeepMind (7) · MiniMax (7) · Mistral AI (7) · Remek (7) · Shanghai AI Lab (7) · allenai (7) · utter-project (7) · CohereForAI (6) · Microsoft Research (6) · Salesforce (6) · 01-ai (5) · Alibaba Cloud (5) · Cohere (5) · Moonshot AI (5) · NousResearch (5) · THUML (5) · deepseek-ai (5) · DeepMind (4) · Facebook AI (4) · IBM (4) · Meituan (4) · Stanford (4) · THUDM (4) · UC San Diego (4) · VikParuchuri (4) · Xiaomi (4) · gguf-iq (4) · nvidia (4) · openchat (4) · tiiuae (4) · Allen AI (3) · BAAI (3) · Du et al. (3) · ForgeCode (3) · Fudan University (3) · IDEA Research (3) · Liao et al. (3) · Moonshot.AI (3) · Nam Tuan Ly / NII (3) · OPI-PG (3) · OpenDataLab (3) · StepFun (3) · ViCoS Lab Ljubljana (3) · Zhao et al. (3) · gguf (3) · gguf11bv30 (3) · gguf7bv30 (3) · upstage (3) · + 246 smaller vendors (290 models)
§ 01 · Multimodal models

88 models in Multimodal · page 1 of 2.

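| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 001 | GPT-4o | OpenAI | Undisclosed | Multimodal LLM | 15 | 45 | 57 |
| 002 | Gemini-3.1-Pro | Google | | | 4 | 3 | 11 |
| 003 | Gemini 1.5 Pro | Google | | Multimodal LLM | 3 | 17 | 21 |
| 004 | Qianfan-OCR | Baidu Qianfan | 4B | End-to-end VLM (4B params) | 3 | 4 | 16 |
| 005 | Qwen3.5-Omni-Plus | | | | 3 | 10 | 10 |
| 006 | Claude 3.5 Sonnet | Anthropic | Undisclosed | Multimodal LLM | 2 | 27 | 32 |
| 007 | Qwen3.5-397B-A17B | Alibaba | | | 2 | 14 | 20 |
| 008 | Qwen2-VL 72B | Alibaba | | Vision-Language Model | 2 | 12 | 18 |
| 009 | Ovis2.5-9B | | | | 2 | 8 | 9 |
| 010 | SenseNova-U1-A3B-MoT | SenseTime | | | 2 | 7 | 8 |
| 011 | Qwen3.6 Plus | Alibaba | | | 2 | 4 | 4 |
| 012 | BLIP ViT-L | | | | 2 | 2 | 2 |
| 013 | Qwen3-VL-235B-A22B-Instruct | Qwen | | | 1 | 13 | 14 |
| 014 | Gemini 3 Pro | Google | Undisclosed | | 1 | 11 | 13 |
| 015 | Qwen3.6-35B-A3B | | | | 1 | 11 | 11 |
| 016 | Qwen3.6-27B | | | | 1 | 10 | 10 |
| 017 | Gemini 2.5 Pro | | | | 1 | 9 | 9 |
| 018 | Intern-S1-Pro | Shanghai AI Lab | | | 1 | 6 | 8 |
| 019 | Audio Flamingo 3 | | | | 1 | 7 | 7 |
| 020 | Kimi K2.6 | | | | 1 | 6 | 6 |
| 021 | Infinity-Parser2-Pro | | | | 1 | 5 | 5 |
| 022 | AIMv2 ViT-3B/14 + Llama 3.0 8B | | | | 1 | 4 | 4 |
| 023 | BLIP-2 | Salesforce | Unknown | Frozen image encoder + Q-Former + frozen LLM | 1 | 3 | 3 |
| 024 | Gemini 2.0 Flash | Google | | Multimodal LLM | 1 | 2 | 2 |
| 025 | BLIP3o-NEXT-GRPO-GenEval (3B) | | | | 1 | 1 | 1 |
| 026 | Chameleon-SFT | | | | 1 | 1 | 1 |
| 027 | Lumina-DiMOO w/ Self-GRPO | | | | 1 | 1 | 1 |
| 028 | Qwen3.5-27B | Alibaba | | | | 11 | 17 |
| 029 | Qwen3.5-35B-A3B | Alibaba | | | | 11 | 17 |
| 030 | Kimi-K2.5 | Moonshot.AI | | | | 10 | 16 |
| 031 | Qwen3.5-122B-A10B | Alibaba | | | | 10 | 16 |
| 032 | Qwen2.5-VL-72B | | | | | 15 | 15 |
| 033 | Claude 3 Opus | Anthropic | | | | 8 | 14 |
| 034 | Qwen3-VL-235B-A22B-Thinking | Qwen | | | | 13 | 14 |
| 035 | Qwen3-VL-8B-Instruct | Qwen | | | | 13 | 14 |
| 036 | MiniCPM-o 4.5-Instruct | | | | | 11 | 11 |
| 037 | Qwen2-VL 7B | Alibaba | 7B | | | 11 | 11 |
| 038 | Qwen2-VL-2B | | | | | 10 | 10 |
| 039 | Gemini 2.5 Flash | | | | | 9 | 9 |
| 040 | InternVL2-76B | Shanghai AI Lab | 76B | Vision-Language Model | | 5 | 8 |
| 041 | Aria | | | | | 7 | 7 |
| 042 | LongCat-Flash-Omni | | | | | 7 | 7 |
| 043 | VideoLLaMA3 7B | | | | | 7 | 7 |
| 044 | BLIP3-o (8B) | | | | | 6 | 6 |
| 045 | Gemma 3 (27B, IT) | | | | | 6 | 6 |
| 046 | Gemma 4 31B | Google | | | | 5 | 6 |
| 047 | InternVL3-78B | Shanghai AI Lab | 78B | Vision-Language Model | | 5 | 6 |
| 048 | VideoLLaMA3 2B | | | | | 6 | 6 |
| 049 | Kimi-VL-A3B-Instruct | | | | | 5 | 5 |
| 050 | Kimi-VL-A3B-Thinking-2506 | | | | | 5 | 5 |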