Codesota · Models · 1,357 models indexed · 164 match filter

Every model, measured.

Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.

Vendor: Areas overview, speakleash · 253, OpenAI · 85, Google · 71, Qwen · 52, Alibaba · 47, Anthropic · 44, Microsoft · 35, Meta · 30, Mistral · 30, DeepSeek · 28, google · 19, meta-llama · 19, mistralai · 19, Meta AI · 15, CYFRAGOVPL · 14, Zhipu AI · 13, NVIDIA · 10, SpeakLeash · 10, internlm · 10, xAI · 10, ByteDance · 9, Baidu · 8, PLLuM · 8, ibm-granite · 8, microsoft · 8, Amazon · 7, Google DeepMind · 7, MiniMax · 7, Mistral AI · 7, Remek · 7, Shanghai AI Lab · 7, allenai · 7, utter-project · 7, CohereForAI · 6, Microsoft Research · 6, Salesforce · 6, 01-ai · 5, Alibaba Cloud · 5, Cohere · 5, Moonshot AI · 5, NousResearch · 5, THUML · 5, deepseek-ai · 5, DeepMind · 4, Facebook AI · 4, IBM · 4, Meituan · 4, Stanford · 4, THUDM · 4, UC San Diego · 4, VikParuchuri · 4, gguf-iq · 4, nvidia · 4, openchat · 4, tiiuae · 4, Allen AI · 3, BAAI · 3, Du et al. · 3, ForgeCode · 3, Fudan University · 3, IDEA Research · 3, Liao et al. · 3, Moonshot.AI · 3, Nam Tuan Ly / NII · 3, OPI-PG · 3, OpenDataLab · 3, ViCoS Lab Ljubljana · 3, Xiaomi · 3, Zhao et al. · 3, gguf · 3, gguf11bv30 · 3, gguf7bv30 · 3, upstage · 3, + 247 smaller vendors (291 models)
§ 01 · Agentic AI models

164 models in Agentic AI · page 2 of 4.

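| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
| 051 | NVIDIA-Nemotron-3-Super-120B-A12B-BF16 | | | | | 6 | 6 |
| 052 | Step-3.5-Flash | | | | | 6 | 6 |
| 053 | GPT-4 Turbo (2024) | OpenAI | Unknown | GPT-4 Turbo (gpt-4-turbo-2024-04-09) | | 5 | 5 |
| 054 | GPT-4.1 mini | OpenAI | | transformer | | 5 | 5 |
| 055 | Gemini 2.5 Pro | Google | | | | 4 | 5 |
| 056 | Kimi K2.5 | Moonshot AI | Undisclosed | | | 4 | 5 |
| 057 | Kimi-VL-A3B-Instruct | | | | | 5 | 5 |
| 058 | MiniMax M2.5 | MiniMax | 229B | | | 3 | 5 |
| 059 | Qwen3.5-397B-A17B† | Anthropic/OpenAI | | | | 1 | 5 |
| 060 | Claude Opus 4.6 | Anthropic | Undisclosed | | | 3 | 4 |
| 061 | Claude Sonnet 4.5 | Anthropic | | | | 4 | 4 |
| 062 | GPT-4.5 | OpenAI | Undisclosed | | | 3 | 4 |
| 063 | GPT-5.1 | OpenAI | | | | 4 | 4 |
| 064 | GPT-5.2 | OpenAI | | | | 4 | 4 |
| 065 | GPT-5.2 | OpenAI | Undisclosed | | | 2 | 3 |
| 066 | Claude 3.5 Haiku | Anthropic | | | | 2 | 2 |
| 067 | Claude 3.7 Sonnet | Anthropic | | | | 2 | 2 |
| 068 | Claude Computer Use | Anthropic | Unknown | Claude 3.5 Sonnet with computer use tool | | 1 | 2 |
| 069 | Claude Haiku 4.5 | Anthropic | | | | 2 | 2 |
| 070 | Claude Opus 4.1 | Anthropic | | | | 2 | 2 |
| 071 | Claude Sonnet 4.6 | Anthropic | | | | 2 | 2 |
| 072 | DeepSeek-V2.5 | DeepSeek | | LLM | | 2 | 2 |
| 073 | DeepSeek-V3.1 | DeepSeek | | | | 2 | 2 |
| 074 | GLM-4.5 | Zhipu AI | | | | 2 | 2 |
| 075 | GLM-4.5-Air | Zhipu AI | | | | 2 | 2 |
| 076 | GPT-5 Codex | OpenAI | | | | 2 | 2 |
| 077 | GPT-5.1 | OpenAI | | | | 2 | 2 |
| 078 | GPT-5.1 Instant | OpenAI | | | | 2 | 2 |
| 079 | GPT-5.1 Thinking | OpenAI | | | | 2 | 2 |
| 080 | Gemini 2.5 Flash | Google | | Multimodal LLM | | 2 | 2 |
| 081 | Holo2-30B-A3B | | | | | 2 | 2 |
| 082 | Holo2-4B | | | | | 2 | 2 |
| 083 | Holo2-8B | | | | | 2 | 2 |
| 084 | Ling-2.6-1T | | | | | 2 | 2 |
| 085 | MiniMax M2 | MiniMax | | | | 2 | 2 |
| 086 | MiniMax M2.1 | MiniMax | | | | 2 | 2 |
| 087 | Muse Spark | Meta | | | | 2 | 2 |
| 088 | Qwen3-Coder 480B A35B | Alibaba Cloud | | | | 2 | 2 |
| 089 | Qwen3.5-122B-A10B | Alibaba Cloud | | | | 2 | 2 |
| 090 | Qwen3.5-27B | Alibaba Cloud | | | | 2 | 2 |
| 091 | Qwen3.5-397B-A17B | Alibaba | | | | 2 | 2 |
| 092 | Step-3.5-Flash | StepFun | Unknown | | | 2 | 2 |
| 093 | Agent Q (GPT-4o) | MultiOn | Unknown | MCTS + DPO self-play web agent on GPT-4o | | 1 | 1 |
| 094 | Agent S w/ Claude-3.5 | | | | | 1 | 1 |
| 095 | Agent S w/ GPT-4o | | | | | 1 | 1 |
| 096 | Agent S2 (Claude 3.7) | Simular AI | | | | 1 | 1 |
| 097 | Agent S2 (Gemini 2.5) | Simular AI | | | | 1 | 1 |
| 098 | Agent S2 w/ Claude-3.5-Sonnet | | | | | 1 | 1 |
| 099 | Agent S2 w/ Claude-3.7-Sonnet | | | | | 1 | 1 |
| 100 | Ante / Gemini 3 Pro | Ante | | | | 1 | 1 |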