Codesota · Models · 1,357 models indexed · 842 match filter
Every model, measured.
Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.
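The inclusion rule above can be sketched as a simple filter. This is a minimal illustration only: the `Model` record, its `scores` field, and the sample entries are hypothetical, since the index's actual schema is not shown on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    # Hypothetical record shape; the real index schema is not shown here.
    name: str
    scores: dict[str, float] = field(default_factory=dict)  # benchmark -> score


def rankable(models: list[Model]) -> list[Model]:
    """Keep only models that have at least one recorded benchmark score."""
    return [m for m in models if m.scores]


catalog = [
    Model("model-a", {"benchmark-x": 71.2}),
    Model("model-b"),  # no recorded score, so it is left out of the index
]
print([m.name for m in rankable(catalog)])  # ['model-a']
```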
Vendor: speakleash · 253; OpenAI · 85; Google · 71; Qwen · 52; Alibaba · 47; Anthropic · 44; Microsoft · 35; Meta · 30; Mistral · 30; DeepSeek · 28; google · 19; meta-llama · 19; mistralai · 19; Meta AI · 15; CYFRAGOVPL · 14; Zhipu AI · 13; NVIDIA · 10; SpeakLeash · 10; internlm · 10; xAI · 10; ByteDance · 9; Baidu · 8; PLLuM · 8; ibm-granite · 8; microsoft · 8; Amazon · 7; Google DeepMind · 7; MiniMax · 7; Mistral AI · 7; Remek · 7; Shanghai AI Lab · 7; allenai · 7; utter-project · 7; CohereForAI · 6; Microsoft Research · 6; Salesforce · 6; 01-ai · 5; Alibaba Cloud · 5; Cohere · 5; Moonshot AI · 5; NousResearch · 5; THUML · 5; deepseek-ai · 5; DeepMind · 4; Facebook AI · 4; IBM · 4; Meituan · 4; Stanford · 4; THUDM · 4; UC San Diego · 4; VikParuchuri · 4; gguf-iq · 4; nvidia · 4; openchat · 4; tiiuae · 4; Allen AI · 3; BAAI · 3; Du et al. · 3; ForgeCode · 3; Fudan University · 3; IDEA Research · 3; Liao et al. · 3; Moonshot.AI · 3; Nam Tuan Ly / NII · 3; OPI-PG · 3; OpenDataLab · 3; ViCoS Lab Ljubljana · 3; Xiaomi · 3; Zhao et al. · 3; gguf · 3; gguf11bv30 · 3; gguf7bv30 · 3; upstage · 3; + 247 smaller vendors (291 models)
§ 01 · Natural Language Processing models
842 models in Natural Language Processing · page 1 of 17.
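The page count follows from the match count by ceiling division. The sketch below assumes a page size of 50, inferred from the 50 rows (001 to 050) listed on this page; the site does not state it, and the function names are illustrative.

```python
import math

TOTAL_MATCHING = 842  # "842 models in Natural Language Processing"
PAGE_SIZE = 50        # assumption: inferred from the 50 rows shown on page 1


def page_count(total: int, page_size: int) -> int:
    """Number of pages needed to list `total` items."""
    return math.ceil(total / page_size)


def page_bounds(page: int, total: int, page_size: int) -> tuple[int, int]:
    """1-based (first, last) row numbers shown on a 1-based page."""
    first = (page - 1) * page_size + 1
    last = min(page * page_size, total)
    return first, last


print(page_count(TOTAL_MATCHING, PAGE_SIZE))       # 17
print(page_bounds(1, TOTAL_MATCHING, PAGE_SIZE))   # (1, 50)
print(page_bounds(17, TOTAL_MATCHING, PAGE_SIZE))  # (801, 842)
```

Under that assumption the last page holds 42 rows, which agrees with "page 1 of 17".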
| # | Model | Vendor | Parameters | Architecture | SOTA | Benchmarks | Results |
|---|---|---|---|---|---|---|---|
| 001 | GPT-4o | OpenAI | Undisclosed | Multimodal LLM | 15 | 45 | 57 |
| 002 | Gemini-3.1-Pro-Preview | — | — | — | 7 | 1 | 7 |
| 003 | Gemma 3 (27B, IT) | — | — | — | 6 | 1 | 9 |
| 004 | mistralai/Mistral-Large-Instruct-2411 | mistralai | 123B | — | 4 | 3 | 17 |
| 005 | Claude Sonnet 4 | Anthropic | — | Multimodal LLM | 3 | 15 | 21 |
| 006 | Gemini 1.5 Pro | — | — | Multimodal LLM | 3 | 17 | 21 |
| 007 | GPT-4 | OpenAI | — | Transformer (LLM) | 3 | 6 | 13 |
| 008 | BRIO | Yale NLP | Unknown | BART-large with contrastive learning objective | 3 | 2 | 6 |
| 009 | DeBERTa-v3-large | Microsoft | 304M | DeBERTa-v3-large | 3 | 5 | 6 |
| 010 | DeepSeek-V4-Pro Max | DeepSeek | — | — | 3 | 5 | 5 |
| 011 | Claude 3.5 Sonnet | Anthropic | Undisclosed | Multimodal LLM | 2 | 27 | 32 |
| 012 | Claude Opus 4 | Anthropic | Undisclosed | — | 2 | 16 | 23 |
| 013 | Qwen3.5-397B-A17B | Alibaba | — | — | 2 | 14 | 20 |
| 014 | Phi-4 | Microsoft | 14B | transformer | 2 | 3 | 17 |
| 015 | Mistral-Small-3.1-24B-Instruct-2503 | Mistral | — | — | 2 | 1 | 9 |
| 016 | gemma-3-12b-it | — | — | — | 2 | 1 | 9 |
| 017 | Gemini-3.0-Pro-Preview | — | — | — | 2 | 1 | 7 |
| 018 | gemini-2.0-flash-001 | — | — | — | 2 | 1 | 5 |
| 019 | NV-Embed-v2 | NVIDIA | 7B | Mistral-7B (LLM-based embedding) | 2 | 2 | 3 |
| 020 | ALBERT ensemble | — | — | — | 2 | 2 | 2 |
| 021 | Qwen3-235B-A22B | Alibaba | 235B (22B active) | moe | 1 | 13 | 21 |
| 022 | GLM-5 | Zhipu AI | 130B | — | 1 | 9 | 19 |
| 023 | Qwen/Qwen2.5-14B-Instruct | Qwen | 14.8B | — | 1 | 3 | 17 |
| 024 | Qwen/Qwen2.5-72B-Instruct | Qwen | 72.7B | — | 1 | 3 | 17 |
| 025 | mistralai/Mistral-Large-Instruct-2407 | mistralai | 123B | — | 1 | 3 | 17 |
| 026 | meta-llama/Llama-4-Scout-17B-16E-Instruct (API) | meta-llama | 109B | — | 1 | 2 | 16 |
| 027 | Meta-Llama-3.1-405B-Instruct-FP8 | meta-llama | — | — | 1 | 2 | 12 |
| 028 | internlm2-1_8b | internlm | — | — | 1 | 1 | 12 |
| 029 | Bielik-11B-v3.0-Instruct.Q4_K_M.gguf | gguf11bv30 | — | — | 1 | 1 | 11 |
| 030 | Qwen2.5-32B | Qwen | — | — | 1 | 1 | 11 |
| 031 | b11t2 | 347yth03847tyhy03847yt | — | — | 1 | 1 | 11 |
| 032 | Gemini 2.5 Pro | — | — | — | 1 | 9 | 9 |
| 033 | Gemma-2-27b-it | — | — | — | 1 | 1 | 9 |
| 034 | LLaMA-65B | — | — | — | 1 | 9 | 9 |
| 035 | Mistral-Large-Instruct-2407 | Mistral | — | — | 1 | 1 | 9 |
| 036 | Mistral-Small-24B-Instruct-2501 | Mistral | — | — | 1 | 1 | 9 |
| 037 | Mistral-Small-Instruct-2409 | Mistral | — | — | 1 | 1 | 9 |
| 038 | Qwen2.5-32B-Instruct | Alibaba | — | — | 1 | 1 | 9 |
| 039 | aya-expanse-32b | Unknown | — | — | 1 | 1 | 9 |
| 040 | Kimi K2.6 | — | — | — | 1 | 6 | 6 |
| 041 | Llama 2 70B (5-shot) | — | — | — | 1 | 6 | 6 |
| 042 | MiniMax-Text-01 | MiniMax | — | — | 1 | 6 | 6 |
| 043 | Qwen/Qwen3.5-27B thinking (API) | Qwen | 27B | — | 1 | 1 | 5 |
| 044 | Qwen/Qwen3.5-35B-A3B thinking (API) | Qwen | 35B | — | 1 | 1 | 5 |
| 045 | deepseek-ai/DeepSeek-V3.2 (API) | deepseek-ai | 685B | — | 1 | 1 | 5 |
| 046 | GTE-Qwen2-7B-instruct | Alibaba | 7B | Qwen2-7B (LLM-based embedding) | 1 | 3 | 4 |
| 047 | ByT5 XXL | — | — | — | 1 | 2 | 2 |
| 048 | GLiNER-multitask | Knowledgator | Unknown | DeBERTa-based generalist IE model | 1 | 1 | 1 |
| 049 | ModernBERT (large) | — | — | — | 1 | 1 | 1 |
| 050 | QZhou-Embedding | — | — | — | 1 | 1 | 1 |
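The page does not state its sort rule. The 50 rows above are consistent with ordering by SOTA count (descending), then result count (descending), then model name (ascending, case-sensitive). The sketch below applies that assumed rule to rows 005 to 010; the `rank_key` helper is illustrative and not the site's implementation.

```python
# (model, SOTA, results) copied from rows 005-010 of the table above,
# deliberately shuffled so the sort has work to do.
rows = [
    ("DeepSeek-V4-Pro Max", 3, 5),
    ("GPT-4", 3, 13),
    ("DeBERTa-v3-large", 3, 6),
    ("Gemini 1.5 Pro", 3, 21),
    ("BRIO", 3, 6),
    ("Claude Sonnet 4", 3, 21),
]


def rank_key(row):
    name, sota, results = row
    # Assumed rule: more SOTA first, then more results, then name ascending.
    return (-sota, -results, name)


ranked = [name for name, _, _ in sorted(rows, key=rank_key)]
print(ranked)
```

The resulting order matches the table: Claude Sonnet 4, Gemini 1.5 Pro, GPT-4, BRIO, DeBERTa-v3-large, DeepSeek-V4-Pro Max.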