Every model, measured.

Start with a research area, drill into a vendor, or page through the full index. Only models with at least one benchmark score appear — a model without a recorded score can’t be ranked.

Vendor: speakleash · 253, OpenAI · 85, Google · 71, Qwen · 52, Alibaba · 47, Anthropic · 44, Microsoft · 35, Meta · 30, Mistral · 30, DeepSeek · 28, google · 19, meta-llama · 19, mistralai · 19, Meta AI · 15, CYFRAGOVPL · 14, Zhipu AI · 13, NVIDIA · 10, SpeakLeash · 10, internlm · 10, xAI · 10, ByteDance · 9, Baidu · 8, PLLuM · 8, ibm-granite · 8, microsoft · 8, Amazon · 7, Google DeepMind · 7, MiniMax · 7, Mistral AI · 7, Remek · 7, Shanghai AI Lab · 7, allenai · 7, utter-project · 7, CohereForAI · 6, Microsoft Research · 6, Salesforce · 6, 01-ai · 5, Alibaba Cloud · 5, Cohere · 5, Moonshot AI · 5, NousResearch · 5, THUML · 5, deepseek-ai · 5, DeepMind · 4, Facebook AI · 4, IBM · 4, Meituan · 4, Stanford · 4, THUDM · 4, UC San Diego · 4, VikParuchuri · 4, gguf-iq · 4, nvidia · 4, openchat · 4, tiiuae · 4, Allen AI · 3, BAAI · 3, Du et al. · 3, ForgeCode · 3, Fudan University · 3, IDEA Research · 3, Liao et al. · 3, Moonshot.AI · 3, Nam Tuan Ly / NII · 3, OPI-PG · 3, OpenDataLab · 3, ViCoS Lab Ljubljana · 3, Xiaomi · 3, Zhao et al. · 3, gguf · 3, gguf11bv30 · 3, gguf7bv30 · 3, upstage · 3, plus 247 smaller vendors (291 models).

§ 01 · Reinforcement Learning models

20 models in Reinforcement Learning · page 1 of 1.

#	Model	Vendor	Parameters	Architecture	SOTA	Benchmarks	Results
001	Go-Explore	Uber AI	—	Exploration RL	1	1	1
002	TD-MPC2 (317M params)	UC San Diego	317M	—	1	1	1
003	DreamerV3	Google DeepMind	—	World Model (Model-Based)	—	2	2
004	Agent57	DeepMind	—	Distributed RL (Recurrent + Episodic Memory)	—	1	1
005	BBOS-1	Unknown	—	Model-Based RL	—	1	1
006	BRO	DeepMind / TU Warsaw	—	—	—	1	1
007	DQN (Human-level)	DeepMind	—	Deep Q-Network (CNN)	—	1	1
008	Disco57	Google DeepMind	—	DiscoRL — meta-learned RL update rule (discovered by automated search)	—	1	1
009	DrQ-v2	NYU / Google	—	—	—	1	1
010	FOWM	CMU	—	—	—	1	1
011	GDI-H3	Research	—	Model-Based RL	—	1	1
012	Human Professional	Biology	—	Biological Neural Network	—	1	1
013	LBC	Tsinghua University / Baidu	—	Learnable Behavior Control (distributed off-policy actor-critic)	—	1	1
014	MEME	Google DeepMind	—	Memory-Based Exploration Agent (Agent57 variant)	—	1	1
015	MuZero	DeepMind	—	Model-Based RL	—	1	1
016	Rainbow DQN	DeepMind	—	DQN Variant	—	1	1
017	SAC (state-based)	UC Berkeley	—	—	—	1	1
018	TD-MPC	UC San Diego	—	—	—	1	1
019	TD-MPC2 (19M params)	UC San Diego	19M	—	—	1	1
020	TD-MPC2 (5M params)	UC San Diego	5M	—	—	1	1