Atari Games · 2013 · n/a
Arcade Learning Environment (Atari 2600)
Suite of 57 Atari 2600 games. Standard benchmark for deep reinforcement learning agents.
Metrics: human-normalized-score
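The human-normalized score (HNS) is conventionally computed per game by rescaling the agent's raw score so that a random policy maps to 0 and the human reference maps to 100. A minimal sketch of that convention follows; the function name and the example raw scores are illustrative assumptions, not values from this leaderboard.

```python
def human_normalized_score(agent: float, random: float, human: float) -> float:
    """Rescale a raw game score so that random play = 0 and human play = 100."""
    if human == random:
        raise ValueError("human and random reference scores must differ")
    return 100.0 * (agent - random) / (human - random)


# Hypothetical raw scores for a single game (assumed for illustration only).
RANDOM_SCORE = 200.0
HUMAN_SCORE = 7000.0
AGENT_SCORE = 3600.0

example_hns = human_normalized_score(AGENT_SCORE, RANDOM_SCORE, HUMAN_SCORE)
print(f"HNS = {example_hns:.1f}")
```

Under this convention, the "Human Professional" entry sits at 100 by construction, and any agent above 100 outperforms the human reference on that game.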
Current State of the Art
Go-Explore (Uber AI): 40000 human-normalized-score
human-normalized-score Progress Over Time
Showing 4 breakthroughs from Jul 2012 to Dec 2025
Key Milestones
Mar 2020
Agent57
Median HNS across 57 games. First agent to exceed the human baseline on all 57 games.
4731.3
+1948.2%
Dec 2025
Go-Explore (Current SOTA)
Exploration-focused agent. The score is the mean HNS, which is skewed by Montezuma's Revenge, not the median, so it is not directly comparable to median-based entries.
40000.0
+745.4%
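The two milestones use different aggregates (median for Agent57, mean for Go-Explore), which matters because a single very high per-game score can dominate a mean while leaving the median unchanged. The sketch below illustrates the effect with made-up per-game HNS values; the numbers are hypothetical and are not taken from either agent's results.

```python
import statistics

# Hypothetical per-game HNS values: four ordinary games and one outlier
# (standing in for a game where the agent scores far above the human).
per_game_hns = [95.0, 120.0, 150.0, 210.0, 4000.0]

mean_hns = statistics.mean(per_game_hns)
median_hns = statistics.median(per_game_hns)

print(f"mean HNS   = {mean_hns:.1f}")
print(f"median HNS = {median_hns:.1f}")
```

Here the outlier pulls the mean to roughly six times the median, which is why a mean-based score and a median-based score should be read separately.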
Total Improvement: 39900.0%
Time Span: 13y 8m
Breakthroughs: 4
Current SOTA: 40000.0
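The percentage gains on this page appear to be relative improvements over the preceding milestone. The sketch below checks that reading; pairing Agent57 with a previous value of 231 and using 100 as the starting baseline are inferences from the scores listed in the comparison table, not something the page states.

```python
def relative_improvement(new: float, old: float) -> float:
    """Percentage gain of `new` over `old`."""
    return 100.0 * (new - old) / old


# Assumed milestone sequence (inferred from the scores on this page).
BASELINE = 100.0      # assumed starting point (human reference)
RAINBOW_DQN = 231.0   # assumed milestone preceding Agent57
AGENT57 = 4731.3
GO_EXPLORE = 40000.0

agent57_gain = round(relative_improvement(AGENT57, RAINBOW_DQN), 1)
go_explore_gain = round(relative_improvement(GO_EXPLORE, AGENT57), 1)
total_gain = round(relative_improvement(GO_EXPLORE, BASELINE), 1)

print(agent57_gain, go_explore_gain, total_gain)
```

Under these assumptions the three results match the +1948.2%, +745.4%, and 39900.0% figures shown in the milestones and summary.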
Top Models Performance Comparison
Top 9 models ranked by human-normalized-score
Best Score: 40000
Top Model: Go-Explore
Models Compared: 9
Score Range: 39921
human-normalized-score (Primary)
| # | Model | Organization | Open Source | Score | Date |
|---|---|---|---|---|---|
| 1 | Go-Explore | Uber AI | Yes | 40000 | Dec 2025 |
| 2 | Agent57 | DeepMind | Yes | 4731.3 | Dec 2025 |
| 3 | BBOS-1 | | Yes | 1100 | Dec 2025 |
| 4 | GDI-H3 | Research | Yes | 950 | Dec 2025 |
| 5 | DreamerV3 | DeepMind | Yes | 840 | Dec 2025 |
| 6 | MuZero | DeepMind | Yes | 731 | Dec 2025 |
| 7 | Rainbow DQN | DeepMind | Yes | 231 | Dec 2025 |
| 8 | Human Professional | Biology | | 100 | Dec 2025 |
| 9 | DQN (Human-level) | DeepMind | Yes | 79 | Dec 2025 |