AndroidWorld is an environment and benchmark for autonomous agents, featuring 116 diverse tasks across 20 real-world apps, dynamic task instantiation for millions of unique variations, and durable reward signals for reliable evaluation. It is an open environment with access to millions of Android apps and websites, and has a lightweight footprint (2 GB memory, 8 GB disk).
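As a rough illustration of what dynamic task instantiation and state-based rewards can look like, here is a minimal Python sketch. It is not AndroidWorld's actual API: `TaskInstance`, `instantiate_send_message`, `is_successful`, and the parameter pools are hypothetical names invented for this example, and a plain list stands in for the app state that a real check would read from the device.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInstance:
    """One concrete variation of a task template (hypothetical type)."""
    goal: str
    params: dict


# Hypothetical parameter pools; a real benchmark would draw from much larger ones.
NAMES = ["Alice", "Bob", "Chen", "Dana"]
MESSAGES = ["Running late", "See you at 5", "Call me back"]


def instantiate_send_message(seed: int) -> TaskInstance:
    """Build a task variation from a seed, so the same seed reproduces it."""
    rng = random.Random(seed)
    params = {"name": rng.choice(NAMES), "text": rng.choice(MESSAGES)}
    goal = f"Send the message '{params['text']}' to {params['name']}."
    return TaskInstance(goal=goal, params=params)


def is_successful(instance: TaskInstance, sent_messages: list) -> bool:
    """Check the end state (a list standing in for app data), not the agent's actions."""
    return any(
        m["to"] == instance.params["name"] and m["text"] == instance.params["text"]
        for m in sent_messages
    )
```

Seeding the random generator keeps each variation reproducible, and checking the final state instead of the action sequence means any path that reaches the goal counts as success.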
No results indexed yet. Be the first to submit a score.
Submit a checkpoint and a reproduction script. We will run the script, publish the score, and, if it takes the top spot, annotate the corresponding step on the progress chart with your name.