Agentic AI

SWE-bench

Resolving real GitHub issues autonomously.

0 datasets, 0 results

SWE-bench is a key task in agentic AI. This page lists the standard benchmarks used to evaluate models on it, along with current state-of-the-art results.

Benchmarks & SOTA

No datasets indexed for this task yet.

Contribute on GitHub

