Aider is a 15,000-line Python REPL that turns any LLM into a pair-programmer. Claude Code is Anthropic's closed-source CLI purpose-built for Claude. Different philosophies, overlapping use cases.
| Attribute | Aider | Claude Code |
|---|---|---|
| Vendor | Open source community (Paul Gauthier) | Anthropic |
| Surface | Terminal REPL (Python, ~15k LOC) | Terminal CLI + IDE plugins |
| License | Apache 2.0 | Proprietary |
| Model choice | Any LLM — Claude, GPT, DeepSeek, Qwen, local | Claude only (Opus / Sonnet / Haiku) |
| Edit format | Unified diff (auto-commit per edit) | str_replace surgical edit |
| SWE-Bench Verified | ~55% (Aider + Opus 4.5) | 80.9% (Opus 4.5) / 87.6% (Opus 4.7) |
| Aider Polyglot | 88.4% (Opus 4.5) / 82% (Sonnet 4.5) | Not reported |
| Workflow | Git-native; every change committed | Plan-act-reflect agent loop |
| Autonomy | Tight REPL — diff confirm | Minute-scale fire-and-forget |
| Cost (Opus 4.5) | ~$2.40 per resolve | ~$4.80 per resolve |
| Cost (cheapest) | $0.04 per resolve (self-hosted DeepSeek) | $0.35 per resolve (Haiku 4.5) |
| Best for | Open weights, local models, tiny surface area | SWE-Bench topline, MCP, long autonomy |
Benchmark scores are pass@1; higher is better. Polyglot is Aider's own 225-task multi-language benchmark. SWE-Bench Verified is harder and favors longer agentic loops.
Aider is a tight REPL around git + diff. Claude Code is a full agent with planning + tests.
Aider architecture: any LLM in; unified diffs + auto-commits out.
Claude Code architecture: plans, runs tests, iterates.
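The two edit formats named in the table (unified diff versus str_replace) can be contrasted with a minimal sketch built on Python's standard library. This is an illustration only, not how either tool implements edits: `unified_diff_edit`, `str_replace_edit`, and the file name `example.py` are hypothetical names chosen for the example.

```python
import difflib


def unified_diff_edit(old: str, new: str, path: str = "example.py") -> str:
    """Render the change from old to new as unified-diff text."""
    diff = difflib.unified_diff(
        old.splitlines(keepends=True),
        new.splitlines(keepends=True),
        fromfile=f"a/{path}",  # hypothetical path, for illustration only
        tofile=f"b/{path}",
    )
    return "".join(diff)


def str_replace_edit(source: str, old: str, new: str) -> str:
    """Replace exactly one occurrence of old; refuse missing or ambiguous matches."""
    count = source.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return source.replace(old, new, 1)


before = "def add(a, b):\n    return a - b\n"

# Surgical edit: name the exact text to swap out.
after = str_replace_edit(before, "a - b", "a + b")

# Diff-style edit: describe the same change as a patch against the file.
print(unified_diff_edit(before, after))
```

The diff form carries surrounding context lines, which makes it easy to review and commit. The str_replace form is shorter but only safe when the target text is unique, which is why the sketch rejects zero or multiple matches.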
Aider + a cheap open model is ~20x cheaper than Claude Code + Opus. Whether the resolve gap is worth it depends on the task.
Figure (the money visual): cost per resolved issue in USD (x-axis, log scale) against SWE-Bench Verified % (y-axis). The pink line marks the Pareto frontier.
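A Pareto frontier over cost and score can be computed in a few lines. The sketch below is one assumed way to do it, using only the two configurations for which the table gives both a per-resolve cost and a Verified score. The `Config` type and `pareto_frontier` function are hypothetical names, and the ~55% Aider figure is approximate in the source.

```python
from typing import NamedTuple


class Config(NamedTuple):
    name: str
    cost_per_resolve: float  # USD, lower is better
    verified_pct: float      # SWE-Bench Verified %, higher is better


def pareto_frontier(configs: list[Config]) -> list[Config]:
    """Keep configs that no cheaper-or-equal config matches or beats on score."""
    frontier = []
    best_score = float("-inf")
    # Sweep from cheapest to most expensive; on a cost tie, try the higher score first.
    for c in sorted(configs, key=lambda c: (c.cost_per_resolve, -c.verified_pct)):
        if c.verified_pct > best_score:
            frontier.append(c)
            best_score = c.verified_pct
    return frontier


# Figures taken from the comparison table (Opus 4.5 rows).
points = [
    Config("Claude Code + Opus 4.5", 4.80, 80.9),
    Config("Aider + Opus 4.5", 2.40, 55.0),
]

frontier = pareto_frontier(points)
for c in frontier:
    print(f"{c.name}: ${c.cost_per_resolve:.2f} per resolve, {c.verified_pct}% Verified")
```

With only these two points, both sit on the frontier: one is cheaper, the other scores higher. A configuration would drop off only if another one were at least as cheap and scored at least as well.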
SWE-Bench Verified scores come from Anthropic's leaderboard runs and Paul Gauthier's public Aider benchmarks. Aider Polyglot is a 225-task, multi-language benchmark maintained by the Aider project; numbers are pulled from aider.chat/docs/leaderboards.
Cost figures assume a single full Verified run (or its Polyglot equivalent) at published API rates as of April 2026. Self-host costs amortize a single H100 hour across an average of 25 resolves.
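The self-host amortization is simple arithmetic, sketched below. The source does not state the H100 hourly rate, so the sketch works backwards from the table's $0.04 per resolve and the 25-resolves-per-hour assumption. The function names are hypothetical.

```python
RESOLVES_PER_GPU_HOUR = 25  # average stated in the methodology note


def cost_per_resolve(gpu_hour_usd: float,
                     resolves_per_hour: float = RESOLVES_PER_GPU_HOUR) -> float:
    """Spread one GPU-hour's cost evenly across the resolves completed in it."""
    return gpu_hour_usd / resolves_per_hour


def implied_gpu_hour_usd(per_resolve_usd: float,
                         resolves_per_hour: float = RESOLVES_PER_GPU_HOUR) -> float:
    """Invert the amortization: what hourly rate yields this per-resolve cost?"""
    return per_resolve_usd * resolves_per_hour


# The table's cheapest Aider figure is $0.04 per resolve.
rate = implied_gpu_hour_usd(0.04)
print(f"Implied H100 rate: ${rate:.2f}/hour")
print(f"Round trip: ${cost_per_resolve(rate):.2f} per resolve")
```

Under these assumptions the $0.04 figure corresponds to roughly $1.00 per H100 hour. A different hourly rate or throughput scales the per-resolve cost proportionally.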
Polyglot and Verified are not directly comparable — Polyglot rewards a tight diff loop, Verified rewards long agentic planning. Where both numbers exist, prefer Verified for production decisions.