Codesota · Agentic · Aider vs Claude Code
Open vs closed · terminal agents · updated April 2026

Aider vs Claude Code.

Aider is a 15,000-line Python REPL that turns any LLM into a pair-programmer. Claude Code is Anthropic's closed-source CLI purpose-built for Claude. Different philosophies, overlapping use cases.

§ 01 · Side-by-side

How they compare, row by row.

| Attribute | Aider | Claude Code |
|---|---|---|
| Vendor | Open source community (Paul Gauthier) | Anthropic |
| Surface | Terminal REPL (Python, ~15k LOC) | Terminal CLI + IDE plugins |
| License | Apache 2.0 | Proprietary |
| Model choice | Any LLM (Claude, GPT, DeepSeek, Qwen, local) | Claude only (Opus / Sonnet / Haiku) |
| Edit format | Unified diff (auto-commit per edit) | str_replace surgical edit |
| SWE-Bench Verified | ~55% (Aider + Opus 4.5) | 80.9% (Opus 4.5) / 87.6% (Opus 4.7) |
| Aider Polyglot | 88.4% (Opus 4.5) / 82% (Sonnet 4.5) | Not reported |
| Workflow | Git-native; every change committed | Plan-act-reflect agent loop |
| Autonomy | Tight REPL; confirm each diff | Minute-scale fire-and-forget |
| Cost (Opus 4.5) | ~$2.40 per resolve | ~$4.80 per resolve |
| Cost (cheapest) | $0.04 (local DeepSeek, self-hosted) | $0.35 (Haiku 4.5) |
| Best for | Open weights, local models, tiny surface area | SWE-Bench topline, MCP, long autonomy |

pass@1, higher is better

Aider vs Claude Code — best reported across benchmarks
  • Claude Code + Opus 4.7 (Verified): 87.6%
  • Claude Code + Opus 4.5 (Verified): 80.9%
  • Aider + Opus 4.5 (Polyglot, different benchmark): 88.4%
  • Aider + Opus 4.5 (Verified): 55.0%
  • Aider + DeepSeek V3.2 (Polyglot): 74.0%
  • Aider + Sonnet 4.5 (Polyglot): 82.0%
  • Aider + GPT-5.2 (Polyglot): 79.0%

Polyglot is Aider's own 225-task multi-language benchmark. SWE-Bench Verified is harder and favors longer agentic loops.

Workflow loops

Aider is a tight REPL around git + diff. Claude Code is a full agent with planning + tests.

Architecture

Aider — diff-based pair-programmer

Any LLM in; unified diffs + auto-commits out

User @ aider REPL → Repo map (tree-sitter, ~1k lines) → File selection (/add, /drop) → LLM (any: Claude / GPT / DeepSeek / local) → unified diff edit format → Apply + git commit (auto-commit per edit) → /test (optional)
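The "unified diff" leg of that loop means the model emits ordinary `diff -u`-style hunks and the harness applies them to the file. A toy sketch of hunk application (not Aider's actual code; the function name and simplifications are ours, and real tools also verify context with fuzzy offsets):

```python
def apply_hunk(lines, start, hunk):
    """Apply one unified-diff hunk to a list of file lines.

    `start` is the 1-based line where the hunk begins (the number after
    '@@ -'); `hunk` is the hunk body: ' ' context, '-' delete, '+' insert.
    """
    out = lines[: start - 1]            # everything before the hunk
    i = start - 1
    for tag, text in ((l[:1], l[1:]) for l in hunk):
        if tag == " ":                  # context line: must match, keep it
            assert lines[i] == text, "context mismatch"
            out.append(text); i += 1
        elif tag == "-":                # deletion: consume without emitting
            assert lines[i] == text, "delete mismatch"
            i += 1
        elif tag == "+":                # insertion: emit without consuming
            out.append(text)
    return out + lines[i:]              # everything after the hunk

before = ["def add(a, b):", "    return a - b"]
hunk   = [" def add(a, b):", "-    return a - b", "+    return a + b"]
after  = apply_hunk(before, 1, hunk)
# after == ["def add(a, b):", "    return a + b"]
```

Once a hunk applies cleanly, Aider's auto-commit step records it in git, which is what makes every model edit individually revertible.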

Architecture

Claude Code — full agent loop

Plans, runs tests, iterates

User prompt → Plan → Read / Grep → str_replace → Bash → Reflect + loop → Commit
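The `str_replace` step in that loop is a surgical edit: the agent supplies an old string and a replacement, and an ambiguous match is an error rather than a guess. A minimal sketch of those semantics as inferred from the article (not Anthropic's implementation, which operates on files inside the agent loop):

```python
def str_replace(text: str, old: str, new: str) -> str:
    """Surgical edit: `old` must occur exactly once in `text`, so an
    edit can never silently land in the wrong place."""
    matches = text.count(old)
    if matches != 1:
        raise ValueError(f"need exactly one match, found {matches}")
    return text.replace(old, new, 1)

src = "def greet():\n    return 'hi'\n"
patched = str_replace(src, "return 'hi'", "return 'hello'")
# patched == "def greet():\n    return 'hello'\n"
```

The uniqueness requirement is the trade-off against Aider's diff format: the model must quote enough surrounding context to disambiguate the edit site, but it never has to produce line numbers.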

Cost by model + tool

By the table's figures, Aider with a self-hosted open model runs about $0.04 per resolve versus ~$4.80 for Claude Code + Opus, a gap of roughly two orders of magnitude. Whether the resolve-rate gap is worth it depends on the task.
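Taking the comparison table's per-resolve figures at face value, the raw ratios work out as follows (a back-of-envelope sketch; the dictionary keys are ours):

```python
# Per-resolve costs from the comparison table above (USD).
cost = {
    "claude_code_opus": 4.80,
    "claude_code_haiku": 0.35,
    "aider_opus": 2.40,
    "aider_selfhost_deepseek": 0.04,
}

# Cheapest Aider setup vs Claude Code's flagship and budget tiers.
vs_opus = cost["claude_code_opus"] / cost["aider_selfhost_deepseek"]    # ~120x
vs_haiku = cost["claude_code_haiku"] / cost["aider_selfhost_deepseek"]  # ~9x
print(f"vs Opus: {vs_opus:.0f}x, vs Haiku: {vs_haiku:.1f}x")
```

Note these are per-resolve costs, so a lower resolve rate is already priced in; the remaining question is the value of the issues the cheaper setup fails to resolve at all.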

The money visual

Aider vs Claude Code — model-agnostic cost/perf

X: $ per resolved issue (log scale). Y: Verified %. Pink line = Pareto frontier.

Points plotted: Claude Code + Opus 4.5 / Sonnet 4.5 / Haiku 4.5; Aider + Opus 4.5 / Sonnet 4.5 / DeepSeek V3.2 / Qwen3-Max / Kimi K2.5 / local DeepSeek (self-host).

Radar

Aider vs Claude Code — capability profile

Axes: SWE-Bench · Model flexibility · Open source · Cost efficiency · Autonomy · Team adoption (one polygon each for Claude Code and Aider).
§ 02 · Pick by task

When to pick which.

Claude Code wins when
  • You want SWE-Bench topline quality
  • The task needs planning + running tests in a loop
  • You are fine being locked to Anthropic models
  • MCP matters to you
Aider wins when
  • You want open-source everything
  • You want to run DeepSeek / Qwen / Kimi / local models
  • You value git-native workflow (auto-commit per edit)
  • You want to keep the surface area tiny (<15k LOC)
  • Budget is tight and you can use Sonnet 4.5 or open weights
§ 03 · Method

How the numbers were sourced.

SWE-Bench Verified scores come from Anthropic's leaderboard runs and Paul Gauthier's public Aider benchmarks. Aider Polyglot is a 225-task, multi-language benchmark maintained by the Aider project; numbers are pulled from aider.chat/docs/leaderboards.

Cost figures assume a single full Verified run (or its Polyglot equivalent) at published API rates as of April 2026. Self-host costs amortize a single H100 hour across an average of 25 resolves.
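The amortization above is simple division: one GPU hour spread across ~25 resolves. The article does not state the H100 rental rate, but the table's $0.04 figure implies roughly $1/hour (a sketch; the function name is ours):

```python
def self_host_cost_per_resolve(gpu_rate_per_hour: float,
                               resolves_per_hour: float = 25) -> float:
    """Amortized $/resolve for a self-hosted model, per the method above:
    one GPU hour spread across an average of 25 resolves."""
    return gpu_rate_per_hour / resolves_per_hour

# Implied rate behind the table's $0.04/resolve figure: 0.04 * 25 = $1.00/hr.
print(self_host_cost_per_resolve(1.00))  # 0.04
```

The figure is therefore sensitive to both the rental rate and the resolves-per-hour assumption; halving throughput doubles the per-resolve cost.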

Polyglot and Verified are not directly comparable — Polyglot rewards a tight diff loop, Verified rewards long agentic planning. Where both numbers exist, prefer Verified for production decisions.

§ 04 · Related

Adjacent comparisons.

  • Claude Code vs Cursor Composer
  • Claude Code vs Codex CLI
  • Devin vs Claude Code
  • OpenHands vs SWE-agent
  • Best agent for SWE-Bench
  • SWE-Bench hub
  • Coding lineage
  • Terminal-Bench