Live Arena · Updated March 2026

AI Video Generation Arena: Text-to-Video Rankings

Human-preference Elo rankings across 37 text-to-video models, computed from 246,000+ pairwise votes. Veo 3.1 with native audio generation dominates. Sora 2 Pro holds 4th. Open-source models (Kandinsky MIT, Wan Apache 2.0) punch above their weight.

37 Models Ranked · 246K+ Human Votes · 1381 Top Elo Score · 3 Open-Source Models

Audio Generation Changes Everything

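The gap between Veo 3.1 with audio (Elo 1381) and Veo 3 without audio (Elo 1257) is 124 Elo points, which corresponds to an expected head-to-head win rate of roughly 67% under the standard Elo formula. That comparison spans two model generations; within the same generation, veo-3-audio (1341) leads veo-3 (1257) by 84 points and veo-3-fast-audio (1351) leads veo-3-fast (1251) by 100. All six Google models in the top 10 include native audio synthesis. Sora 2 Pro (4th, Elo 1367) is the only non-audio model in the top 5, and it still trails three Veo 3.1 audio variants. On this evidence, audio is now the primary differentiator at the frontier, ahead of motion quality or resolution.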
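To make the size of these gaps concrete, the sketch below converts an Elo difference into an expected head-to-head win probability. It assumes the conventional logistic Elo model with a 400-point scale; the arena's exact rating model is not stated on this page, so treat the numbers as indicative. The function name `expected_score` is illustrative.

```python
def expected_score(elo_a: float, elo_b: float) -> float:
    """Expected win probability of A over B under the standard
    logistic Elo model (400-point scale is an assumption here)."""
    return 1.0 / (1.0 + 10 ** ((elo_b - elo_a) / 400.0))


# Ratings taken from the leaderboard on this page.
veo_31_audio = 1381
veo_3_silent = 1257
sora_2_pro = 1367

p_vs_veo3 = expected_score(veo_31_audio, veo_3_silent)
p_vs_sora = expected_score(veo_31_audio, sora_2_pro)

print(f"Veo 3.1 audio vs Veo 3 (124 pts): {p_vs_veo3:.3f}")  # about 0.671
print(f"Veo 3.1 audio vs Sora 2 Pro (14 pts): {p_vs_sora:.3f}")  # about 0.520
```

Under this assumption a 124-point gap means the leader is preferred in about two of every three comparisons, while a 14-point gap is close to a coin flip.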

Sora 2 Analysis

Sora 2 Pro (Elo 1367) ranks 4th globally with 18,963 votes, enough for a tight ±9 confidence interval. Sora 2 standard (1342, 9th) trails by 25 Elo. OpenAI's video quality is genuine, but without audio generation Sora 2 Pro sits behind three Google variants and Sora 2 behind five. The path to #1 runs through native audio.

Sora 2 Pro · 1367
Sora 2 · 1342
Sora 2 Pro vs Veo 3.1 Audio · −14

Open-Source Highlights

Kandinsky 5.0 T2V Pro
MIT · Rank #26 · Elo 1179
MIT licensed. Within 35 Elo of Runway Gen 4.5 (1214), with no license fee.
HunyuanVideo 1.5
Community · Rank #27 · Elo 1171
Tencent's community model. Strong motion quality for local inference.
Wan v2.2 A14B
Apache 2.0 · Rank #30 · Elo 1130
Apache 2.0: free for commercial use. Alibaba's open-weight release.

Full Leaderboard

37 models · 246,000+ human votes · Elo-ranked

Rank · Model · Vendor · License · Audio · Elo · 95% CI · Votes
1 · veo-3.1-audio-1080p · Google · Proprietary · + Audio · 1381 · ±16 · 5.5K
2 · veo-3.1-fast-audio-1080p · Google · Proprietary · + Audio · 1378 · ±14 · 5.7K
3 · veo-3.1-audio · Google · Proprietary · + Audio · 1371 · ±14 · 12.6K
4 · sora-2-pro · OpenAI · Proprietary · - · 1367 · ±9 · 19.0K
5 · veo-3.1-fast-audio · Google · Proprietary · + Audio · 1366 · ±11 · 25.4K
6 · grok-imagine-video-720p · xAI · Proprietary · - · 1358 · ±9 · 33.7K
7 · veo-3-fast-audio · Google · Proprietary · + Audio · 1351 · ±11 · 25.8K
8 · wan2.6-t2v · Alibaba · Proprietary · - · 1347 · ±17 · 6.4K
9 · sora-2 · OpenAI · Proprietary · - · 1342 · ±8 · 25.2K
10 · veo-3-audio · Google · Proprietary · + Audio · 1341 · ±12 · 19.3K
11 · wan2.5-t2v-preview · Alibaba · Proprietary · - · 1268 · ±17 · 6.1K
12 · veo-3 · Google · Proprietary · - · 1257 · ±11 · 15.2K
13 · seedance-v1.5-pro · Bytedance · Proprietary · - · 1255 · ±8 · 31.6K
14 · veo-3-fast · Google · Proprietary · - · 1251 · ±12 · 15.5K
15 · pixverse-v5.6 · Pixverse · Proprietary · - · 1228 · ±14 · 2.3K
16 · kling-2.5-turbo-1080p · KlingAI · Proprietary · - · 1221 · ±17 · 2.1K
17 · kling-2.6-pro · KlingAI · Proprietary · - · 1219 · ±8 · 38.7K
18 · runway-gen-4.5 · Runway · Proprietary · - · 1214 · ±11 · 3.9K
19 · kling-o1-pro · KlingAI · Proprietary · - · 1208 · ±27 · 1.2K
20 · ray-3 · Luma AI · Proprietary · - · 1204 · ±23 · 1.1K
21 · hailuo-02-pro · MiniMax · Proprietary · - · 1200 · ±12 · 9.9K
22 · hailuo-2.3 · MiniMax · Proprietary · - · 1196 · ±8 · 26.8K
23 · seedance-v1-pro · Bytedance · Proprietary · - · 1192 · ±11 · 12.9K
24 · hailuo-02-standard · MiniMax · Proprietary · - · 1181 · ±12 · 9.9K
25 · p-video · Pruna · Proprietary · - · 1180 · ±15 · 3.6K
26 · kandinsky-5.0-t2v-pro · Kandinsky · MIT · - · 1179 · ±21 · 1.9K
27 · hunyuan-video-1.5 · Tencent · Community · - · 1171 · ±16 · 4.1K
28 · kling-v2.1-master · KlingAI · Proprietary · - · 1168 · ±9 · 14.5K
29 · veo-2 · Google · Proprietary · - · 1166 · ±16 · 7.1K
30 · wan-v2.2-a14b · Alibaba · Apache 2.0 · - · 1130 · ±15 · 11.2K
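The leaderboard also allows a like-for-like check on the audio effect, by comparing each Veo 3 variant with its audio-enabled sibling. The sketch below does that arithmetic. It assumes, from the naming alone, that a model ending in `-audio` is the same base model with audio synthesis switched on; the page does not confirm this.

```python
# Elo scores copied from the leaderboard rows above.
leaderboard = {
    "veo-3-audio": 1341,
    "veo-3": 1257,
    "veo-3-fast-audio": 1351,
    "veo-3-fast": 1251,
}

# (audio variant, silent counterpart); pairing is inferred from model names.
pairs = [
    ("veo-3-audio", "veo-3"),
    ("veo-3-fast-audio", "veo-3-fast"),
]

# Elo advantage of each audio variant over its silent counterpart.
gaps = {audio: leaderboard[audio] - leaderboard[silent] for audio, silent in pairs}

for name, gap in gaps.items():
    print(f"{name}: +{gap} Elo over its silent counterpart")
```

Both matched pairs show a gap of 84 to 100 Elo, smaller than the 124-point cross-generation figure but still far larger than the confidence intervals involved.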

Vendor Breakdown

Which companies dominate text-to-video AI in 2026?

Google · 9 models
veo-3.1-audio-1080p
Best Elo: 1381

Dominant across the board. Audio-enabled Veo 3.1 variants occupy ranks 1–3 and 5. The only company with native audio-video synthesis at the frontier.

OpenAI · 2 models
sora-2-pro
Best Elo: 1367

Sora 2 Pro holds a strong 4th. A high vote count (18K+) gives statistical confidence. It lacks audio generation, which is the main gap vs. Veo.

xAI · 1 model
grok-imagine-video-720p
Best Elo: 1358

Surprising 6th-place finish with 33K+ votes, the most among top-10 models. Solid video quality from a company new to the space.

Alibaba · 3 models
wan2.6-t2v
Best Elo: 1347

Three models in the leaderboard including one Apache 2.0 open-weight release (Wan v2.2). Strong contender in the mid-to-upper tier.

KlingAI · 4 models
kling-2.5-turbo-1080p
Best Elo: 1221

Second-most models in the leaderboard after Google (4). Kling 2.6 Pro has the highest vote count overall at 38K+. Consistent mid-tier performer.

MiniMax · 3 models
hailuo-02-pro
Best Elo: 1200

Hailuo series (three variants) clusters around Elo 1181–1200. High vote counts signal genuine user engagement. Chinese model to watch.
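The per-vendor figures in this section can be reproduced by grouping leaderboard rows by vendor. The sketch below does so for the top 10 rows only, so its counts are top-10 counts rather than the full per-vendor totals quoted above; the variable names are illustrative.

```python
from collections import defaultdict

# (model, vendor, Elo) for ranks 1-10, copied from the leaderboard.
top10 = [
    ("veo-3.1-audio-1080p", "Google", 1381),
    ("veo-3.1-fast-audio-1080p", "Google", 1378),
    ("veo-3.1-audio", "Google", 1371),
    ("sora-2-pro", "OpenAI", 1367),
    ("veo-3.1-fast-audio", "Google", 1366),
    ("grok-imagine-video-720p", "xAI", 1358),
    ("veo-3-fast-audio", "Google", 1351),
    ("wan2.6-t2v", "Alibaba", 1347),
    ("sora-2", "OpenAI", 1342),
    ("veo-3-audio", "Google", 1341),
]

# Group (Elo, model) pairs under each vendor.
by_vendor = defaultdict(list)
for model, vendor, elo in top10:
    by_vendor[vendor].append((elo, model))

# For each vendor: number of top-10 models and its best (Elo, model) pair.
summary = {vendor: (len(rows), max(rows)) for vendor, rows in by_vendor.items()}

# Print vendors ordered by their best Elo, highest first.
for vendor, (count, (best_elo, best_model)) in sorted(
    summary.items(), key=lambda item: item[1][1][0], reverse=True
):
    print(f"{vendor}: {count} in top 10, best {best_model} ({best_elo})")
```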

Frequently Asked Questions

What is the best AI video generator in 2026?
Google Veo 3.1 with audio at 1080p (Elo 1381) leads the text-to-video leaderboard in 2026, followed by Veo 3.1 Fast with audio at 1080p (1378), the base Veo 3.1 audio variant (1371), and Sora 2 Pro (1367). Veo models with integrated audio generation hold six of the top ten places; the best-ranked silent model, Sora 2 Pro, trails the leader by 14 Elo.
How does Sora 2 compare to Veo 3?
Sora 2 Pro (Elo 1367) ranks 4th overall, behind three Veo 3.1 variants that include native audio generation. Sora 2 (standard, Elo 1342) ranks 9th. Google's Veo 3 without audio (Elo 1257, 12th) ranks well below both Sora models, which suggests that audio generation, not just video quality, is the key differentiator at the top.
Are there open-source text-to-video models worth using?
Yes. Kandinsky 5.0 T2V Pro (Elo 1179) is MIT licensed and ranks 26th overall, one point behind the paid p-video (1180). HunyuanVideo 1.5 (Elo 1171) from Tencent is a community-licensed model that ranks 27th. Wan v2.2 A14B (Elo 1130) is Apache 2.0 licensed and ranks 30th. These open models are surprisingly close to mid-tier commercial offerings.
What is an Elo rating in AI video generation?
Elo scores in video generation arenas are calculated from pairwise human preference votes: viewers compare two generated videos side by side and vote for the better one. The Elo system (originally from chess) computes a skill rating from win/loss records. Higher Elo means humans consistently prefer that model's outputs. All ratings here are sourced from the VideoGen-Eval arena with 246,000+ votes across 37 models.
Which company leads AI video generation?
Google leads AI video generation in 2026 with nine models on the leaderboard, including the top 3 spots (Veo 3.1 audio variants). OpenAI's Sora 2 Pro holds 4th place. Chinese companies, namely Alibaba (Wan), Bytedance (Seedance), MiniMax (Hailuo), KlingAI, and Tencent (HunyuanVideo), occupy most of the mid-tier, showing strong competition from the Chinese AI ecosystem.
Does audio generation matter for video AI rankings?
Yes. The gap between Veo 3.1 with audio (Elo 1381) and Veo 3 without audio (Elo 1257) is 124 Elo points, which under the standard Elo formula means the higher-rated model is expected to win about two out of three head-to-head comparisons. Models with native audio generation hold six of the top 10 places, and the gap between audio-capable and silent video models is the largest capability split visible in the 2026 arena data.

Methodology

Rankings are derived from pairwise human preference votes collected in an arena format. Participants are shown two videos generated from the same prompt by two randomly selected models and vote for the one they prefer. The Elo rating system (adapted from competitive chess) computes skill ratings from win/loss outcomes. A model's Elo rises with every win and falls with every loss; the size of the change depends on the rating gap, so beating a higher-rated opponent earns more than beating a lower-rated one, and losing to a lower-rated opponent costs more.
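As a rough illustration of how a single vote moves two ratings, here is a minimal sketch of the classic online Elo update. The K-factor of 32 and the function names are assumptions for illustration; the arena may fit ratings differently (for example, in one batch over all votes), and this page does not say which.

```python
def expected_score(r_a: float, r_b: float) -> float:
    """Expected win probability of A over B (logistic Elo, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(r_winner: float, r_loser: float, k: float = 32.0):
    """Return (new_winner, new_loser) after one vote.

    k is an assumed step size, not a value published by the arena.
    """
    # The winner gains more the less the win was expected.
    delta = k * (1.0 - expected_score(r_winner, r_loser))
    return r_winner + delta, r_loser - delta


# Favorite wins: the 1381 model beats the 1257 model.
fav_new, _ = update_elo(1381, 1257)
favorite_gain = fav_new - 1381  # about 10.5 points

# Upset: the 1257 model beats the 1381 model.
under_new, over_new = update_elo(1257, 1381)
upset_gain = under_new - 1257  # about 21.5 points

print(f"favorite gain: {favorite_gain:.1f}, upset gain: {upset_gain:.1f}")
```

The update is zero-sum: whatever the winner gains, the loser gives up, so the total rating in the pool stays constant.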

The 95% confidence interval (±CI) reflects the uncertainty in the Elo estimate given the available vote count. Models with fewer than ~2,000 votes carry wider confidence intervals and should be interpreted with caution. All 37 models in this leaderboard had at least 1,000 votes as of March 2026.
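One simple way to read the ± values is to check whether two models' intervals overlap. The sketch below applies that check to leaderboard rows; overlap is a conservative heuristic rather than a formal significance test, and the `Rating` type and `intervals_overlap` helper are illustrative names, not part of the arena's tooling.

```python
from typing import NamedTuple


class Rating(NamedTuple):
    name: str
    elo: float
    ci: float  # half-width of the 95% interval, i.e. the ± value


def intervals_overlap(a: Rating, b: Rating) -> bool:
    """True if [elo - ci, elo + ci] for a and b share any point."""
    return abs(a.elo - b.elo) <= a.ci + b.ci


# Values copied from the leaderboard on this page.
veo_31_audio_1080p = Rating("veo-3.1-audio-1080p", 1381, 16)
sora_2_pro = Rating("sora-2-pro", 1367, 9)
veo_3 = Rating("veo-3", 1257, 11)

top_vs_sora = intervals_overlap(veo_31_audio_1080p, sora_2_pro)
top_vs_veo3 = intervals_overlap(veo_31_audio_1080p, veo_3)

print(f"#1 vs Sora 2 Pro overlap: {top_vs_sora}")  # True: 14 <= 16 + 9
print(f"#1 vs Veo 3 overlap: {top_vs_veo3}")  # False: 124 > 16 + 11
```

By this reading, the ordering within the top five is not firmly settled, while the gap between the audio leaders and silent Veo 3 is well outside the stated uncertainty.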

Data source: VideoGen-Eval arena · 246,000+ total votes · 37 models evaluated · Last updated March 2026. CodeSOTA republishes rankings with editorial context; we do not modify the underlying Elo scores.