AI Image Generation Arena:
Text-to-Image Rankings
Elo ratings computed from 4.3 million pairwise human preference votes across 51 text-to-image models. Users were shown two images for the same prompt and asked which they preferred — no labels, no bias. Updated 2025.
Google leads overall
Five of the top 15 slots are Google models: #1, #3, #4, #12, and #15.
Flux 2 dominates mid-tier
Black Forest Labs places 4 Flux 2 variants (ranks 8, 10, 11, and 14) in the top 15.
Apache 2.0 options exist
Qwen-Image-2512 (#18, Elo 1136) and Z-Image-Turbo (#30) are fully open licensed.
Full Leaderboard — Top 30
Showing top 30 of 51 models. Elo scores use the standard logistic model with a 400-point scale factor. CI = 95% confidence interval on the Elo estimate.
| Rank | Model | Vendor | Elo | ±CI | Votes | License |
|---|---|---|---|---|---|---|
| 🥇 | gemini-3.1-flash-image-preview | Google | 1266 | ±7 | 15K | Proprietary |
| 🥈 | gpt-image-1.5-high-fidelity | OpenAI | 1244 | ±4 | 63K | Proprietary |
| 🥉 | gemini-3-pro-image-preview-2k | Google | 1235 | ±5 | 58K | Proprietary |
| 4 | gemini-3-pro-image-preview | Google | 1232 | ±5 | 83K | Proprietary |
| 5 | mai-image-2 | Microsoft AI | 1189 | ±8 | 6K | Proprietary |
| 6 | reve-v1.5 | Reve | 1177 | ±6 | 8K | Proprietary |
| 7 | grok-imagine-image | xAI | 1173 | ±4 | 49K | Proprietary |
| 8 | flux-2-max | Black Forest Labs | 1167 | ±4 | 66K | Proprietary |
| 9 | grok-imagine-image-pro | xAI | 1160 | ±4 | 48K | Proprietary |
| 10 | flux-2-flex | Black Forest Labs | 1158 | ±4 | 102K | Proprietary |
| 11 | flux-2-pro | Black Forest Labs | 1157 | ±4 | 97K | Proprietary |
| 12 | gemini-2.5-flash-image | Google | 1154 | ±3 | 696K | Proprietary |
| 13 | hunyuan-image-3.0 | Tencent | 1151 | ±3 | 173K | Community |
| 14 | flux-2-dev | Black Forest Labs | 1150 | ±5 | 50K | Proprietary |
| 15 | imagen-ultra-4.0 | Google | 1147 | ±4 | 390K | Proprietary |
| 16 | seedream-4.5 | Bytedance | 1145 | ±4 | 102K | Proprietary |
| 17 | seedream-4-2k | Bytedance | 1141 | ±6 | 13K | Proprietary |
| 18 | qwen-image-2512 | Alibaba | 1136 | ±4 | 48K | Apache 2.0 |
| 19 | wan2.6-t2i | Alibaba | 1135 | ±4 | 43K | Proprietary |
| 20 | imagen-4.0 | Google | 1133 | ±3 | 462K | Proprietary |
| 21 | seedream-4-fal | Bytedance | 1117 | ±6 | 12K | Proprietary |
| 22 | wan2.5-t2i-preview | Alibaba | 1115 | ±4 | 138K | Proprietary |
| 23 | gpt-image-1 | OpenAI | 1115 | ±3 | 266K | Proprietary |
| 24 | seedream-4-high-res | Bytedance | 1114 | ±4 | 117K | Proprietary |
| 25 | seedream-5.0-lite | Bytedance | 1113 | ±5 | 21K | Proprietary |
| 26 | gpt-image-1-mini | OpenAI | 1103 | ±4 | 106K | Proprietary |
| 27 | recraft-v4 | Recraft | 1102 | ±7 | 14K | Proprietary |
| 28 | mai-image-1 | Microsoft AI | 1093 | ±4 | 94K | Proprietary |
| 29 | seedream-3 | Bytedance | 1082 | ±5 | 37K | Proprietary |
| 30 | z-image-turbo | Alibaba | 1076 | ±6 | 12K | Apache 2.0 |
*Chart: Elo score distribution across the top 30 models.*
Google vs Everyone
Google's multimodal image generation capabilities are unmatched at the top of the leaderboard. Gemini 3.1 Flash Image Preview achieves an Elo of 1266 — a full 22 points ahead of OpenAI's GPT-Image-1.5 in second place.
What makes this more striking is range: Google holds six leaderboard positions in the top 20, spanning Gemini 3.1 Flash (#1), Gemini 3 Pro (#3 and #4), Gemini 2.5 Flash (#12), Imagen Ultra 4.0 (#15), and Imagen 4.0 (#20). No other vendor comes close to this breadth.
*Chart: Google models in the top 20.*
Black Forest Labs: Flux 2 Sweep
Black Forest Labs launched the original Flux in 2024 and immediately disrupted the open-weight image generation market. Flux 2 consolidates that lead: four variants cluster tightly between Elo 1150 and 1167, at ranks 8, 10, 11, and 14.
The tight clustering (a 17-point spread across four models) suggests BFL has found a capability ceiling with the current architecture. Flux 2 Max (Elo 1167) edges ahead on quality at the cost of inference speed, while Flux 2 Dev (1150) provides open-weights access for local deployment.
*Chart: Flux 2 variant Elo scores.*
Bytedance Seedream: The Rising Force
Bytedance's Seedream family has quietly become one of the most-tested model families in the arena. Six variants appear in the top 30, accumulating over 300K combined votes — a sign of significant deployment and user interest.
Seedream 4.5 (Elo 1145, rank 16) leads the family, sitting just above Imagen 4.0 and outpacing all OpenAI standard-tier models. The progression from Seedream 3 (1082) to Seedream 4.5 (1145) represents a 63-point Elo gain in a single generation cycle — rapid improvement by any measure.
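Under the standard logistic Elo model with the 400-point scale, a rating gap translates directly into an expected head-to-head win rate. A minimal sketch (the `win_prob` helper is our own, not part of the arena's tooling) applied to the 63-point Seedream gap:

```python
# Expected win probability implied by an Elo gap, using the standard
# logistic Elo formula with the 400-point scale factor.
def win_prob(elo_gap: float) -> float:
    """Probability the higher-rated model wins a single comparison."""
    return 1.0 / (1.0 + 10 ** (-elo_gap / 400))

# Seedream 4.5 (1145) vs Seedream 3 (1082): a 63-point gap.
p = win_prob(1145 - 1082)
print(f"{p:.1%}")  # prints 59.0%
```

In other words, a 63-point jump means human voters prefer Seedream 4.5 over Seedream 3 in roughly 59% of direct matchups.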
*Chart: Seedream family Elo progression.*
Open-Source & Permissive Options
For teams that cannot use proprietary APIs — due to data privacy, cost, or licensing — two models stand out with permissive Apache 2.0 licenses:
Qwen-Image-2512
Alibaba · Rank #18 · Apache 2.0 · 48K votes
The highest-ranked Apache 2.0 model. Commercially usable with no restrictions. Competitive with mid-tier proprietary models, outperforming GPT-Image-1 standard tier.
Z-Image-Turbo
Alibaba · Rank #30 · Apache 2.0 · 12K votes
A turbo-class permissive model. Lower vote count suggests it is newer; headroom for further Elo movement as more comparisons accumulate.
Hunyuan-Image-3.0
Tencent · Rank #13 · Community · 173K votes
The community-licensed alternative with the most votes of any non-proprietary model. Strong photorealism, widely used in Chinese-market deployments.
Methodology
Blind Pairwise Voting
Users are shown two images generated from the same prompt by two different models, with no labels. They select which image they prefer. This blind comparison eliminates brand bias.
Elo Rating System
Elo ratings update after each vote based on expected vs actual outcome. The system is the same algorithm used in chess, adapted for multi-model comparison. Starting Elo is 1000 for all models.
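The update rule described above can be sketched in a few lines. This is the textbook Elo update, not the arena's published implementation; the K-factor of 32 is an assumed value, and the arena may use a different update schedule.

```python
# Minimal sketch of a pairwise Elo update. K=32 is a common chess
# default and an assumption here, not the arena's documented setting.
def expected_score(r_a: float, r_b: float) -> float:
    """Expected probability that model A beats model B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a: float, r_b: float, a_won: bool, k: float = 32.0):
    """Return (new_r_a, new_r_b) after one vote between A and B."""
    e_a = expected_score(r_a, r_b)
    s_a = 1.0 if a_won else 0.0
    return r_a + k * (s_a - e_a), r_b + k * ((1.0 - s_a) - (1.0 - e_a))

# All models start at Elo 1000; one win moves the pair symmetrically.
ra, rb = elo_update(1000.0, 1000.0, a_won=True)
print(ra, rb)  # prints 1016.0 984.0
```

Note that the total rating is conserved: whatever A gains, B loses, which is why upsets of high-rated models by low-rated ones produce the largest swings.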
Confidence Intervals
The ±CI values are 95% bootstrap confidence intervals. Models with fewer votes (e.g. mai-image-2 at 6K) have wider intervals (±8) than heavily-voted models like gemini-2.5-flash-image (696K votes, ±3).
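To illustrate why vote count drives interval width, here is a simplified bootstrap sketch. The arena bootstraps full Elo refits over all votes; resampling a single model's win/loss record, as below, is a deliberate simplification, and `bootstrap_ci` is our own illustrative helper.

```python
import random

# Simplified percentile-bootstrap CI on a win rate, showing how the
# interval tightens as the vote count grows (roughly as 1/sqrt(n)).
def bootstrap_ci(wins: int, n: int, reps: int = 1000, seed: int = 0):
    """95% bootstrap CI on a win rate from `wins` wins out of `n` votes."""
    rng = random.Random(seed)
    votes = [1] * wins + [0] * (n - wins)
    rates = sorted(sum(rng.choices(votes, k=n)) / n for _ in range(reps))
    return rates[int(0.025 * reps)], rates[int(0.975 * reps)]

# Same 55% win rate, 10x difference in vote count:
lo_few, hi_few = bootstrap_ci(wins=275, n=500)
lo_many, hi_many = bootstrap_ci(wins=2750, n=5000)
print(hi_few - lo_few, hi_many - lo_many)  # the larger sample is tighter
```

This mirrors the pattern in the table: mai-image-2's 6K votes yield a ±8 interval while gemini-2.5-flash-image's 696K votes shrink it to ±3.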
Prompt Coverage
Prompts span photography, illustration, concept art, product design, portraits, landscapes, and abstract art. The distribution is crowd-sourced from real users rather than curated, reflecting actual use cases.
What Elo Measures
Elo captures human preference — not technical quality metrics like FID or IS. A model may have lower FID but higher Elo if humans simply prefer its aesthetic output. Both matter for different use cases.
Data Source
Leaderboard data is sourced from the LMSYS Chatbot Arena / Imgen Arena project, which runs continuous crowd-sourced evaluations. Scores represent a snapshot with 4.3M total votes across 51 models.
Frequently Asked Questions
What is the best AI image generator in 2025?
Based on 4.3M human preference votes, Gemini 3.1 Flash Image Preview (Elo 1266) leads the text-to-image arena, followed by GPT-Image-1.5 High Fidelity (1244) and Gemini 3 Pro Image Preview (1235). For open-source options, Qwen-Image-2512 (1136, Apache 2.0) is the top freely licensed model.
How does the text-to-image arena work?
The arena shows users two generated images from different models for the same prompt, and asks which they prefer. Elo ratings are computed from these head-to-head pairwise comparisons — the same system used in chess rankings. A higher Elo means humans consistently prefer that model's images.
Is Flux 2 better than Stable Diffusion?
Yes. Flux 2 from Black Forest Labs substantially outperforms legacy Stable Diffusion. Flux 2 Max (Elo 1167), Flux 2 Flex (1158), Flux 2 Pro (1157), and Flux 2 Dev (1150) all rank in the top 15 globally, making BFL the top open-weight image generation lab by breadth of strong models.
What is the best open-source text-to-image model?
Qwen-Image-2512 by Alibaba (Elo 1136) is the top Apache 2.0 licensed text-to-image model, ranking 18th overall. Z-Image-Turbo (Elo 1076, also Apache 2.0) is another permissively licensed option. Flux 2 models are also available as open weights, though under a non-commercial license.
How does Google's Imagen compare to OpenAI's GPT-Image?
Google dominates the top of the leaderboard. Gemini 3.1 Flash Image Preview (Elo 1266) and Gemini 3 Pro variants (1235, 1232) rank 1st, 3rd, and 4th. OpenAI's GPT-Image-1.5 High Fidelity (1244) takes 2nd, and Imagen Ultra 4.0 (1147) and Imagen 4.0 (1133) round out Google's strong showing across all tiers.
What is Seedream by Bytedance?
Seedream is Bytedance's family of text-to-image models. Seedream 4.5 (Elo 1145) leads the family at rank 16, with multiple variants (4-2k, 4-fal, 4-high-res, 5.0-lite, and Seedream 3) all placing in the top 30. This makes Bytedance one of the most prolific image generation labs in the arena.
Vendor Summary
| Vendor | Models in Top 30 | Top Elo |
|---|---|---|
| Google | 6 | 1266 |
| Bytedance | 6 | 1145 |
| Black Forest Labs | 4 | 1167 |
| Alibaba | 3 | 1136 |
| OpenAI | 3 | 1244 |
| xAI | 2 | 1173 |
| Microsoft AI | 2 | 1189 |
| Reve | 1 | 1177 |
| Tencent | 1 | 1151 |
| Recraft | 1 | 1102 |