Best GPUs for Machine Learning in 2026
Comprehensive guide covering RTX 5090, H200, B200, and MI300X with real benchmarks, VRAM requirements for Llama 3.1, Qwen 2.5, and DeepSeek V3, plus cloud pricing from RunPod, Lambda Labs, and Vast.ai.
GPU VRAM at a Glance

Consumer GPUs range from 8-32GB, while datacenter GPUs offer 80-192GB of HBM for large-model training.
Consumer GPU Specifications
The GPUs most ML practitioners can actually buy and put in a workstation.
| Spec | RTX 3090 | RTX 4090 | RTX 5080 | RTX 5090 |
|---|---|---|---|---|
| Architecture | Ampere | Ada Lovelace | Blackwell | Blackwell |
| VRAM | 24 GB GDDR6X | 24 GB GDDR6X | 16 GB GDDR7 | 32 GB GDDR7 |
| Memory Bandwidth | 936 GB/s | 1008 GB/s | 960 GB/s | 1792 GB/s |
| FP16 Performance | 35.6 TFLOPS | 165.2 TFLOPS | 112.6 TFLOPS | 209.5 TFLOPS |
| TDP | 350W | 450W | 360W | 575W |
| Price | ~$700-900 used | $1,599 MSRP | $999 MSRP | $1,999 MSRP |
| Status | Used market | Available | Available | Available |
Datacenter GPUs
Available through cloud providers. These are what you rent for large-scale training and serving production models.
| GPU | VRAM | Bandwidth | FP16 TFLOPS | Cloud Price | Status |
|---|---|---|---|---|---|
| A100 80GB | 80 GB HBM2e | 2.0 TB/s | 312 | ~$2/hr cloud | Widely available |
| H100 SXM | 80 GB HBM3 | 3.4 TB/s | 989 | ~$2.50/hr cloud | Widely available |
| H200 | 141 GB HBM3e | 4.8 TB/s | 989 | ~$3.70/hr cloud | Available |
| MI300X | 192 GB HBM3 | 5.3 TB/s | 2,615 | ~$3-5/hr cloud | Available |
| B200 | 192 GB HBM3e | 8.0 TB/s | 4,500 | ~$6+/hr cloud | Limited |
Price / Performance Ratio

Measured as cost per FP16 TFLOP at retail/street prices (lower is better), the RTX 5080 and 5090 offer the best compute per dollar among new consumer GPUs.
ML Benchmarks
Real-world performance across LLM inference, image generation, training, and computer vision tasks.
LLM Inference
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | vs 3090 |
|---|---|---|---|---|
| Llama 3.1 8B (tokens/sec) | 45 | 95 | 140 | 3.1x |
| Llama 3.1 70B 4-bit (tokens/sec) | 8 | 22 | 38 | 4.8x |
| Qwen 2.5 32B 4-bit (tokens/sec) | 12 | 30 | 48 | 4.0x |
| Mistral 7B (tokens/sec) | 52 | 110 | 165 | 3.2x |
Image Generation
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | vs 3090 |
|---|---|---|---|---|
| SDXL 1024x1024 (it/s) | 1.8 | 4.2 | 6.5 | 3.6x |
| Flux.1 Dev (it/s) | 0.9 | 2.1 | 3.4 | 3.8x |
Training
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | vs 3090 |
|---|---|---|---|---|
| Fine-tune Llama 3.1 8B LoRA (samples/sec) | 3.2 | 7.8 | 12.5 | 3.9x |
| ResNet-50 ImageNet (images/sec) | 850 | 1950 | 2800 | 3.3x |
Computer Vision
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | vs 3090 |
|---|---|---|---|---|
| YOLOv8x inference (FPS) | 95 | 210 | 320 | 3.4x |
| SAM ViT-H (masks/sec) | 2.5 | 5.8 | 9.2 | 3.7x |
Audio/Video
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | vs 3090 |
|---|---|---|---|---|
| Whisper Large v3 (x realtime) | 8 | 18 | 28 | 3.5x |
VRAM Requirements by Model
How much VRAM you need to run or fine-tune popular open-source models. Use INT4 quantization (GPTQ/AWQ) to fit larger models on consumer GPUs.

| Model | Params | FP16 | INT8 | INT4 | LoRA FT | Fits On |
|---|---|---|---|---|---|---|
| Llama 3.1 8B | 8B | 16 GB | 8 GB | 4 GB | 24 GB | RTX 5080 (16GB) |
| Qwen 2.5 14B | 14B | 28 GB | 14 GB | 7 GB | 36 GB | RTX 5080 (16GB) |
| Qwen 2.5 32B | 32B | 64 GB | 32 GB | 16 GB | 48 GB | RTX 4090 (24GB) |
| Llama 3.1 70B | 70B | 140 GB | 70 GB | 36 GB | 80 GB | A100/H100 (80GB) |
| DeepSeek V3 | 671B MoE | 1.3 TB | 671 GB | 336 GB | - | Multi-GPU |
| Llama 3.1 405B | 405B | 810 GB | 405 GB | 203 GB | - | Multi-GPU |
Rule of Thumb
Model weights in FP16 = 2 bytes x parameters. A 7B model needs ~14GB, a 70B model needs ~140GB. INT4 quantization cuts this by 4x with minimal quality loss.
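The rule of thumb can be sketched in a few lines of Python. The byte widths are the standard FP16/INT8/INT4 values; this counts weights only and ignores the small runtime overhead (CUDA context, KV cache) reflected in the table above:

```python
# Byte widths for common precisions (standard values)
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Weights-only footprint: parameters x bytes per parameter."""
    return params_billions * BYTES_PER_PARAM[precision]

print(weight_memory_gb(7))           # 14.0 -> ~14 GB for a 7B model at FP16
print(weight_memory_gb(70, "int4"))  # 35.0 -> close to the table's 36 GB (which adds overhead)
```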
Fine-tuning Needs 3-4x More
Full fine-tuning requires VRAM for weights + gradients + optimizer states + activations. Use QLoRA to fine-tune a 13B model on a single RTX 4090 (24GB).
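As a rough sketch, the common ~16 bytes/parameter rule for mixed-precision Adam captures the 3-4x blowup; the 1% adapter fraction in the QLoRA estimator is an illustrative assumption, not a fixed property of the method, and both estimates exclude activation memory:

```python
def full_finetune_gb(params_billions: float) -> float:
    """Mixed-precision Adam: fp16 weights + grads (2+2 bytes/param) plus
    fp32 master weights and two optimizer moments (4+8 bytes/param),
    roughly 16 bytes/param before activations."""
    return params_billions * 16.0

def qlora_finetune_gb(params_billions: float, adapter_frac: float = 0.01) -> float:
    """QLoRA: 4-bit frozen base (0.5 bytes/param) plus a small trainable
    adapter (adapter_frac of params) paying the full ~16 bytes/param."""
    return params_billions * 0.5 + params_billions * adapter_frac * 16.0
```

By this estimate a full fine-tune of an 8B model needs ~128GB (multi-GPU territory), while QLoRA on a 13B model needs under 10GB for weights and optimizer state, leaving headroom on a 24GB RTX 4090 for activations.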
Context Length Matters
Long context windows (128K+ tokens) significantly increase KV cache memory. A 70B model at 128K context can need 50+ GB extra for KV cache alone.
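The KV cache cost is easy to estimate. Using Llama 3.1 70B's published architecture (80 layers, GQA with 8 KV heads, head dimension 128), a 128K context at FP16 comes to roughly 43GB per sequence; batching multiplies this linearly, which is how real deployments pass the 50GB mark:

```python
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context: int, batch: int = 1, bytes_per_elem: int = 2) -> float:
    """Two tensors (K and V) per layer, each context x kv_heads x head_dim."""
    return 2 * layers * kv_heads * head_dim * context * batch * bytes_per_elem / 1e9

# Llama 3.1 70B: 80 layers, GQA with 8 KV heads, head dim 128
print(kv_cache_gb(80, 8, 128, 131_072))  # ~43 GB at FP16, batch 1
```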
Cloud GPU Pricing
Current hourly rates from major GPU cloud providers. Prices as of March 2026.

| Provider | GPU | Type | Price/hr | $/day | $/month (24/7) |
|---|---|---|---|---|---|
| Vast.ai | RTX 4090 | spot | $0.29 | $7 | $212 |
| RunPod | RTX 4090 | community | $0.34 | $8 | $248 |
| RunPod | RTX 4090 | secure | $0.59 | $14 | $431 |
| RunPod | A100 80GB | on-demand | $1.99 | $48 | $1453 |
| Vast.ai | A100 80GB | on-demand | $2.00 | $48 | $1460 |
| Lambda Labs | A100 80GB | on-demand | $2.49 | $60 | $1818 |
| Vast.ai | H100 | on-demand | $1.65 | $40 | $1205 |
| RunPod | H100 | on-demand | $2.49 | $60 | $1818 |
| Together AI | H100 | on-demand | $2.99 | $72 | $2183 |
| GCP | H200 | spot | $3.72 | $89 | $2716 |
| AWS | H100 | on-demand | $6.88 | $165 | $5022 |
Prices checked March 2026. Spot/community instances may be interrupted. Hyperscaler (AWS/GCP) prices are per-GPU equivalent from multi-GPU instances.
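The $/month column is just the hourly rate times ~730 hours (an average month running 24/7), rounded to the dollar:

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year / 12 months

def monthly_cost(price_per_hour: float) -> int:
    """24/7 monthly cost from an hourly cloud rate."""
    return round(price_per_hour * HOURS_PER_MONTH)

print(monthly_cost(0.29))  # 212, matching the Vast.ai RTX 4090 spot row
print(monthly_cost(6.88))  # 5022, matching the AWS H100 row
```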
Apple Silicon for ML
Apple's M-series chips offer a unique advantage: unified memory that lets you run models that exceed typical GPU VRAM, at the cost of lower throughput.
M4 Max (Mac Studio)
Best for: Running 70B models locally for development and testing. Fits Llama 3.1 70B at 8-bit comfortably in 128GB unified memory, something no consumer GPU can match.
M3 Ultra (Mac Studio/Pro)
Best for: Running 70B-100B+ models at higher quality. 192GB unified memory fits models that would otherwise need an H100, and it is roughly 28% faster than the M4 Max in sustained multi-core workloads.
Apple Silicon Caveats
- PyTorch MPS backend is functional but slower than CUDA for most operations
- Training throughput is typically 3-5x slower than an equivalent NVIDIA GPU
- No multi-GPU scaling (unlike NVLink for NVIDIA)
- Great for inference and prototyping; not recommended for production training
- The MLX framework offers better Apple Silicon optimization than PyTorch MPS
Budget ML GPUs
Not everyone needs a $2,000 GPU. Here are the best options under $1,000 for ML work.
| GPU | VRAM | FP16 TFLOPS | TDP | Price | $/TFLOP | GB/$ |
|---|---|---|---|---|---|---|
| RTX 5060 | 8 GB | 19.2 | 145W | $299 | $15.6 | 26.8 MB/$ |
| RTX 4060 Ti 16GB | 16 GB | 22 | 165W | $600 | $27.3 | 26.7 MB/$ |
| RTX 5070 Ti | 16 GB | 55 | 250W | $900 | $16.4 | 17.8 MB/$ |
| RTX 3090 (used) | 24 GB | 35.6 | 350W | $800 | $22.5 | 30.0 MB/$ |
Budget Pick: Used RTX 3090 ($700-900)
The used RTX 3090 remains the best budget ML GPU in 2026. At $700-900, you get 24GB VRAM (enough for 70B models at 4-bit quantization), 35.6 FP16 TFLOPS, and a mature CUDA ecosystem. The only downsides are high power draw (350W) and aging architecture. If you need more VRAM on a budget, pair it with a second 3090 for ~$1,600 total or consider an M4 Max Mac for unified memory.
Buy vs. Rent: Break-Even Analysis
When does buying your own GPU save money over renting cloud compute?
| Scenario | GPU | Upfront | Electricity/yr | Cloud equiv/yr | Break-even |
|---|---|---|---|---|---|
| Hobbyist (20 hrs/wk) | RTX 3090 used | $800 | ~$55 | $220 (Vast.ai) | ~5 years |
| Hobbyist (20 hrs/wk) | RTX 5090 | $1,999 | ~$90 | $350 (RunPod) | ~8 years |
| Startup (40 hrs/wk) | RTX 4090 | $1,599 | ~$140 | $700 (RunPod) | ~3 years |
| Startup (H100 needs) | H100 80GB | ~$30,000 | ~$700 | $5,200 (RunPod) | ~7 years |
Assumes $0.15/kWh electricity, full GPU utilization during usage hours, and the cheapest available cloud rates. An H100 80GB costs ~$30,000 on the secondary market (list price is higher). At 40 hrs/week of usage, break-even vs. cloud is ~7 years, so cloud wins for most startups unless they run sustained 24/7 workloads.
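The break-even math is a one-liner, sketched here using the table's own upfront, electricity, and cloud figures:

```python
def break_even_years(upfront: float, cloud_per_year: float,
                     electricity_per_year: float) -> float:
    """Years until owning beats renting: hardware cost divided by the
    annual savings (cloud spend avoided minus electricity paid)."""
    return upfront / (cloud_per_year - electricity_per_year)

print(round(break_even_years(800, 220, 55), 1))       # 4.8 -> ~5 years for the used 3090
print(round(break_even_years(30_000, 5_200, 700), 1)) # 6.7 -> ~7 years for the H100
```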
Which GPU Should You Get?
RTX 3090 (used)
$700-900
Best value in 2026. 24GB handles most 7B-70B models with quantization. Fine-tune with QLoRA, run local inference, experiment freely.
Pros:
- Best price/performance ratio
- 24GB handles most models
- Abundant on the used market

Cons:
- 350W power draw
- Older Ampere architecture
Alternative: RTX 5060 ($299) if VRAM needs are under 8GB
RTX 5090
$1,999
The new king of consumer ML. 32GB GDDR7 unlocks 70B models at 8-bit. 209.5 FP16 TFLOPS with 1.8 TB/s bandwidth. Best single-GPU option.
Pros:
- 32GB VRAM (finally!)
- 209.5 FP16 TFLOPS
- 1,792 GB/s bandwidth
- Blackwell architecture

Cons:
- 575W TDP (needs a beefy PSU)
- $1,999 is steep for some
Alternative: RTX 4090 ($1,599) if 24GB is enough
H200 / B200 (Cloud)
$3.70-6+/hr
For training large models or serving production inference at scale. 141-192GB HBM, multi-TB/s bandwidth. Rent from RunPod, Lambda, or hyperscalers.
Pros:
- 141-192GB HBM3e memory
- 4.8-8 TB/s bandwidth
- Multi-GPU NVLink scaling
- Runs 405B+ models natively

Cons:
- Expensive for sustained use
- Availability can be limited
Alternative: MI300X (192GB, $3-5/hr) for AMD-optimized workloads
Methodology & Sources
GPU specifications from official NVIDIA, AMD, and Apple datasheets. Consumer GPU benchmarks collected from community testing and manufacturer data. Cloud pricing verified from provider websites in March 2026. VRAM requirements calculated using standard formulas (parameters x bytes per parameter) plus overhead estimates from practical testing.
Benchmark results vary based on driver versions, cooling, system configuration, and batch sizes. Cloud prices fluctuate with demand. Always verify current pricing directly with providers. Last updated: March 2026.