# GPU Performance for ML Workloads
Real-world benchmarks comparing RTX 3090, 4090, and 5090 for LLM inference, image generation, and model training.
## Specifications
| Spec | RTX 3090 | RTX 4090 | RTX 5090 |
|---|---|---|---|
| Architecture | Ampere | Ada Lovelace | Blackwell |
| VRAM | 24 GB | 24 GB | 32 GB |
| Memory Bandwidth | 936 GB/s | 1008 GB/s | 1792 GB/s |
| FP16 Performance | 35.6 TFLOPS | 82.6 TFLOPS | 105 TFLOPS |
| TDP | 350W | 450W | 575W |
| MSRP | $1499 | $1599 | $1999 |
| Released | 2020 | 2022 | 2025 |
## ML Benchmarks
### LLM Inference
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | 5090 vs 3090 |
|---|---|---|---|---|
| Llama 3 8B (tokens/sec) | 45 | 95 | 140 | 3.1x |
| Llama 3 70B 4-bit (tokens/sec) | 8 | 22 | 38 | 4.8x |
| Mistral 7B (tokens/sec) | 52 | 110 | 165 | 3.2x |
### Image Generation
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | 5090 vs 3090 |
|---|---|---|---|---|
| SDXL 1024x1024 (it/s) | 1.8 | 4.2 | 6.5 | 3.6x |
| Flux.1 Dev (it/s) | 0.9 | 2.1 | 3.4 | 3.8x |
| SD 1.5 512x512 (it/s) | 12 | 28 | 42 | 3.5x |
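The it/s figures translate directly into wall-clock latency: one image costs `steps / it_per_sec` seconds. A quick sketch using the SDXL rates above (the 30-step count is an illustrative assumption; the table does not state step count):

```python
# Converting diffusion throughput (it/s) into time per image: one image
# takes `steps` denoising iterations. The 30-step default is an assumption.

def seconds_per_image(it_per_sec: float, steps: int = 30) -> float:
    """Seconds to produce one image at the given iteration rate."""
    return steps / it_per_sec

# SDXL 1024x1024 rates from the table above:
for name, rate in [("RTX 3090", 1.8), ("RTX 4090", 4.2), ("RTX 5090", 6.5)]:
    print(f"{name}: {seconds_per_image(rate):.1f} s per 30-step image")
```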
### Training
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | 5090 vs 3090 |
|---|---|---|---|---|
| Fine-tune Llama 3 8B LoRA (samples/sec) | 3.2 | 7.8 | 12.5 | 3.9x |
| YOLO11x training (images/sec) | 45 | 105 | 160 | 3.6x |
| ResNet-50 ImageNet (images/sec) | 850 | 1950 | 2800 | 3.3x |
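Training throughput converts to epoch time the same way: divide the dataset size by images/sec. For the ResNet-50 row above, ImageNet-1k's roughly 1.28M training images give:

```python
# Turning training throughput into epoch time: one epoch takes
# dataset_size / images_per_sec seconds.

IMAGENET_TRAIN_IMAGES = 1_281_167  # ImageNet-1k training-set size

def epoch_minutes(images_per_sec: float) -> float:
    """Minutes per full training epoch at the given throughput."""
    return IMAGENET_TRAIN_IMAGES / images_per_sec / 60

# ResNet-50 rates from the table above:
for name, rate in [("RTX 3090", 850), ("RTX 4090", 1950), ("RTX 5090", 2800)]:
    print(f"{name}: ~{epoch_minutes(rate):.0f} min per epoch")
```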
### Computer Vision
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | 5090 vs 3090 |
|---|---|---|---|---|
| YOLOv8x inference (FPS) | 95 | 210 | 320 | 3.4x |
| SAM ViT-H (masks/sec) | 2.5 | 5.8 | 9.2 | 3.7x |
### Audio
| Benchmark | RTX 3090 | RTX 4090 | RTX 5090 | 5090 vs 3090 |
|---|---|---|---|---|
| Whisper Large v3 (x realtime) | 8 | 18 | 28 | 3.5x |
## VRAM Requirements
Approximate VRAM needed to run popular models at full precision or with quantization.
### Fits on 24 GB (3090/4090)
- Llama 3 8B (FP16) ~16 GB
- Llama 3 70B (2-bit) ~20 GB
- SDXL + LoRA training ~18 GB
- Mistral 7B (FP16) ~14 GB
- Flux.1 Dev ~22 GB
### Needs 32 GB (5090)
- Llama 3 70B (3-bit) ~28 GB
- Mixtral 8x7B (4-bit) ~26 GB
- SDXL full fine-tune ~28 GB
- Large batch training ~30 GB
- Video generation (Mochi) ~26 GB
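The figures above follow a simple rule of thumb: weight memory is parameter count times bytes per parameter, plus headroom for activations, KV cache, and framework buffers. A hedged sketch (the 20% overhead factor is an illustrative assumption; real usage varies with runtime and context length):

```python
# Back-of-envelope VRAM estimate: weights take param-count x bytes-per-param,
# plus headroom for activations, KV cache, and framework buffers. The 20%
# overhead factor is an illustrative assumption.

BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

def estimate_vram_gb(params_b: float, precision: str,
                     overhead: float = 1.2) -> float:
    """Rough inference VRAM in GB for a model with params_b billion params."""
    return params_b * BYTES_PER_PARAM[precision] * overhead

print(f"8B model, fp16: ~{estimate_vram_gb(8, 'fp16'):.0f} GB")
print(f"47B MoE, int4:  ~{estimate_vram_gb(47, 'int4'):.0f} GB")
```

Estimates from this rule skew a little high for lean runtimes and a little low for long contexts, so treat the lists above as ballparks rather than guarantees.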
## Which GPU Should You Get?
### RTX 3090: Best Value
Still capable for most tasks. Great for hobbyists and those on a budget. Available used at significant discounts.
- Pro: Best price/performance when bought used
- Pro: 24 GB handles most models
- Con: Older architecture
- Con: Power hungry (350W)
### RTX 4090: Best Overall
The current sweet spot: roughly 2x faster than the 3090 with better efficiency. Widely available and proven reliable.
- Pro: Excellent performance
- Pro: Better power efficiency
- Pro: Mature ecosystem
- Con: Still limited to 24 GB VRAM
### RTX 5090: Best Performance
The new king. 32GB VRAM opens up larger models. Best for professionals and serious researchers.
- Pro: 32 GB VRAM (finally!)
- Pro: ~50% faster than the 4090
- Pro: Latest architecture
- Con: Higher price ($1999)
- Con: High power draw (575W)
## Cloud GPU Pricing
Approximate hourly rates from popular providers. Prices vary by region and availability.
| Provider | RTX 3090 | RTX 4090 | RTX 5090 |
|---|---|---|---|
| RunPod | $0.22/hr | $0.39/hr | ~$0.69/hr |
| vast.ai (spot) | $0.15/hr | $0.30/hr | TBD |
| Lambda Labs | - | $0.50/hr | TBD |
* Prices as of December 2024. RTX 5090 rates are pre-launch estimates.
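Hourly rates make the rent-vs-buy question easy to sanity-check: the break-even point is the purchase price divided by the hourly rate (ignoring electricity, resale value, and idle time). A quick sketch using the RunPod 4090 rate from the table:

```python
# Break-even between renting and buying: the number of rental hours whose
# cost equals the card's purchase price. Electricity, resale value, and
# idle time are ignored for simplicity.

def breakeven_hours(purchase_price: float, hourly_rate: float) -> float:
    """Rental hours that cost as much as buying the card outright."""
    return purchase_price / hourly_rate

# RTX 4090 at RunPod's ~$0.39/hr versus its $1599 MSRP:
hours = breakeven_hours(1599, 0.39)
print(f"~{hours:,.0f} hours, or ~{hours / 24:.0f} days of 24/7 use")
```

Most hobbyists never reach that many GPU-hours, which is why renting often wins for intermittent workloads.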
## Benchmark Methodology
Benchmarks are collected from community sources, manufacturer data, and independent testing. Results may vary with driver version, cooling, and system configuration. RTX 5090 numbers are based on early benchmarks and official specifications; expect updates as more data becomes available.
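For readers reproducing these numbers, a typical it/s or tokens/sec harness has roughly this shape: a few warmup runs, then a timed loop. The workload below is a pure-Python placeholder standing in for a real model step; GPU benchmarks additionally need device synchronization (e.g. `torch.cuda.synchronize()`) before reading the clock.

```python
import time

# Minimal shape of a throughput harness: warm up, then time a fixed number
# of iterations. The workload is a placeholder; a real benchmark would call
# the model's generation or denoising step instead.

def measure_throughput(workload, iters: int = 20, warmup: int = 3) -> float:
    """Return iterations per second for `workload` after warmup runs."""
    for _ in range(warmup):
        workload()  # warmup: let caches, clocks, and allocators settle
    start = time.perf_counter()
    for _ in range(iters):
        workload()
    return iters / (time.perf_counter() - start)

# Placeholder standing in for one inference step:
print(f"{measure_throughput(lambda: sum(range(100_000))):.1f} it/s")
```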