Image-to-Video

Image-to-video generation animates a single still image into a coherent video sequence. It is among the harder generation tasks because it demands both visual fidelity and temporal consistency across frames. Stable Video Diffusion (2023) showed that fine-tuning image diffusion models on video data can produce stable motion, and Runway's Gen-3 and Kling demonstrated commercial viability. The key challenge remains physics-aware motion: objects should move naturally, lighting should evolve consistently, and the camera should behave like a real one. The task is a cornerstone of the emerging AI filmmaking pipeline.
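To make the two requirements named above concrete, here is a minimal sketch of crude proxies for visual fidelity and temporal consistency. It is an illustration only, not the composite metric used by I2VBench: the function names (`mean_abs_diff`, `first_frame_fidelity`, `temporal_smoothness`) and the frame representation (grayscale frames as nested lists of floats in [0, 1]) are assumptions made for this example.

```python
from typing import List

# Assumed representation: a grayscale frame is a list of rows,
# each row a list of pixel intensities in [0, 1].
Frame = List[List[float]]


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Mean absolute per-pixel difference between two same-sized frames."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            total += abs(pa - pb)
            count += 1
    if count == 0:
        raise ValueError("frames must contain at least one pixel")
    return total / count


def first_frame_fidelity(source: Frame, video: List[Frame]) -> float:
    """Hypothetical fidelity proxy: 1.0 when the first generated frame
    matches the source image exactly, lower as it drifts away."""
    return 1.0 - mean_abs_diff(source, video[0])


def temporal_smoothness(video: List[Frame]) -> float:
    """Hypothetical consistency proxy: 1.0 for a static clip, lower when
    consecutive frames change abruptly."""
    if len(video) < 2:
        return 1.0
    diffs = [mean_abs_diff(prev, cur) for prev, cur in zip(video, video[1:])]
    return 1.0 - sum(diffs) / len(diffs)
```

A completely static clip scores perfectly on the smoothness proxy, so a measure like this says nothing about whether motion is natural; it would need to be paired with some measure of motion quality before it could be useful.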

Datasets: 1
Results: 0
Canonical metric: composite

Canonical Benchmark

I2VBench

Evaluates image-to-video generation quality and consistency

Primary metric: composite

Top 10

Leading models on I2VBench.

No results have been submitted yet.

All datasets

1 dataset tracked for this task.

Run Inference

Hugging Face hosts inference for this task type.
