Virtual KITTI 2 is a synthetic, photo-realistic driving dataset, a revised and expanded version of the original Virtual KITTI. It provides synthetic "clones" of 5 sequences from the KITTI tracking benchmark (scenes 01, 02, 06, 18, and 20), each with multiple variants (e.g., different weather such as fog and rain, and modified camera configurations such as rotations). For every sequence and variant, the dataset supplies multi-modal ground truth: stereo RGB, dense depth, semantic (class) segmentation, instance segmentation, optical flow, scene flow, camera parameters and poses, and vehicle locations. The dataset was rebuilt with a modern game engine for improved photorealism and is intended for tasks such as depth estimation, segmentation, flow estimation, and domain transfer (synthetic-to-real) evaluation. It is distributed for non-commercial research use under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 license (copyright Naver Corporation). Reported statistics give a total of about 21,260 stereo pairs across the 5 cloned scenes, with scene-level counts listed in the release. Sources: arXiv:2001.10773 (Virtual KITTI 2) and the Naver Labs Europe project page. The dataset is also used in the paper "Depth Anything" (arXiv:2401.10891) for zero-shot metric depth evaluation (Table 5).
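To make the scene/variant structure and the reported total concrete, here is a minimal Python sketch that enumerates every (scene, variant) combination and sums the stereo pairs. The five scene numbers come from the description above; the `SceneNN` naming, the ten variant names, and the per-scene frame counts are assumptions recalled from the public release and should be checked against the actual download before being relied on.

```python
from itertools import product

# Scene numbers are from the dataset description; the "SceneNN" naming is an assumption.
SCENES = ("Scene01", "Scene02", "Scene06", "Scene18", "Scene20")

# Assumed variant names (weather, lighting, and camera rotations); verify against the release.
VARIANTS = (
    "clone", "fog", "morning", "overcast", "rain", "sunset",
    "15-deg-left", "15-deg-right", "30-deg-left", "30-deg-right",
)

# Assumed frames per scene, matching the lengths of the cloned KITTI tracking sequences.
FRAMES_PER_SCENE = {
    "Scene01": 447,
    "Scene02": 233,
    "Scene06": 270,
    "Scene18": 339,
    "Scene20": 837,
}


def scene_variant_pairs():
    """Return every (scene, variant) combination as a list of tuples."""
    return list(product(SCENES, VARIANTS))


def total_stereo_pairs():
    """Sum stereo pairs over all combinations, one pair per frame per variant."""
    return sum(FRAMES_PER_SCENE[scene] for scene, _ in scene_variant_pairs())


if __name__ == "__main__":
    print(len(scene_variant_pairs()), "scene/variant combinations")
    print(total_stereo_pairs(), "stereo pairs in total")
```

Under these assumptions the sum agrees with the reported figure of about 21,260 stereo pairs, which is a useful sanity check when verifying that a local copy of the dataset is complete.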