Cityscapes is a large-scale dataset for semantic urban scene understanding. It provides 5,000 images with high-quality pixel-level (fine) annotations and a further 20,000 images with coarse annotations, all captured across 50 cities. The dataset includes dense semantic segmentation (30 classes), instance segmentation for vehicles and people, stereo pairs, preceding and trailing video frames, and metadata such as GPS coordinates and vehicle odometry. It is used as a benchmark for pixel-level, instance-level, and panoptic semantic labeling.
No results indexed yet. Be the first to submit a score.
Submit a checkpoint and a reproduction script. We will run it, publish the score, and, if it takes the top spot, annotate the step on the progress chart with your name.