Mapillary Vistas - Short Review

Mapillary Vistas Dataset Overview

The Mapillary Vistas Dataset is a large, diverse street-level image dataset built to advance computer vision algorithms, particularly for semantic segmentation, instance-specific segmentation, and autonomous driving.

Purpose and Application

The primary goal of the Mapillary Vistas Dataset is to provide a rich and varied set of annotated images to train and evaluate machine learning models for understanding street scenes. This dataset is invaluable for researchers, developers, and companies working on autonomous vehicles, smart cities, geospatial services, and other applications requiring detailed street-level imagery analysis.

Key Features

Large-Scale and Diverse: The dataset contains 25,000 high-resolution images captured from around the world, covering various geographic locations, weather conditions, seasons, and times of day. Images are taken using different devices such as mobile phones, tablets, action cameras, and professional capturing rigs, ensuring a broad range of perspectives and qualities.
Detailed Annotations: The images are annotated into 66 object categories, with additional instance-specific labels for 37 classes. Annotators delineate individual objects with polygons, which yields dense, fine-grained labels. The total volume of annotation is about 5 times that of the fine annotations available in the Cityscapes dataset.
Extended Annotations: An extension of the dataset introduced over 60 new object classes, including more granular annotations for road markings, driveways, barriers, and signage. A significant addition is the inclusion of traffic light states, with 100,000 annotations of different states such as red, green, yellow, direction, and off.
Global Coverage: The dataset includes images from 190 countries, making it the most geographically diverse street-level image dataset. This coverage captures the broad range of outdoor scenes and object appearances found worldwide.
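
To make the link between polygon annotations and dense, per-pixel labels concrete, here is a minimal sketch of turning labeled polygons into a label mask with an even-odd ray-casting test at each pixel center. The input format (a list of `(label_id, polygon)` pairs), the function names, and the label value 7 are hypothetical and chosen for illustration; they are not the dataset's actual file format or tooling.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(x: float, y: float, polygon: List[Point]) -> bool:
    """Even-odd ray casting: count edge crossings to the right of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross the ray.
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(
    polygons: List[Tuple[int, List[Point]]],
    width: int,
    height: int,
    background: int = 0,
) -> List[List[int]]:
    """Paint each polygon's label onto a height x width grid.

    Polygons later in the list overwrite earlier ones where they overlap.
    """
    mask = [[background] * width for _ in range(height)]
    for label_id, polygon in polygons:
        for row in range(height):
            for col in range(width):
                # Sample at the pixel center.
                if point_in_polygon(col + 0.5, row + 0.5, polygon):
                    mask[row][col] = label_id
    return mask


# Hypothetical example: one square object with label 7 on a 4x4 image.
square = [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)]
for mask_row in rasterize([(7, square)], width=4, height=4):
    print(mask_row)
```

This brute-force loop checks every pixel against every polygon, which is fine for illustration but far too slow for high-resolution images; real pipelines would typically rely on an optimized rasterizer.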

Functionality

Semantic and Instance-Specific Segmentation: The dataset is designed for training state-of-the-art semantic segmentation and instance-specific segmentation models. Its pixel-wise and instance-specific annotations support the development of advanced algorithms for visual road-scene understanding.
Autonomous Driving and GIS: The dataset is particularly useful for autonomous driving, providing the detailed annotations needed to build navigation systems and HD maps and to improve the perception capabilities of autonomous vehicles. It also supports GIS and mapping services, since models trained on it can extract diverse map data from images.
Research and Commercial Use: The dataset is available for both academic and commercial use, with a commercial edition offering annotations for 100 object classes and instance-specific annotations for 58 classes. This makes it a valuable resource for both researchers and industry professionals.
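
Since the dataset is meant for both training and evaluating segmentation models, a short sketch of one common evaluation metric may help. Mean intersection-over-union (mIoU) is widely used for semantic segmentation; this review does not state which metric is used with the dataset, so treat the choice as an assumption. The function names and the flattened-label input format below are illustrative only.

```python
from typing import List, Optional


def confusion_matrix(truth: List[int], pred: List[int], num_classes: int) -> List[List[int]]:
    """Count pixels by (true class, predicted class); rows are ground truth."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        matrix[t][p] += 1
    return matrix


def per_class_iou(matrix: List[List[int]]) -> List[Optional[float]]:
    """IoU = TP / (TP + FP + FN) per class; None if the class never appears."""
    n = len(matrix)
    ious: List[Optional[float]] = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                       # true c, predicted as other
        fp = sum(matrix[r][c] for r in range(n)) - tp  # other, predicted as c
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious


def mean_iou(matrix: List[List[int]]) -> float:
    """Average IoU over classes that occur in the truth or the prediction."""
    valid = [v for v in per_class_iou(matrix) if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Toy example: six pixels, three hypothetical classes (flattened label maps).
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(truth, pred, num_classes=3)
print(per_class_iou(cm))
print(mean_iou(cm))
```

Accumulating a single confusion matrix over all images before computing IoU, rather than averaging per-image scores, keeps rare classes from being dominated by images in which they do not occur.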

Additional Capabilities

Privacy Protection: Mapillary protects privacy by blurring faces and license plates detected in the images.
3D Reconstruction and Traffic Sign Recognition: While not part of the dataset itself, Mapillary’s platform uses these images to perform 3D reconstruction and traffic sign recognition, further enhancing the utility of the dataset in real-world applications.

The Mapillary Vistas Dataset stands out as a premier resource for anyone involved in developing advanced computer vision algorithms, especially those focused on street-level imagery and autonomous driving. Its extensive annotations, global coverage, and diverse image set make it an indispensable tool for research and industry applications.