Semantic Scholar Open Access 2021 165 citations

Translating Images into Maps

Avishkar Saha Oscar Alejandro Mendez Maldonado Chris Russell R. Bowden

Abstract

We approach instantaneous mapping, converting images to a top-down view of the world, as a translation problem. We show how a novel form of transformer network can be used to map from images and video directly to an overhead map or bird's-eye-view (BEV) of the world, in a single end-to-end network. We assume a one-to-one correspondence between vertical scanlines in the image and rays passing through the camera location in an overhead map. This lets us formulate map generation from an image as a set of sequence-to-sequence translations. Posing the problem as translation allows the network to use the context of the image when interpreting the role of each pixel. This constrained formulation, based upon a strong physical grounding of the problem, leads to a restricted transformer network that is convolutional in the horizontal direction only. The structure allows us to make efficient use of data when training, and obtains state-of-the-art results for instantaneous mapping on three large-scale datasets, including 15% and 30% relative gains over the best-performing existing methods on the nuScenes and Argoverse datasets, respectively.
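The scanline-to-ray idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single cross-attention step with no positional encodings or learned training, and the names `translate_columns`, `ray_queries`, `w_k`, and `w_v` are hypothetical. Each image column (a sequence of H pixel features) is translated into one polar ray (a sequence of R depth bins), and the same weights are reused for every column, which is one way to read "convolutional in the horizontal direction only".

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def translate_columns(features, ray_queries, w_k, w_v):
    """Translate each image column into one polar ray (illustrative only).

    features:    (H, W, C) image features, one column per vertical scanline
    ray_queries: (R, D) one query vector per depth bin along a ray (assumed)
    w_k, w_v:    (C, D) key/value projections shared by all columns (assumed)
    Returns the polar map (R, W, D) and the attention weights (R, W, H).
    """
    keys = features @ w_k    # (H, W, D)
    values = features @ w_v  # (H, W, D)
    # scores[r, w, h]: match between depth bin r and pixel h of column w.
    scores = np.einsum("rd,hwd->rwh", ray_queries, keys) / np.sqrt(keys.shape[-1])
    attn = softmax(scores, axis=-1)  # each depth bin attends over its own column only
    polar = np.einsum("rwh,hwd->rwd", attn, values)
    return polar, attn

# Toy sizes, chosen arbitrarily for the example.
rng = np.random.default_rng(0)
H, W, C, D, R = 8, 6, 4, 5, 10
features = rng.normal(size=(H, W, C))
ray_queries = rng.normal(size=(R, D))
w_k = rng.normal(size=(C, D))
w_v = rng.normal(size=(C, D))

polar, attn = translate_columns(features, ray_queries, w_k, w_v)
print(polar.shape)  # (10, 6, 5): R depth bins, W rays, D channels
```

Because attention never mixes columns and the weights are shared, shifting the input horizontally shifts the polar output by the same amount. A full system would still need to resample the polar map onto a Cartesian BEV grid using the camera intrinsics, which this sketch leaves out.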

Topics & Keywords

Computer Science

Authors (4)

Avishkar Saha

Oscar Alejandro Mendez Maldonado

Chris Russell

R. Bowden

Citation Format

APA

Saha, A., Maldonado, O. A. M., Russell, C., & Bowden, R. (2021). Translating Images into Maps. https://doi.org/10.1109/icra46639.2022.9811901

Journal Information

Year Published: 2021
Language: en
Total Citations: 165
Source Database: Semantic Scholar
DOI: 10.1109/icra46639.2022.9811901
Access: Open Access ✓