Semantic Scholar Open Access 2021 165 citations

Translating Images into Maps

Avishkar Saha Oscar Alejandro Mendez Maldonado Chris Russell R. Bowden

Abstract

We approach instantaneous mapping, converting images to a top-down view of the world, as a translation problem. We show how a novel form of transformer network can be used to map from images and video directly to an overhead map or bird's-eye-view (BEV) of the world, in a single end-to-end network. We assume a 1–1 correspondence between a vertical scanline in the image and the ray passing through the camera location in an overhead map. This lets us formulate map generation from an image as a set of sequence-to-sequence translations. Posing the problem as translation allows the network to use the context of the image when interpreting the role of each pixel. This constrained formulation, based upon a strong physical grounding of the problem, leads to a restricted transformer network that is convolutional in the horizontal direction only. The structure allows us to make efficient use of data when training, and obtains state-of-the-art results for instantaneous mapping on three large-scale datasets, including a 15% and 30% relative gain against existing best performing methods on the nuScenes and Argoverse datasets, respectively.
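The scanline-to-ray correspondence assumed in the abstract can be illustrated with a little pinhole-camera geometry. The sketch below is not the authors' network; it only shows, under an assumed ideal pinhole model, how one image column maps to one ray in the overhead map. The names `column_to_ray_angle`, `ray_to_bev_cells`, `fx`, `cx`, `cell_size` and `num_samples` are hypothetical and chosen for illustration.

```python
import math


def column_to_ray_angle(u, fx, cx):
    """Azimuth (radians) of the overhead-map ray seen by image column u.

    Assumes an ideal pinhole camera with horizontal focal length fx and
    principal point cx, both in pixels (illustrative parameters).
    """
    return math.atan2(u - cx, fx)


def ray_to_bev_cells(angle, max_depth, cell_size, num_samples):
    """Return the BEV grid cells (x_index, z_index) crossed by one ray.

    The ray starts at the camera and is sampled at num_samples forward
    depths up to max_depth; cells are listed from near to far, so a whole
    image column corresponds to one ordered sequence of map cells.
    """
    cells = []
    for i in range(1, num_samples + 1):
        z = max_depth * i / num_samples   # forward distance along the optical axis
        x = z * math.tan(angle)           # lateral offset at that distance
        cell = (math.floor(x / cell_size), math.floor(z / cell_size))
        if not cells or cells[-1] != cell:
            cells.append(cell)            # skip consecutive duplicates
    return cells


# The centre column looks straight ahead, so its ray stays in lateral cell 0.
centre_angle = column_to_ray_angle(u=320, fx=500.0, cx=320.0)
centre_cells = ray_to_bev_cells(centre_angle, max_depth=10.0, cell_size=1.0, num_samples=10)
```

Under this view, translating an image into a map amounts to translating each column's pixel sequence into the sequence of cells along its ray, which is the sequence-to-sequence formulation the abstract describes.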

Authors (4)

Avishkar Saha

Oscar Alejandro Mendez Maldonado

Chris Russell

R. Bowden

Citation Format

Saha, A., Maldonado, O.A.M., Russell, C., Bowden, R. (2021). Translating Images into Maps. https://doi.org/10.1109/icra46639.2022.9811901

Journal Information
Year Published
2021
Language
en
Total Citations
165×
Source Database
Semantic Scholar
DOI
10.1109/icra46639.2022.9811901
Access
Open Access ✓