
Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories

Fábio Vital, Miguel Vasco, Alberto Sardinha, Francisco Melo

Abstract

We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information from different modalities (e.g., visual or auditory), corresponding to a sequence of instructions, to an adequate sequence of movements to be executed by a robot. In the first stage, we perceive and pre-process the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage, we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage, we convert the multimodal latent values into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution on a robotic platform. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives a word as input through different perceptual modalities (e.g., image, sound) and generates the corresponding motion trajectory to write it, creating coherent and readable handwritten words.
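
The third stage's output representation, the dynamic movement primitive (DMP), is a standard trajectory encoding. As a concrete illustration, the sketch below rolls out a minimal one-dimensional discrete DMP in the common Ijspeert-style formulation; the gains, basis-function layout, and random weights are illustrative assumptions, not values from the paper.

```python
# A minimal sketch of a one-dimensional discrete Dynamic Movement
# Primitive (DMP), the trajectory representation used in stage three.
# All gains and weights here are placeholders, not the paper's values.
import numpy as np

def rollout_dmp(y0, g, weights, tau=1.0, dt=0.01, alpha_z=25.0, alpha_x=4.0):
    """Integrate one DMP dimension from start y0 toward goal g."""
    beta_z = alpha_z / 4.0                        # critically damped spring
    n_basis = len(weights)
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
    widths = n_basis ** 1.5 / centers / alpha_x   # common width heuristic
    y, z, x = float(y0), 0.0, 1.0                 # position, velocity, phase
    trajectory = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)                # RBF activations
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (g - y0)  # forcing term
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)            # canonical system decays to 0
        trajectory.append(y)
    return np.array(trajectory)

# Example: one pen-axis stroke from 0 to 1 shaped by random forcing weights.
stroke = rollout_dmp(y0=0.0, g=1.0, weights=50.0 * np.random.randn(10))
```

In the handwriting task, one such rollout per pen axis would trace a character, with the forcing weights decoded from the multimodal latent values rather than sampled at random; successive characters would then be chained into the single word-level primitive the abstract describes.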


Citation Format

Vital, F., Vasco, M., Sardinha, A., & Melo, F. (2022). Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories. arXiv preprint arXiv:2204.03051. https://arxiv.org/abs/2204.03051

Journal Information
Publication Year: 2022
Language: English
Source Database: arXiv
Access: Open Access ✓