
Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories

Fábio Vital, Miguel Vasco, Alberto Sardinha, Francisco Melo

Abstract

We present Perceive-Represent-Generate (PRG), a novel three-stage framework that maps perceptual information from different modalities (e.g., visual or auditory), corresponding to a sequence of instructions, to an adequate sequence of movements to be executed by a robot. In the first stage, we perceive and pre-process the given inputs, isolating individual commands from the complete instruction provided by a human user. In the second stage, we encode the individual commands into a multimodal latent space, employing a deep generative model. Finally, in the third stage, we convert the multimodal latent values into individual trajectories and combine them into a single dynamic movement primitive, allowing its execution on a robotic platform. We evaluate our pipeline in the context of a novel robotic handwriting task, where the robot receives a word as input through different perceptual modalities (e.g., image, sound) and generates the corresponding motion trajectory to write it, creating coherent and readable handwritten words.
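
The third stage's output representation, the dynamic movement primitive (DMP), is a standard trajectory encoding. As a concrete illustration, the sketch below rolls out a minimal one-dimensional discrete DMP in the common Ijspeert-style formulation; the gains, basis-function layout, and random weights are illustrative assumptions, not values from the paper.

```python
# A minimal sketch of a one-dimensional discrete Dynamic Movement
# Primitive (DMP), the trajectory representation used in stage three.
# All gains and weights here are placeholders, not the paper's values.
import numpy as np

def rollout_dmp(y0, g, weights, tau=1.0, dt=0.01, alpha_z=25.0, alpha_x=4.0):
    """Integrate one DMP dimension from start y0 toward goal g."""
    beta_z = alpha_z / 4.0                        # critically damped spring
    n_basis = len(weights)
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
    widths = n_basis ** 1.5 / centers / alpha_x   # common width heuristic
    y, z, x = float(y0), 0.0, 1.0                 # position, velocity, phase
    trajectory = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)                # RBF activations
        f = (psi @ weights) / (psi.sum() + 1e-10) * x * (g - y0)  # forcing term
        z += dt / tau * (alpha_z * (beta_z * (g - y) - z) + f)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)            # canonical system decays to 0
        trajectory.append(y)
    return np.array(trajectory)

# Example: one pen-axis stroke from 0 to 1 shaped by random forcing weights.
stroke = rollout_dmp(y0=0.0, g=1.0, weights=50.0 * np.random.randn(10))
```

In the handwriting task, one such rollout per pen axis would trace a character, with the forcing weights decoded from the multimodal latent values rather than sampled at random; successive characters would then be chained into the single word-level primitive the abstract describes.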


Citation Format

Vital, F., Vasco, M., Sardinha, A., & Melo, F. (2022). Perceive, Represent, Generate: Translating Multimodal Information to Robotic Motion Trajectories. arXiv preprint arXiv:2204.03051. https://arxiv.org/abs/2204.03051

Journal Information
Publication Year: 2022
Language: English
Source Database: arXiv
Access: Open Access ✓