arXiv Open Access 2023

Activity Grammars for Temporal Action Segmentation

Dayoung Gong, Joonseok Lee, Deunsol Jung, Suha Kwak, Minsu Cho

Abstract

Sequence prediction on temporal data requires the ability to understand compositional structures of multi-level semantics beyond individual and contextual properties. The task of temporal action segmentation, which aims at translating an untrimmed activity video into a sequence of action segments, remains challenging for this reason. This paper addresses the problem by introducing an effective activity grammar to guide neural predictions for temporal action segmentation. We propose a novel grammar induction algorithm that extracts a powerful context-free grammar from action sequence data. We also develop an efficient generalized parser that transforms frame-level probability distributions into a reliable sequence of actions according to the induced grammar with recursive rules. Our approach can be combined with any neural network for temporal action segmentation to enhance the sequence prediction and discover its compositional structure. Experimental results demonstrate that our method significantly improves temporal action segmentation in terms of both performance and interpretability on two standard benchmarks, Breakfast and 50 Salads.
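To make the idea of grammar-guided decoding concrete, the following is a minimal sketch and not the authors' induction algorithm or parser. It assumes a hand-written toy grammar (the names `GRAMMAR`, `expand`, `align`, and `decode`, the three actions, and all probabilities are hypothetical), enumerates the action sequences the grammar allows up to a length bound, and uses dynamic programming to align each candidate with frame-level log-probabilities. Brute-force enumeration stands in for the efficient generalized parser described in the abstract.

```python
import math

# Hypothetical toy grammar: each nonterminal maps to its alternative right-hand sides.
# Any symbol that is not a key of GRAMMAR is treated as a terminal action label.
GRAMMAR = {
    "S": [["pour", "R", "serve"]],
    "R": [["stir"], ["stir", "R"]],  # recursive rule: one or more "stir" segments
}

def expand(symbols, max_len):
    """Yield every terminal sequence derivable from `symbols` with at most max_len symbols."""
    if len(symbols) > max_len:
        return  # prune: this grammar has no empty rules, so sequences never shrink
    for i, sym in enumerate(symbols):
        if sym in GRAMMAR:  # expand the leftmost nonterminal
            for rhs in GRAMMAR[sym]:
                yield from expand(symbols[:i] + rhs + symbols[i + 1:], max_len)
            return
    yield tuple(symbols)

def align(log_probs, seq):
    """Best score and frame labels when frames are cut into len(seq) ordered, non-empty segments."""
    T, N = len(log_probs), len(seq)
    if N > T:
        return -math.inf, []
    # score[t][n]: best total log-probability of frames 0..t with frame t labelled seq[n]
    score = [[-math.inf] * N for _ in range(T)]
    back = [[0] * N for _ in range(T)]
    score[0][0] = log_probs[0][seq[0]]
    for t in range(1, T):
        for n in range(min(t + 1, N)):
            stay = score[t - 1][n]                               # remain in segment n
            move = score[t - 1][n - 1] if n > 0 else -math.inf   # start segment n
            best, back[t][n] = (move, n - 1) if move > stay else (stay, n)
            if best > -math.inf:
                score[t][n] = best + log_probs[t][seq[n]]
    labels, n = [], N - 1
    for t in range(T - 1, -1, -1):  # backtrack from the last frame
        labels.append(seq[n])
        n = back[t][n]
    labels.reverse()
    return score[T - 1][N - 1], labels

def decode(log_probs, max_len=6):
    """Return (score, frame_labels, action_sequence) of the best grammar-consistent candidate."""
    return max((align(log_probs, seq) + (seq,) for seq in expand(["S"], max_len)),
               key=lambda result: result[0])

# Made-up frame-level probabilities; frame 2 has a spurious peak for "serve".
ACTIONS = ("pour", "stir", "serve")
probs = [(.8, .1, .1), (.7, .2, .1), (.2, .3, .5), (.1, .8, .1), (.1, .7, .2), (.1, .1, .8)]
log_probs = [dict(zip(ACTIONS, map(math.log, p))) for p in probs]

framewise = [max(frame, key=frame.get) for frame in log_probs]
score, labels, seq = decode(log_probs)
print("frame-wise argmax:", framewise)
print("grammar-guided:   ", labels)
```

In this toy input, per-frame argmax labels frame 2 as "serve" before any stirring has happened, which the grammar forbids. The constrained decoder instead assigns that frame to "stir". Enumerating candidates is exponential in general, which is why a real system would need a proper parser rather than this exhaustive search.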

Citation

Gong, D., Lee, J., Jung, D., Kwak, S., Cho, M. (2023). Activity Grammars for Temporal Action Segmentation. https://arxiv.org/abs/2312.04266

Journal Information
Year Published: 2023
Language: en
Database Source: arXiv
Access: Open Access