
Semantic-E2VID: a Semantic-Enriched Paradigm for Event-to-Video Reconstruction

Jingqian Wu Yunbo Jia Shengpeng Xu Edmund Y. Lam

Abstract

Event cameras provide a promising sensing modality for high-speed and high-dynamic-range vision by asynchronously capturing brightness changes. A fundamental task in event-based vision is event-to-video (E2V) reconstruction, which aims to recover intensity videos from event streams. Most existing E2V approaches formulate reconstruction as a temporal-spatial signal recovery problem, relying on temporal aggregation and spatial feature learning to infer intensity frames. While effective to some extent, this formulation overlooks a critical limitation of event data: due to the change-driven sensing mechanism, event streams are inherently semantically under-determined, lacking the object-level structure and contextual information that are essential for faithful reconstruction. In this work, we revisit E2V from a semantic perspective and argue that effective reconstruction requires going beyond temporal and spatial modeling to explicitly account for missing semantic information. Based on this insight, we propose Semantic-E2VID, a semantic-enriched end-to-end E2V framework that reformulates reconstruction as a process of semantic learning, fusion, and decoding. Our approach first performs semantic abstraction by bridging event representations with semantics extracted from a pretrained Segment Anything Model (SAM), while avoiding modality-induced feature drift. The learned semantics are then fused into the event latent space in a representation-compatible manner, enabling event features to capture object-level structure and contextual cues. Furthermore, semantic-aware supervision is introduced to explicitly guide the reconstruction process toward semantically meaningful regions, complementing conventional pixel-level and temporal objectives. Extensive experiments on six public benchmarks demonstrate that Semantic-E2VID consistently outperforms state-of-the-art E2V methods.
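To make the three-stage pipeline described above concrete, the following is a minimal PyTorch-style sketch of one plausible instantiation: an event encoder over a voxel-grid event representation, a fusion module that injects SAM-derived semantic features into the event latent space, and a decoder that produces an intensity frame. All module names, tensor shapes, and the concatenation-based fusion strategy are illustrative assumptions, not the authors' implementation, and the semantic-aware supervision term is omitted.

```python
# Illustrative sketch of a semantic-enriched E2V pipeline (hypothetical modules,
# not the authors' code). Semantic features stand in for a frozen SAM image encoder.
import torch
import torch.nn as nn
import torch.nn.functional as F


class EventEncoder(nn.Module):
    """Encode a voxel-grid event tensor (B, bins, H, W) into latent features."""
    def __init__(self, num_bins=5, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(num_bins, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, events):
        return self.net(events)


class SemanticFusion(nn.Module):
    """Fuse semantic features (e.g. distilled from a frozen SAM encoder)
    into the event latent space by channel alignment and concatenation."""
    def __init__(self, dim=64, sem_dim=256):
        super().__init__()
        self.proj = nn.Conv2d(sem_dim, dim, 1)            # align channel dimensions
        self.fuse = nn.Conv2d(2 * dim, dim, 3, padding=1)

    def forward(self, event_feat, sem_feat):
        sem_feat = F.interpolate(self.proj(sem_feat), size=event_feat.shape[-2:],
                                 mode="bilinear", align_corners=False)
        return self.fuse(torch.cat([event_feat, sem_feat], dim=1))


class FrameDecoder(nn.Module):
    """Decode fused features into an intensity frame in [0, 1]."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(dim, dim, 4, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.ConvTranspose2d(dim, 1, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, feat):
        return self.net(feat)


if __name__ == "__main__":
    events = torch.randn(2, 5, 128, 128)      # event voxel grid (B, bins, H, W)
    sem_feat = torch.randn(2, 256, 32, 32)    # stand-in for SAM image-encoder features

    encoder, fusion, decoder = EventEncoder(), SemanticFusion(), FrameDecoder()
    frame = decoder(fusion(encoder(events), sem_feat))
    print(frame.shape)                        # torch.Size([2, 1, 128, 128])
```

In a full training setup, the reconstruction would additionally be supervised with pixel-level, temporal, and semantic-aware losses as described in the abstract; only the forward data flow is sketched here.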

Topics & Keywords

Authors (4)


Jingqian Wu


Yunbo Jia


Shengpeng Xu


Edmund Y. Lam

Citation Format

Wu, J., Jia, Y., Xu, S., & Lam, E. Y. (2025). Semantic-E2VID: a Semantic-Enriched Paradigm for Event-to-Video Reconstruction. https://arxiv.org/abs/2510.17347

Journal Information
Publication Year
2025
Language
en
Source Database
arXiv
Access
Open Access ✓