Region-Aware and Temporally Consistent Human-Object Contact Detection in Video
Abstract
We propose a deep-learning method for estimating human-object contact in video, in which images with expanded human regions are fed to the model for more efficient feature extraction. Estimating contact reliably requires removing irrelevant image content and directing the model’s attention to the relevant human regions. To this end, we detect the bounding boxes of human regions with an object detector and feed the model images in which these regions are expanded, which reduces background information and lets the model focus on human body parts. We further extend the input from still images to video sequences and introduce a new loss function that imposes constraints along the temporal dimension, enabling the model to learn temporal information effectively. Together, these changes to the model’s input and loss function aim to improve the accuracy of human-object contact estimation.
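The two ideas in the abstract, region expansion and a temporal constraint, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the expansion rule or the form of the loss. The sketch assumes that "expanding" a region means scaling the detected box about its center and clipping it to the image, and it substitutes a generic smoothness penalty (mean squared change between consecutive frames) for the paper's temporal loss. The names `expand_box`, `temporal_smoothness_penalty`, and the `scale` parameter are hypothetical.

```python
def expand_box(box, image_size, scale=1.2):
    """Scale an (x1, y1, x2, y2) box about its center and clip it to the image.

    Assumption: this is one plausible reading of "expanded human regions";
    the abstract does not specify the actual expansion rule.
    """
    x1, y1, x2, y2 = box
    width, height = image_size
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return (
        max(0.0, cx - half_w),
        max(0.0, cy - half_h),
        min(float(width), cx + half_w),
        min(float(height), cy + half_h),
    )


def temporal_smoothness_penalty(contact_probs):
    """Mean squared change between consecutive per-frame predictions.

    contact_probs: list of frames, each a list of contact probabilities
    (for example, one value per body part). This is a stand-in for the
    paper's temporal loss, whose exact form is not given in the abstract.
    """
    if len(contact_probs) < 2:
        return 0.0  # a single frame has no temporal neighbor to compare with
    total, count = 0.0, 0
    for prev, curr in zip(contact_probs, contact_probs[1:]):
        for p, c in zip(prev, curr):
            total += (c - p) ** 2
            count += 1
    return total / count if count else 0.0
```

In a training setup of this kind, a penalty like this would typically be added to the per-frame contact loss with a weighting factor, so that predictions that flicker between adjacent frames cost more than predictions that change smoothly.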
Authors (2)
Kaito Kira
Koichi Ichige
- Year published: 2026
- Source database: DOAJ
- DOI: 10.1109/ACCESS.2026.3663594
- Access: Open Access