arXiv Open Access 2026

Implicit-Scale 3D Reconstruction for Multi-Food Volume Estimation from Monocular Images

Yuhao Chen, Gautham Vinod, Siddeshwar Raghavan, Talha Ibn Mahmud, Bruce Coburn, +3 others

Abstract

We present Implicit-Scale 3D Reconstruction from Monocular Multi-Food Images, a benchmark dataset designed to advance geometry-based food portion estimation in realistic dining scenarios. Existing dietary assessment methods largely rely on single-image analysis or appearance-based inference, including recent vision-language models, which lack explicit geometric reasoning and are sensitive to scale ambiguity. This benchmark reframes food portion estimation as an implicit-scale 3D reconstruction problem under monocular observations. To reflect real-world conditions, explicit physical references and metric annotations are removed; instead, contextual objects such as plates and utensils are provided, requiring algorithms to infer scale from implicit cues and prior knowledge. The dataset emphasizes multi-food scenes with diverse object geometries, frequent occlusions, and complex spatial arrangements. The benchmark was adopted as a challenge at the MetaFood 2025 Workshop, where multiple teams proposed reconstruction-based solutions. Experimental results show that while strong vision-language baselines achieve competitive performance, geometry-based reconstruction methods provide both improved accuracy and greater robustness, with the top-performing approach achieving 0.21 MAPE in volume estimation and 5.7 L1 Chamfer Distance in geometric accuracy.
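The two metrics named in the abstract can be illustrated with a minimal sketch. This is not the benchmark's evaluation code: the function names `mape` and `chamfer_l1` are hypothetical, and the Chamfer definition used here (symmetric mean of unsquared nearest-neighbor Euclidean distances, averaged over both directions) is an assumption, since the abstract does not state the exact formula, units, or point-sampling procedure.

```python
import numpy as np


def mape(predicted, actual):
    """Mean absolute percentage error, returned as a fraction (0.21 = 21%)."""
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)
    # Relative error per item, averaged over all items.
    return float(np.mean(np.abs(predicted - actual) / np.abs(actual)))


def chamfer_l1(points_a, points_b):
    """Symmetric Chamfer distance between two (N, 3) and (M, 3) point sets.

    Assumed definition: average of the two directed mean nearest-neighbor
    distances, using unsquared Euclidean distance.
    """
    a = np.asarray(points_a, dtype=float)
    b = np.asarray(points_b, dtype=float)
    # Pairwise distance matrix of shape (N, M); brute force, fine for small clouds.
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    a_to_b = dists.min(axis=1).mean()  # each point in a to its nearest in b
    b_to_a = dists.min(axis=0).mean()  # each point in b to its nearest in a
    return float(0.5 * (a_to_b + b_to_a))


if __name__ == "__main__":
    # Made-up volumes for two food items, purely for illustration.
    print(mape([110.0, 90.0], [100.0, 100.0]))

    # Two tiny point clouds offset by a 3-4-5 triangle.
    print(chamfer_l1([[0.0, 0.0, 0.0]], [[3.0, 4.0, 0.0]]))
```

For large reconstructed meshes, the brute-force distance matrix would likely be replaced by a spatial index such as a k-d tree; the sketch favors readability over speed.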

Authors (8)

Yuhao Chen

Gautham Vinod

Siddeshwar Raghavan

Talha Ibn Mahmud

Bruce Coburn

Jinge Ma

Fengqing Zhu

Jiangpeng He

Citation Format

Chen, Y., Vinod, G., Raghavan, S., Mahmud, T.I., Coburn, B., Ma, J. et al. (2026). Implicit-Scale 3D Reconstruction for Multi-Food Volume Estimation from Monocular Images. https://arxiv.org/abs/2602.13041

Journal Information
Publication Year: 2026
Language: en
Source Database: arXiv
Access: Open Access