Self-Supervised Depth Estimation and 3D Reconstruction With Layer-Wise LoRA of Foundation Model in Endoscopy
Abstract
Depth estimation is crucial for 3D reconstruction and surgical navigation, providing critical insights for endoscopic procedures. While foundation models excel in depth estimation for natural images, their performance in the medical domain remains limited, particularly under challenging conditions like brightness fluctuations. This study develops a robust self-supervised framework for monocular depth estimation to address these challenges. We introduce a layer-wise low-rank adaptation (LW-LoRA) of the Depth-Anything-V2 foundation model, tailored for endoscopic data. Unlike conventional fine-tuning, LW-LoRA assigns an empirically determined rank vector across the encoder layers for efficient training. The method integrates residual convolutional blocks (ResConv) to capture fine-grained details and a multi-head attention-based pose network to enhance camera pose estimation, ensuring accurate 3D reconstructions. A multi-scale SSIM (MS-SSIM) reprojection loss refines depth predictions, while a brightness calibration module ensures robustness against illumination inconsistencies. During training, the backbone encoder is frozen, optimizing only the LoRA layers for efficiency. Extensive evaluations on the SCARED dataset highlight the superior performance of our framework, offering faster inference and high-quality depth maps. Zero-shot testing on Hamlyn and clinical datasets confirms its generalization across diverse data types. Our framework efficiently adapts the foundation model for depth estimation in the medical domain, addressing challenges in endoscopic imaging, such as brightness variations and fine-detail preservation. It enables accurate, dense 3D point cloud reconstructions, ensuring reliable performance in clinical settings.
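The layer-wise rank assignment described in the abstract can be illustrated with a minimal sketch. It assumes the standard LoRA formulation, in which a frozen weight W is augmented by a trainable low-rank update (alpha / r) * B * A, and it gives each layer its own rank. The class name `LoRALinear`, the rank values in `RANKS`, the toy hidden size `D`, and the scaling `alpha` are all hypothetical choices for illustration. They are not taken from the paper, whose actual rank vector and encoder dimensions are not stated in the abstract.

```python
import random

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Frozen "pretrained" weight (random here, purely for illustration).
        self.W = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors: A is random, B starts at zero,
        # so the low-rank update contributes nothing at initialization.
        self.A = [[rng.gauss(0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        delta = matvec(self.B, matvec(self.A, x))   # low-rank path
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        # Only A and B are optimized; W stays frozen.
        return sum(len(r) for r in self.A) + sum(len(r) for r in self.B)

# Hypothetical rank vector: one rank per encoder layer (values are made up).
RANKS = [2, 2, 4, 8]
D = 16  # toy hidden size, not the real model's

layers = [LoRALinear(D, D, r, seed=i) for i, r in enumerate(RANKS)]

lora_params = sum(layer.trainable_params() for layer in layers)
full_params = len(layers) * D * D  # cost of fine-tuning every W directly

print(f"LoRA trainable parameters: {lora_params}")
print(f"Full fine-tuning parameters: {full_params}")
```

Each layer contributes r * (d_in + d_out) trainable parameters, so a per-layer rank vector lets capacity be concentrated in the layers that appear to need it while the rest stay cheap. How the ranks are selected in the paper is described only as "empirically determined".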
Authors (4)
Saad Khalil
Sol Kim
Bo-In Lee
Youngbae Hwang
- Publication Year
- 2025
- Source Database
- DOAJ
- DOI
- 10.1109/ACCESS.2025.3617567
- Access
- Open Access ✓