Monocular Depth Estimation for Autonomous Devices
Keywords: Depth Estimation, Autonomous Devices, Machine Learning, ControlNet, Semantic Supervision
International Journal of Engineering and Techniques – Volume 10 Issue 3, June 2024
T. Sai Prasad Reddy1, V. Chaitanya2
1Associate Professor, Department of Computer Science & Engineering, Geethanjali Institute of Science and Technology, Gangavaram, Andhra Pradesh, India.
2Assistant Professor, Department of Computer Science & Engineering, Geethanjali Institute of Science and Technology, Gangavaram, Andhra Pradesh, India.
Abstract
This work presents Depth Anything, a highly practical solution for robust monocular depth estimation. Rather than pursuing novel technical modules, the aim is to build a powerful foundation model that handles images under diverse conditions. A data engine is designed to collect and automatically annotate large-scale unlabeled data (~62M images), significantly enlarging data coverage and reducing the generalization error. Two strategies make this data scaling effective: data augmentation that pushes the model to acquire robust visual knowledge, and auxiliary supervision that inherits rich semantic priors from pre-trained encoders. Evaluations on six public datasets and randomly captured photos demonstrate strong zero-shot generalization. Fine-tuning on the NYUv2 and KITTI datasets sets new state-of-the-art (SOTA) results, and the improved depth model in turn yields a superior depth-conditioned ControlNet.
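To make the two data-scaling strategies above concrete, the following is a minimal PyTorch-style sketch rather than the authors' implementation. It assumes a frozen teacher model that pseudo-labels clean unlabeled images, a student trained on strongly augmented views with a scale-and-shift-invariant depth loss, and an auxiliary feature-alignment term against a frozen pre-trained semantic encoder. All function and parameter names (affine_invariant_depth_loss, strong_aug, align_weight) are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of the two data-scaling strategies
# described in the abstract: (1) training the student on strongly augmented
# unlabeled images against a teacher's pseudo-depth, and (2) an auxiliary
# feature-alignment loss that transfers semantic priors from a frozen
# pre-trained encoder. Module names, shapes, and weights are illustrative.

import torch
import torch.nn.functional as F

def affine_invariant_depth_loss(pred, target, eps=1e-6):
    """Scale-and-shift-invariant loss, a common choice for relative depth."""
    def normalize(d):
        t = d.median(dim=-1, keepdim=True).values
        s = (d - t).abs().mean(dim=-1, keepdim=True)
        return (d - t) / (s + eps)
    return (normalize(pred.flatten(1)) - normalize(target.flatten(1))).abs().mean()

def training_step(student, teacher, frozen_encoder, unlabeled_images, strong_aug,
                  align_weight=0.1):
    # 1) The teacher annotates the clean image; the student sees a harder,
    #    strongly augmented view (e.g. color jitter, CutMix-style mixing).
    with torch.no_grad():
        pseudo_depth = teacher(unlabeled_images)        # (B, H, W) relative depth
        sem_feats = frozen_encoder(unlabeled_images)    # (B, N, C) semantic features
    augmented = strong_aug(unlabeled_images)
    # Assumes the student returns both its depth map and intermediate features,
    # already projected to the frozen encoder's feature dimension.
    pred_depth, student_feats = student(augmented)

    depth_loss = affine_invariant_depth_loss(pred_depth, pseudo_depth)

    # 2) Auxiliary semantic supervision: align student features with the
    #    frozen encoder's features via cosine similarity.
    align_loss = 1.0 - F.cosine_similarity(
        student_feats.flatten(1), sem_feats.flatten(1), dim=-1).mean()

    return depth_loss + align_weight * align_loss
```

The weighting between the depth term and the alignment term is a design choice; too strong an alignment term would pull the student toward purely semantic features at the expense of depth accuracy.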
Reddy, T.S.P., Chaitanya, V., “Monocular Depth Estimation for Autonomous Devices,” International Journal of Engineering and Techniques, Volume 10, Issue 3, June 2024. ISSN 2395-1303