Source
Optical Memory and Neural Networks
Date of publication
07/04/2024
Authors
Dmitry Yudin, Youshaa Murhij

DAGM-Mono: Deformable Attention-Guided Modeling for Monocular 3D Reconstruction

Abstract

Accurate 3D pose estimation and shape reconstruction from monocular images is a challenging task in the field of autonomous driving. Our work introduces a novel approach to this task for vehicles, called Deformable Attention-Guided Modeling for Monocular 3D Reconstruction (DAGM-Mono). The proposed solution addresses the challenge of detailed shape reconstruction by leveraging deformable attention mechanisms. Specifically, given 2D primitives, DAGM-Mono reconstructs vehicle shapes using deformable attention-guided modeling, considering the relevance between detected objects and vehicle shape priors. Our method introduces two additional loss functions, Chamfer Distance (CD) and Hierarchical Chamfer Distance, which enhance shape reconstruction by capturing fine-grained shape details at different scales. Our bi-contextual deformable attention framework estimates 3D object pose, capturing both inter-object relations and scene context. Experiments on the ApolloCar3D dataset demonstrate that DAGM-Mono achieves state-of-the-art performance and significantly enhances the performance of mature monocular 3D object detectors. Code and data are publicly available at: https://github.com/YoushaaMurhij/DAGM-Mono.
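
To make the Chamfer Distance loss mentioned in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the common symmetric form with squared Euclidean distances averaged over each point set. The function `hierarchical_chamfer_distance` and its `strides` parameter are hypothetical: they only illustrate the general idea of comparing shapes at several resolutions, and the abstract does not specify how the paper builds its hierarchy.

```python
import numpy as np


def chamfer_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3).

    Assumption: squared Euclidean distances, averaged per set.
    """
    # Pairwise squared distances via broadcasting, shape (N, M).
    diff = a[:, None, :] - b[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    # Mean nearest-neighbour distance from a to b, plus from b to a.
    return float(d2.min(axis=1).mean() + d2.min(axis=0).mean())


def hierarchical_chamfer_distance(a: np.ndarray, b: np.ndarray,
                                  strides=(1, 2, 4)) -> float:
    """Hypothetical multi-scale variant: average CD over subsampled copies.

    Each stride keeps every s-th point, giving a coarser view of both shapes.
    """
    return float(np.mean([chamfer_distance(a[::s], b[::s]) for s in strides]))


if __name__ == "__main__":
    predicted = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    target = predicted + np.array([0.0, 0.0, 1.0])  # shifted by 1 along z
    print(chamfer_distance(predicted, target))
    print(hierarchical_chamfer_distance(predicted, target, strides=(1, 2)))
```

In a training setting, the same quantity would typically be computed on tensors in a differentiable framework so that gradients flow back to the predicted vertices. The brute-force pairwise matrix used here is only practical for small point sets.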
