Marigold-DC: Diffusion-Based Depth Completion Improves Robustness and Generalization

Depth Completion with Marigold-DC: A New Approach for Missing Depth Information

Depth maps captured by RGB-D sensors or LiDAR systems are often incomplete. Missing or noisy depth values make the data hard for humans to interpret and hard for downstream computer vision systems to use. Depth completion aims to turn these sparse, incomplete measurements into a dense depth map, using the corresponding color image as guidance.

Traditional methods mostly treat depth completion as an interpolation problem, in which missing depth values are estimated from the image content. A new approach, Marigold-DC, flips this concept and treats the task as image-conditioned depth map generation, guided by sparse measurements.

Marigold-DC: Diffusion for Depth Completion

Marigold-DC is based on a pre-trained latent diffusion model for monocular depth estimation. Diffusion models have proven themselves in image generation and are now increasingly used for other tasks such as depth estimation. At their core, these models generate images through an iterative process that gradually denoises a noisy image until a clear image emerges.
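To make the iterative denoising concrete, here is a minimal Python sketch of a deterministic DDIM-style sampling loop on a short 1-D signal. Everything in it is an illustrative assumption rather than Marigold-DC's implementation: the schedule `alpha_bar`, the function names, and especially `oracle_noise_prediction`, which stands in for the trained network by computing the exact noise from a known clean signal. A real latent diffusion model would predict that noise with a neural network and would operate on image latents rather than a list of numbers.

```python
import math
import random


def alpha_bar(t, num_steps):
    """Hypothetical noise schedule: 1.0 at t=0 (clean), close to 0 at t=num_steps."""
    return math.cos(0.5 * math.pi * 0.99 * t / num_steps) ** 2


def oracle_noise_prediction(x_t, clean, ab_t):
    """Stand-in for the trained denoiser (assumption): recovers the exact noise."""
    return [(x - math.sqrt(ab_t) * c) / math.sqrt(1.0 - ab_t)
            for x, c in zip(x_t, clean)]


def ddim_sample(clean, num_steps=20, seed=0):
    """Start from a heavily noised signal and denoise it step by step."""
    rng = random.Random(seed)
    ab_T = alpha_bar(num_steps, num_steps)
    x = [math.sqrt(ab_T) * c + math.sqrt(1.0 - ab_T) * rng.gauss(0.0, 1.0)
         for c in clean]
    for t in range(num_steps, 0, -1):
        ab_t, ab_prev = alpha_bar(t, num_steps), alpha_bar(t - 1, num_steps)
        eps = oracle_noise_prediction(x, clean, ab_t)
        # Estimate the clean signal implied by the current noisy sample.
        x0_pred = [(xi - math.sqrt(1.0 - ab_t) * e) / math.sqrt(ab_t)
                   for xi, e in zip(x, eps)]
        # Re-noise that estimate to the next, lower noise level.
        x = [math.sqrt(ab_prev) * x0 + math.sqrt(1.0 - ab_prev) * e
             for x0, e in zip(x0_pred, eps)]
    return x
```

Because the noise predictor here is an oracle, the loop recovers the clean signal up to floating-point error. The point of the sketch is the structure of each step: predict the noise, estimate the clean signal, then move to a lower noise level.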

The innovation of Marigold-DC lies in using the sparse depth measurements as guidance during this denoising process. An optimization procedure runs alongside the diffusion model's iterative inference and steers the generated depth map toward the available measurements. This makes effective use of the existing data and leads to more robust and accurate depth completion.
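The interplay between denoising and measurement guidance can be sketched with a toy loop. This is only an illustration under stated assumptions, not the method's actual optimization: `toy_denoise_step` is a hypothetical smoothing prior standing in for the diffusion model, the "depth map" is a 1-D list, and the guidance is a plain gradient step on the squared error at the measured positions. All function names and parameters are invented for this example.

```python
import random


def toy_denoise_step(x, strength=0.5):
    """Hypothetical prior step: pull each value toward the mean of its neighborhood."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[max(i - 1, 0)]
        right = x[min(i + 1, n - 1)]
        local_mean = (left + x[i] + right) / 3.0
        out.append((1.0 - strength) * x[i] + strength * local_mean)
    return out


def guidance_step(x, sparse, lr=0.5):
    """One gradient step on 0.5 * sum((x[i] - d)^2) over the measured indices."""
    out = list(x)
    for i, d in sparse.items():
        out[i] -= lr * (x[i] - d)
    return out


def guided_completion(n, sparse, steps=500, seed=0):
    """Alternate prior steps and guidance steps, starting from random noise."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        x = toy_denoise_step(x)
        x = guidance_step(x, sparse)
    return x


def sparse_error(x, sparse):
    """Largest absolute deviation from the sparse measurements."""
    return max(abs(x[i] - d) for i, d in sparse.items())


# Two measurements on a 10-sample signal; the other eight values are filled in.
sparse = {0: 1.0, 9: 3.0}
dense = guided_completion(10, sparse)
```

In this toy setting the unmeasured values settle onto a smooth ramp between the two measurements. The measured positions end up close to the measurements but not exactly on them, because the smoothing prior and the guidance step pull against each other.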

Zero-Shot Generalization: A Decisive Advantage

A remarkable feature of Marigold-DC is its capacity for zero-shot generalization: the model can be applied to images from environments that it never saw during training. This flexibility is a crucial advantage over conventional methods, which are often tied to specific datasets and lose performance in unfamiliar scenarios.

The zero-shot capability of Marigold-DC stems from the pre-trained diffusion model, which has already learned a broad spectrum of visual information. Steering the model with the sparse depth measurements puts this knowledge to use for depth completion without retraining the model.

A Paradigm Shift in Depth Completion

Marigold-DC represents a paradigm shift in depth completion. Instead of viewing the task as an interpolation problem, it is interpreted as a generation process guided by sparse measurements. This approach allows for more effective use of pre-trained models and leads to improved zero-shot generalization.

The results of Marigold-DC suggest that modern priors for monocular depth estimation make depth completion significantly more robust. It might therefore be more appropriate to consider the task as the restoration of dense depth information from dense image pixels, guided by sparse depth measurements, rather than as inpainting of sparse depth information guided by an image.

Applications and Future Developments

Robust and flexible depth completion with Marigold-DC opens up a wide range of applications in areas such as robotics, autonomous driving, 3D reconstruction, and augmented reality. Its ability to handle even extremely sparse data enables use in scenarios that were inaccessible to conventional methods.

Future research could focus on further improving the accuracy and efficiency of Marigold-DC, as well as on the integration of further sensor data and the development of specialized models for specific use cases.
