NeuralFusion: Online Depth Fusion in Latent Space

CVPR 2021

We present a novel online depth map fusion approach that learns depth map aggregation in a latent feature space. While previous fusion methods use an explicit scene representation such as signed distance functions (SDFs), we propose a learned feature representation for the fusion. The key idea is to separate the scene representation used for fusion from the output scene representation via an additional translator network. Our neural network architecture consists of two main parts: a depth and feature fusion sub-network, followed by a translator sub-network that produces the final surface representation (e.g., TSDF) for visualization or other tasks. Our approach is real-time capable, handles high noise levels, and is particularly effective at handling the gross outliers common in photometric stereo-based depth maps. Experiments on real and synthetic data demonstrate improved results compared to the state of the art, especially in challenging scenarios with large amounts of noise and outliers.
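The two-stage design can be pictured as a per-voxel latent update followed by a decoding step: incoming depth-map features are fused into a latent feature volume, and a translator decodes that volume into a TSDF only when an explicit surface is needed. The sketch below is a minimal illustration in PyTorch under these assumptions; the module names (FusionNet, TranslatorNet), layer sizes, and feature dimension are hypothetical and do not reproduce the authors' released architecture.

```python
# Minimal sketch of a fusion + translator pipeline (assumed design, not the paper's code).
import torch
import torch.nn as nn


class FusionNet(nn.Module):
    """Fuses a new per-view observation into the running latent feature volume."""

    def __init__(self, feat_dim: int = 8, hidden: int = 32):
        super().__init__()
        # Per voxel: concatenate current latent feature with the new observation feature.
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, feat_dim),
        )

    def forward(self, latent: torch.Tensor, observation: torch.Tensor) -> torch.Tensor:
        # latent, observation: (num_voxels, feat_dim) -> updated latent of the same shape
        return self.mlp(torch.cat([latent, observation], dim=-1))


class TranslatorNet(nn.Module):
    """Decodes the fused latent features into an explicit TSDF value per voxel."""

    def __init__(self, feat_dim: int = 8, hidden: int = 32):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Tanh(),  # truncated SDF values in [-1, 1]
        )

    def forward(self, latent: torch.Tensor) -> torch.Tensor:
        return self.mlp(latent).squeeze(-1)


# Toy usage: fuse one new observation into a small flattened voxel grid, then decode a TSDF.
fusion, translator = FusionNet(), TranslatorNet()
latent = torch.zeros(1000, 8)         # latent feature volume (flattened voxels)
observation = torch.randn(1000, 8)    # features extracted from a new depth map
latent = fusion(latent, observation)  # online update happens in latent space
tsdf = translator(latent)             # explicit representation for meshing / visualization
```

The key property this sketch mirrors is that the latent volume, not the TSDF, is what gets updated online; the translator is only applied when an explicit surface representation is required.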