Active Scene Reconstruction

With an RGBD sensor mounted on a robotic arm, we address the problem of autonomous 3D exploration of an indoor environment without a priori geometrical information. The objective is to obtain complete coverage and a 3D reconstruction of an area with as few movements as possible. We cast the problem as the estimation of the Next Best View (NBV) that maximises the coverage of the unknown area, considering only the current observation and the already explored areas. We propose both a heuristics-driven and a data-driven solution that effectively address this problem. This work was sponsored by an industrial project with Omron.

Related publications

Autonomous 3D reconstruction, mapping and exploration of indoor environments with a robotic arm

Y. Wang, S. James, S. E. Konstantina, C. Beltran-Gonzale, Y. Konishi, A. Del Bue, RAL, Nov 2019 [Paper]

In this work, we propose a novel information gain metric that combines hand-crafted and data-driven metrics to address the NBV problem. For the hand-crafted metric, we propose an entropy-based information gain that accounts for the previous viewpoints, both to prevent the camera from revisiting the same location and to promote motion toward unexplored or occluded areas. For the learnt metric, we adopt a Convolutional Neural Network (CNN) architecture and formulate the problem as a classification problem. The CNN takes the current depth image as input and outputs the motion direction expected to reveal the largest unexplored surface. We train and test the CNN using a new synthetic dataset based on the SUNCG dataset. The learnt motion direction is then combined with the proposed hand-crafted metric to resolve situations where the hand-crafted metric alone is ambiguous.
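To make the idea of an entropy-based, history-aware information gain concrete, the following is a minimal sketch and not the metric from the paper. It assumes each candidate view comes with the occupancy probabilities of the voxels it would observe, and it scores a view by the summed Shannon entropy of those voxels. The revisit penalty (`radius`), the way the classifier's direction probabilities are mixed in (`weight`), and all function names are hypothetical choices made for illustration only.

```python
import math


def voxel_entropy(p):
    """Shannon entropy in bits of a voxel with occupancy probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a voxel known to be free or occupied carries no uncertainty
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def view_gain(visible_probs, position, visited, radius=0.5):
    """Summed entropy of the voxels visible from `position`.

    Assumed revisit penalty: the gain is scaled down linearly when the
    candidate lies within `radius` of a previously visited viewpoint.
    """
    gain = sum(voxel_entropy(p) for p in visible_probs)
    if visited:
        nearest = min(math.dist(position, v) for v in visited)
        gain *= min(1.0, nearest / radius)
    return gain


def next_best_view(candidates, visited, direction_probs=None, weight=0.5):
    """Return the index of the highest-scoring candidate.

    Each candidate is a tuple (position, visible_probs, direction).
    `direction_probs` stands in for the output of a direction classifier;
    boosting the entropy gain by it is an assumed combination rule.
    """
    def score(candidate):
        position, visible_probs, direction = candidate
        gain = view_gain(visible_probs, position, visited)
        prior = direction_probs.get(direction, 0.0) if direction_probs else 0.0
        return gain * (1.0 + weight * prior)

    return max(range(len(candidates)), key=lambda i: score(candidates[i]))
```

In this sketch, two candidates that see equally uncertain voxels receive the same entropy gain, and the learnt direction probabilities act as a tie-breaker, which mirrors the role the abstract assigns to the learnt metric.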

Where to explore next? ExHistCNN for history-aware autonomous 3D exploration

Y. Wang, A. Del Bue, ECCV, Glasgow, UK, Aug 2020 [Paper] [Code]

In the ECCV work, we re-formulate the NBV estimation as a classification problem and propose a novel learning-based metric that encodes both the current 3D observation (a depth frame) and the history of the ongoing reconstruction. One of the major contributions of this work is a new representation of the 3D reconstruction history as an auxiliary utility map that is efficiently coupled with the current depth observation. With both pieces of information, we train a light-weight CNN, named ExHistCNN, that estimates the NBV as a set of directions in which the depth sensor would observe the most unexplored area. We perform an extensive evaluation on both synthetic and real room scans, demonstrating that the proposed ExHistCNN approaches the exploration performance of an oracle with complete knowledge of the 3D environment.
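As a rough illustration of coupling the depth frame with a utility map and reading the result as a direction classification, here is a minimal sketch. It is not the ExHistCNN implementation: the network itself is omitted, the per-pixel stacking is only one possible way to combine the two inputs, and the direction labels in `DIRECTIONS` are assumed, since the text above does not list the classes.

```python
import math

# Hypothetical direction classes; the source does not enumerate them.
DIRECTIONS = ("up", "down", "left", "right")


def stack_channels(depth, utility):
    """Pair each depth pixel with the utility-map value at the same pixel.

    Both inputs are H x W nested lists; the result is H x W with a
    (depth, utility) tuple per pixel, i.e. a two-channel image.
    """
    if len(depth) != len(utility) or any(
        len(d_row) != len(u_row) for d_row, u_row in zip(depth, utility)
    ):
        raise ValueError("depth and utility map must share the same shape")
    return [
        [(d, u) for d, u in zip(d_row, u_row)]
        for d_row, u_row in zip(depth, utility)
    ]


def softmax(logits):
    """Numerically stable softmax over a list of raw classifier scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_directions(logits):
    """Sort the direction classes from most to least promising."""
    probs = softmax(logits)
    return sorted(zip(DIRECTIONS, probs), key=lambda item: item[1], reverse=True)
```

A classifier trained on the stacked input would produce one score per direction; `rank_directions` then turns those scores into an ordered list, so the robot can fall back to the second-best direction if the first is unreachable.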