Abstract
This work presents a modular and hierarchical approach to learning policies for exploring 3D environments. Our approach leverages the strengths of both classical and learning-based methods by combining analytical path planners with learned mappers and with global and local policies. The use of learning provides flexibility with respect to input modalities (in the mapper), leverages structural regularities of the world (in the global policies), and provides robustness to errors in state estimation (in the local policies). Using learning within each module retains these benefits, while hierarchical decomposition and modular training allow us to sidestep the high sample complexity associated with training end-to-end policies. Our experiments in visually and physically realistic simulated 3D environments demonstrate the effectiveness of our approach over past learning-based and geometry-based approaches.