Abstract
We present DeepNav, a Convolutional Neural Network
(CNN)-based algorithm for navigating large cities using locally visible street-view images. The DeepNav agent learns
to reach its destination quickly by making the correct navigation decisions at intersections. We collect a large-scale
dataset of street-view images organized in a graph where
nodes are connected by roads. This dataset contains 10 city
graphs and more than 1 million street-view images. We propose 3 supervised learning approaches for the navigation
task and show how A* search in the city graph can be used
to generate supervision labels for training. Our annotation
process is fully automated using publicly available mapping services and requires no human input. We evaluate
the proposed DeepNav models on 4 held-out cities for navigating to 5 different types of destinations. Our algorithms
outperform previous work that uses hand-crafted features
and Support Vector Regression (SVR) [19].
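
To illustrate how A* search on a city graph can produce supervision without human input, the following is a minimal sketch and not the labeling scheme used by DeepNav itself. It assumes a toy graph with four nodes, planar coordinates, and a single destination; the names `a_star`, `label_next_moves`, `GRAPH`, and `COORDS` are hypothetical. For each node it records the neighbor that lies on the shortest path to the destination, which is one plausible form of label for a navigation decision at an intersection.

```python
import heapq
import math

# Toy city graph (assumed for illustration): node -> [(neighbor, road length)].
GRAPH = {
    "A": [("B", 1.0), ("D", 1.0)],
    "B": [("A", 1.0), ("C", 1.0)],
    "C": [("B", 1.0), ("D", 2.5)],
    "D": [("A", 1.0), ("C", 2.5)],
}
# Planar node positions, used only by the straight-line heuristic.
COORDS = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0), "D": (0.0, 1.0)}


def a_star(graph, coords, start, goal):
    """Return the shortest path from start to goal as a list of nodes, or None."""

    def heuristic(node):
        # Straight-line distance never overestimates road distance here.
        (x1, y1), (x2, y2) = coords[node], coords[goal]
        return math.hypot(x1 - x2, y1 - y2)

    frontier = [(heuristic(start), 0.0, start)]
    best_cost = {start: 0.0}
    parent = {start: None}
    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if cost > best_cost[node]:
            continue  # stale queue entry, a cheaper route was already found
        for neighbor, length in graph[node]:
            new_cost = cost + length
            if new_cost < best_cost.get(neighbor, math.inf):
                best_cost[neighbor] = new_cost
                parent[neighbor] = node
                heapq.heappush(
                    frontier, (new_cost + heuristic(neighbor), new_cost, neighbor)
                )
    return None


def label_next_moves(graph, coords, goal):
    """Label every non-goal node with the next node on its shortest path."""
    labels = {}
    for node in graph:
        if node == goal:
            continue
        path = a_star(graph, coords, node, goal)
        if path is not None:
            labels[node] = path[1]
    return labels


labels = label_next_moves(GRAPH, COORDS, "C")
```

In this sketch each label is a discrete next move. Running one search per node is wasteful on a large graph, where a single shortest-path pass outward from the destination would yield the same labels; the per-node form is kept here only because it mirrors the description most directly.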