Abstract
Developing visual perception models for active agents
and sensorimotor control in the physical world is cumbersome, as existing algorithms are too slow to learn efficiently
in real time and robots are fragile and costly. This
has given rise to learning in simulation, which in turn
raises the question of whether the results transfer to the real world. In this paper, we investigate developing real-world
perception for active agents, propose Gibson Environment
for this purpose, and showcase a set of perceptual tasks
learned therein. Gibson is based upon virtualizing real
spaces, rather than artificially designed ones, and currently
includes over 1400 floor spaces from 572 full buildings.
The main characteristics of Gibson are: I. being from the
real world and reflecting its semantic complexity, II. having an internal synthesis mechanism, “Goggles”, that enables
deploying trained models in the real world without needing
domain adaptation, and III. embodying agents and making
them subject to the constraints of physics and space.