Abstract
There is growing interest in artificial intelligence to build
socially intelligent robots. This requires machines to have
the ability to “read” people’s emotions, motivations, and
other factors that affect behavior. Towards this goal, we introduce a novel dataset called MovieGraphs which provides
detailed, graph-based annotations of social situations depicted in movie clips. Each graph consists of several types of
nodes, to capture who is present in the clip, their emotional
and physical attributes, their relationships (e.g., parent/child),
and the interactions between them. Most interactions are
associated with topics that provide additional details, and
reasons that give motivations for actions. In addition, most
interactions and many attributes are grounded in the video
with time stamps. We provide a thorough analysis of our
dataset, showing interesting common-sense correlations between different social aspects of scenes, as well as across motivations and other factors that affect behavior. Furthermore, social intelligence requires understanding social and cultural norms,
and being aware of the implications of one’s actions. The
increasing interest in social chat bots and personal assistants [1, 4, 18, 22, 27, 42] points to the importance of teaching artificial agents to understand the subtleties of human
social interactions.