Abstract
Our everyday objects support various tasks and can be used by people for different purposes. While object classification is a widely studied topic in computer vision, recognition of object function, i.e., what people can do with an object and how they do it, is rarely addressed. In this paper we construct a functional object description with the aim to recognize objects by the way people interact with them. We describe scene objects (sofas, tables, chairs) by associated human poses and object appearance. Our model is learned discriminatively from automatically estimated body poses in many realistic scenes. In particular, we make use of time-lapse videos from YouTube providing a rich source of common human-object interactions and minimizing the effort of manual object annotation. We show how the models learned from human observations significantly improve object recognition and enable prediction of characteristic human poses in new scenes. Results are shown on a dataset of more than 400,000 frames obtained from 146 time-lapse videos of challenging and realistic indoor scenes.