Abstract
We present a new approach to modeling and processing multimedia data. This approach is based on graphical models that combine audio and video variables. We demonstrate it by developing a new algorithm for tracking a moving object in a cluttered, noisy scene using two microphones and a camera. Our model uses unobserved variables to describe the data in terms of the process that generates them. It is therefore able to capture and exploit the statistical structure of the audio and video data separately, as well as their mutual dependencies. Model parameters are learned from data via an EM algorithm, and automatic calibration is performed as part of this procedure. Tracking is done by Bayesian inference of the object location from data. We demonstrate successful performance on multimedia clips captured in real-world scenarios using off-the-shelf equipment.