Rate-Invariant Analysis of Trajectories on Riemannian Manifolds with Application in Visual Speech Recognition

2019-12-11

Abstract

In statistical analysis of video sequences for speech recognition, and more generally activity recognition, it is natural to treat temporal evolutions of features as trajectories on Riemannian manifolds. However, different evolution patterns result in arbitrary parameterizations of these trajectories. We investigate a recent framework from the statistics literature [15] that handles this nuisance variability using a cost function/distance for temporal registration and for statistical summarization and modeling of trajectories. It is based on a mathematical representation of trajectories, termed the transported square-root vector field (TSRVF), and the L² norm on the space of TSRVFs. We apply this framework to the problem of speech recognition using both audio and visual components. In each case, we extract features, form trajectories on the corresponding manifolds, and compute parameterization-invariant distances using TSRVFs for speech classification. On the OuluVS database, the classification performance under this metric increases significantly, by nearly 100% under both modalities and for all choices of features. We obtained speaker-dependent classification rates of 70% and 96% for the visual and audio components, respectively.
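
To make the idea of a rate-invariant distance concrete, the following is a minimal sketch under a strong simplifying assumption: the trajectories live in flat Euclidean space, where parallel transport is the identity and the TSRVF reduces to the ordinary square-root velocity field, velocity divided by the square root of speed. The helper names (`srvf`, `l2_distance`), the two test curves, and the warping function `gamma` are illustrative choices, not taken from the paper. The sketch only checks that the L² distance between these representations is unchanged when both curves are re-timed by the same warp; it does not perform the temporal registration step or handle curved manifolds.

```python
import numpy as np

def srvf(curve, t):
    """Square-root velocity field of a curve sampled at times t.

    Assumption: the curve lies in flat R^d, so no parallel transport
    is needed (the Euclidean special case of a TSRVF).
    """
    velocity = np.gradient(curve, t, axis=0)       # finite-difference derivative
    speed = np.linalg.norm(velocity, axis=1)
    # Clamp the speed to avoid division by zero at stationary points.
    return velocity / np.sqrt(np.maximum(speed, 1e-12))[:, None]

def l2_distance(q1, q2, t):
    """L2 distance between two sampled vector fields (trapezoidal rule)."""
    integrand = np.sum((q1 - q2) ** 2, axis=1)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(t))
    return np.sqrt(integral)

# Two hypothetical planar trajectories on the time interval [0, 1].
def alpha(s):
    return np.stack([np.cos(np.pi * s), np.sin(np.pi * s)], axis=1)

def beta(s):
    return np.stack([s, s ** 2], axis=1)

t = np.linspace(0.0, 1.0, 2001)
# A smooth, strictly increasing warp with gamma(0) = 0 and gamma(1) = 1.
gamma = t + 0.3 * t * (1.0 - t)

d_orig = l2_distance(srvf(alpha(t), t), srvf(beta(t), t), t)
d_warp = l2_distance(srvf(alpha(gamma), t), srvf(beta(gamma), t), t)

# The two values agree up to discretization error.
print(d_orig, d_warp)
```

Because re-timing both curves by the same warp leaves this distance unchanged, minimizing it over warps of one curve gives a meaningful registration cost, which is the role the abstract assigns to the TSRVF metric.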
