An Efficient Posterior Regularized Latent Variable Model for Interactive Sound Source Separation

2020-03-02

Abstract

In applications such as audio denoising, music transcription, music remixing, and audio-based forensics, it is desirable to decompose a single-channel recording into its respective sources. One of the most effective classes of methods currently available for doing so is based on nonnegative matrix factorization and related latent variable models. Such techniques, however, typically perform poorly when no isolated training data is given, and they offer no way for a user to correct poor results. To overcome these issues, we allow a user to interactively constrain a latent variable model by painting on a time-frequency display of sound to guide the learning process. The annotations are used within the framework of posterior regularization to impose linear grouping constraints that would otherwise be difficult to achieve via standard priors. For the constraints considered, an efficient expectation-maximization algorithm is derived with closed-form multiplicative updates, drawing connections to nonnegative matrix factorization methods and allowing for high-quality, interactive-rate separation without explicit training data.
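The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, not the authors' derivation. It assumes a KL-style factorization of a magnitude spectrogram `V`, a fixed number of components per source, and one user-painted penalty array per source, where each component's posterior at a time-frequency bin is scaled by `exp(-penalty)` before normalization. The function name `separate_with_penalties` and all parameter names are hypothetical.

```python
import numpy as np

def separate_with_penalties(V, penalties, n_per_source=2, n_iter=200, seed=0):
    """Sketch: EM-style multiplicative updates with per-source penalty maps.

    V          -- (F, T) nonnegative magnitude spectrogram
    penalties  -- list of (F, T) nonnegative arrays, one per source
                  (assumed form of the painted annotations; larger values
                  discourage that source from explaining that bin)
    Returns one (F, T) estimate per source; the estimates sum to V.
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    S = len(penalties)
    K = S * n_per_source
    eps = 1e-12

    # Random nonnegative initialization; columns of W sum to one.
    W = rng.random((F, K)) + 1e-3
    W /= W.sum(axis=0)
    H = rng.random((K, T)) + 1e-3

    # Penalties enter the posterior as multiplicative weights exp(-penalty).
    weights = [np.exp(-np.asarray(p, dtype=float)) for p in penalties]
    # Components are grouped by source (a linear grouping of latent states).
    groups = [slice(s * n_per_source, (s + 1) * n_per_source) for s in range(S)]

    def weighted_model(W, H):
        parts = [w * (W[:, g] @ H[g]) for w, g in zip(weights, groups)]
        return parts, sum(parts) + eps

    for _ in range(n_iter):
        _, denom = weighted_model(W, H)
        R = V / denom  # ratio term shared by all components
        W_new = np.empty_like(W)
        H_new = np.empty_like(H)
        for w, g in zip(weights, groups):
            Rw = R * w  # penalty-weighted ratio for this source
            W_new[:, g] = W[:, g] * (Rw @ H[g].T)
            H_new[g] = H[g] * (W[:, g].T @ Rw)
        # Renormalize basis columns; the overall scale stays in H.
        W = W_new / (W_new.sum(axis=0) + eps)
        H = H_new

    # Soft-mask reconstruction: split V according to the weighted posteriors.
    parts, denom = weighted_model(W, H)
    return [V * p / denom for p in parts]
```

In this sketch, painting a large penalty for one source over a region pushes nearly all of the energy in that region to the other sources, while unpainted regions are factorized as usual. Because the penalties only rescale terms in otherwise standard multiplicative updates, each iteration costs about the same as an unconstrained factorization step.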
