Listen, Think and Listen Again: Capturing Top-down Auditory Attention for Speaker-independent Speech Separation

2019-11-07
Abstract: Recent deep learning methods have made significant progress in multi-talker mixed speech separation. However, most existing models adopt a driftless strategy that separates all the speech channels rather than selectively attending to the target one. As a result, those frameworks may fail to offer a satisfactory solution in complex auditory scenes, where the number of input sounds is usually uncertain and even dynamic. In this paper, we present a novel neural-network-based structure motivated by the top-down attention behavior of humans facing complicated acoustic scenes. Different from previous works, our method constructs an inference-attention structure to predict the candidates of interest and extract the speech channel of each. Our work gets rid of the limitation that the number of channels must be given, as well as the high computational complexity of the label permutation problem. We evaluated our model on the WSJ0 mixed-speech tasks. In all the experiments, our model is highly competitive, reaching and even outperforming the baselines.
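The abstract sketches a "listen, think, listen again" loop: infer a candidate speaker from the mixture, use that candidate as a top-down attention signal to extract its channel, and repeat until nothing audible remains, so the number of speakers never has to be given in advance. Below is a minimal PyTorch sketch of that loop as we read it; the module names, dimensions, and the residual-energy stopping criterion are our own illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the "infer a candidate, then attend and extract" idea.
# All names, dimensions, and the stopping rule are assumptions for illustration.
import torch
import torch.nn as nn


class CandidateInference(nn.Module):
    """'Think': infer one candidate speaker embedding from the (residual)
    mixture, instead of separating all channels at once."""

    def __init__(self, n_freq=129, hidden=256, emb_dim=128):
        super().__init__()
        self.rnn = nn.GRU(n_freq, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, emb_dim)

    def forward(self, mix_spec):  # (batch, time, freq) magnitude spectrogram
        h, _ = self.rnn(mix_spec)
        return torch.tanh(self.proj(h.mean(dim=1)))  # (batch, emb_dim)


class TopDownExtractor(nn.Module):
    """'Listen again': condition a mask estimator on the candidate embedding,
    used here as a top-down attention query, to extract that channel only."""

    def __init__(self, n_freq=129, hidden=256, emb_dim=128):
        super().__init__()
        self.rnn = nn.GRU(n_freq + emb_dim, hidden, batch_first=True,
                          bidirectional=True)
        self.mask = nn.Linear(2 * hidden, n_freq)

    def forward(self, mix_spec, speaker_emb):
        # Broadcast the query over time and append it to every frame.
        q = speaker_emb.unsqueeze(1).expand(-1, mix_spec.size(1), -1)
        h, _ = self.rnn(torch.cat([mix_spec, q], dim=-1))
        return torch.sigmoid(self.mask(h)) * mix_spec  # extracted channel


def separate(mix_spec, infer, extract, max_speakers=4, energy_thresh=1e-3):
    """Iteratively extract channels until the residual is quiet (an assumed
    criterion), so the speaker count is never required up front."""
    residual, channels = mix_spec, []
    for _ in range(max_speakers):
        emb = infer(residual)
        est = extract(residual, emb)
        channels.append(est)
        residual = torch.clamp(residual - est, min=0.0)
        if residual.pow(2).mean() < energy_thresh:
            break
    return channels


if __name__ == "__main__":
    mix = torch.rand(1, 100, 129)  # dummy 100-frame mixture spectrogram
    outs = separate(mix, CandidateInference(), TopDownExtractor())
    print(len(outs), outs[0].shape)
```

Because each pass extracts one channel from the residual, this style of loop sidesteps the label permutation problem: outputs are ordered by extraction, so no permutation over output-target assignments is ever searched.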

