Abstract. In this paper, we consider the following task: given an arbitrary
speech audio clip and one lip image of an arbitrary target identity, generate synthesized lip movements of the target identity saying the speech. To perform
well, a model must not only account for retention of the target identity,
photo-realism of the synthesized images, and consistency and smoothness of the lip
images in a sequence, but, more importantly, learn the correlation between the speech audio and the lip movements. To address these problems jointly,
we devise a network to synthesize lip movements and propose a novel
correlation loss to synchronize lip changes and speech changes. Our full
model combines four losses for a comprehensive treatment; it is trained
end-to-end and is robust to lip shapes, view angles, and different facial
characteristics. Extensive experiments on three datasets ranging from
lab-recorded footage to lips in the wild show that our model significantly outperforms other state-of-the-art methods extended to this task.
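The abstract does not define the correlation loss, so the following is only an illustrative sketch of one plausible variant, not the paper's actual formulation: it penalizes low Pearson correlation between frame-to-frame change magnitudes of audio features and lip-image features. The function name `correlation_loss`, the choice of change magnitudes as the "lip change" and "speech change" signals, and the feature shapes are all assumptions introduced here for illustration.

```python
import numpy as np

def correlation_loss(audio_feats, lip_feats, eps=1e-8):
    """Hypothetical correlation-style loss (illustrative only).

    audio_feats, lip_feats: arrays of shape (T, D), one feature vector per
    frame, with the same number of frames T.
    Returns 1 - Pearson correlation between the per-frame change magnitudes
    of the two sequences, so the loss is small when audio changes and lip
    changes rise and fall together.
    """
    # Frame-to-frame change magnitudes: a simple proxy for "speech change"
    # and "lip change" over time.
    da = np.linalg.norm(np.diff(audio_feats, axis=0), axis=1)
    dl = np.linalg.norm(np.diff(lip_feats, axis=0), axis=1)
    # Standardize each change series before correlating.
    da = (da - da.mean()) / (da.std() + eps)
    dl = (dl - dl.mean()) / (dl.std() + eps)
    corr = np.mean(da * dl)  # Pearson correlation of the two series
    return 1.0 - corr
```

Under this sketch, a lip sequence whose changes perfectly track the audio changes yields a loss near 0, while uncorrelated sequences yield a loss near 1; in a full training setup the same quantity would be computed on learned network features and minimized jointly with the other losses.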