Abstract
State-of-the-art live face verification methods are easily attacked by a recorded facial-expression sequence. This work addresses the issue by proposing a verification network built on a patch-wise motion parameterization. The method directly exploits the subtle motion differences between facial movements re-captured from a planar screen (e.g., a pad) and those of a real face, so interactive facial expressions are no longer required. Furthermore, we embed our network into a multiple instance learning framework, which further improves the recall of the proposed technique. Extensive experiments on several face benchmarks demonstrate the superior performance of our method.
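To illustrate the multiple instance learning idea mentioned above, here is a minimal sketch (not the authors' implementation): each video is treated as a "bag" of patch-level liveness scores, and the bag-level decision uses standard MIL max pooling. All names (`bag_score`, `is_live`, the threshold value) are illustrative assumptions.

```python
def bag_score(patch_scores):
    """Aggregate patch-level liveness scores with max pooling,
    the standard MIL assumption: a bag is positive if at least
    one instance is positive."""
    return max(patch_scores)

def is_live(patch_scores, threshold=0.5):
    """Judge a face live if any single patch strongly exhibits
    genuine (non-planar) motion.  The threshold is illustrative."""
    return bag_score(patch_scores) >= threshold

# Example: one patch with strong genuine-motion evidence suffices.
print(is_live([0.1, 0.2, 0.9]))  # True
print(is_live([0.1, 0.2, 0.3]))  # False
```

Max pooling is only one possible aggregator; attention-weighted or mean pooling are common MIL alternatives, and the abstract does not specify which variant the authors use.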