Abstract
In this paper, we consider first-order methods for solving stochastic non-convex optimization problems. The key building block of the proposed algorithms is a first-order procedure for extracting negative curvature from the Hessian matrix through a principled sequence starting from noise, which we refer to as NEgative-curvature-Originated-from-Noise (NEON) and which is of independent interest. Based on this building block, we design purely first-order stochastic algorithms for escaping from non-degenerate saddle points with a much better time complexity (almost linear in the problem's dimensionality) than previous first-order stochastic algorithms, under a bounded-variance condition on the stochastic gradients. In particular, we develop a general framework of first-order stochastic algorithms with a second-order convergence guarantee, built from our new technique and from existing algorithms that may only converge to a first-order stationary point. For finding a nearly second-order stationary point x such that ‖∇F(x)‖ ≤ ε and λ_min(∇²F(x)) ≥ −√ε (with high probability), the best time complexity of the presented algorithms is Õ(d/ε^3.5), where F(·) denotes the objective function and d is the dimensionality of the problem. To the best of our knowledge, this is the first theoretical result showing that first-order stochastic algorithms can find a second-order stationary point in time almost linear in the problem's dimensionality, which is even competitive with existing stochastic algorithms that hinge on second-order information.
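The core NEON idea, extracting a negative-curvature direction using only gradient evaluations, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: all parameter names and step sizes here are hypothetical, and the update u ← u − η·(∇F(x+u) − ∇F(x)) is used as a finite-difference surrogate for the power-method step u ← (I − ηH)u on the Hessian H = ∇²F(x). Components of u along eigenvectors with negative eigenvalues are amplified, so if ‖u‖ grows past a threshold, u/‖u‖ approximates a negative-curvature direction.

```python
import numpy as np

def neon_sketch(grad, x, eta=0.1, radius=1e-3, T=200, threshold=1.0):
    """Hypothetical sketch of a NEON-style negative-curvature search.

    grad: gradient oracle for the objective F
    x:    point at which to test for negative curvature
    """
    d = x.size
    # Start from a small random noise vector around x.
    u = radius * np.random.randn(d) / np.sqrt(d)
    for _ in range(T):
        # Gradient difference approximates a Hessian-vector product:
        # grad(x + u) - grad(x) ~= H @ u, so this step approximates
        # u <- (I - eta * H) @ u, amplifying negative-eigenvalue directions.
        u = u - eta * (grad(x + u) - grad(x))
        n = np.linalg.norm(u)
        if n > threshold:
            return u / n   # escaped: approximate negative-curvature direction
    return None            # no significant negative curvature detected

# Usage: at the saddle point x = 0 of F(x) = (x0^2 - x1^2) / 2,
# the Hessian is diag(1, -1), so the recovered direction should
# align with the second coordinate axis.
np.random.seed(0)
grad = lambda x: np.array([x[0], -x[1]])
v = neon_sketch(grad, np.zeros(2))
```

On this exact quadratic the gradient difference equals Hu exactly, so the iterate contracts along the positive-curvature axis and expands along the negative one, illustrating why noise suffices as a starting point.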