Abstract
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (SVRG) methods for them. SVRG and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (SGD), but their theoretical analysis almost exclusively assumes convexity. In contrast, we obtain non-asymptotic rates of convergence of SVRG for nonconvex optimization, showing that it is provably faster than both SGD and gradient descent. We also analyze a subclass of nonconvex problems on which SVRG attains linear convergence to the global optimum. Finally, we extend our analysis to mini-batch variants of SVRG, showing a (theoretical) linear speedup due to mini-batching in parallel settings.
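For concreteness, below is a minimal sketch of the SVRG scheme for a finite sum f(x) = (1/n) sum_i f_i(x), assuming access to component gradients; the names (grad_i, step_size, epoch_len) are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def svrg(grad_i, x0, n, step_size, num_epochs, epoch_len):
    """Sketch of SVRG. grad_i(x, i) returns the gradient of the
    i-th component f_i at x; x0 is the initial iterate."""
    x_snapshot = x0.copy()
    for _ in range(num_epochs):
        # Full gradient at the snapshot, computed once per epoch.
        full_grad = np.mean([grad_i(x_snapshot, i) for i in range(n)], axis=0)
        x = x_snapshot.copy()
        for _ in range(epoch_len):
            i = np.random.randint(n)
            # Variance-reduced gradient estimate: unbiased, with
            # variance shrinking as x approaches the snapshot.
            v = grad_i(x, i) - grad_i(x_snapshot, i) + full_grad
            x = x - step_size * v
        x_snapshot = x  # refresh the snapshot at the end of the epoch
    return x_snapshot
```

A mini-batch variant, as referenced above, would draw several indices i per inner step and average their variance-reduced gradients, which is what permits the parallel speedup.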