Abstract
Nonconvex optimization algorithms with random
initialization have attracted increasing attention in recent years. It has been shown that many first-order
methods almost surely avoid saddle points when started from
random initial points. In this paper, we answer the following question:
can nonconvex heavy-ball algorithms with random initialization avoid saddle points? The answer
is yes. Directly applying the existing proof technique
to heavy-ball algorithms is difficult because
each heavy-ball iteration depends on both
the current and the previous point, so the algorithm cannot be formulated as an iteration $x^{k+1} = g(x^k)$
under some mapping $g$. To this end, we design a
new mapping on an augmented space; after a suitable change of variables, the heavy-ball algorithm can be interpreted
as iterations of this mapping.
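As a minimal sketch of this construction, assuming the standard heavy-ball update with stepsize $\alpha > 0$ and momentum $\beta \in [0,1)$ (placeholder parameters, not the paper's exact conditions), stacking the current and previous iterates turns the two-point recursion into a single-point one:
\[
z^k := \begin{pmatrix} x^k \\ x^{k-1} \end{pmatrix}, \qquad
z^{k+1} = H(z^k), \qquad
H\begin{pmatrix} x \\ y \end{pmatrix}
= \begin{pmatrix} x - \alpha \nabla f(x) + \beta (x - y) \\ x \end{pmatrix},
\]
so that the fixed points of $H$ are exactly the pairs $(x, x)$ with $\nabla f(x) = 0$, i.e., the stationary points of $f$.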
Theoretically, we
prove that heavy-ball gradient descent enjoys a larger
stepsize than gradient descent for escaping saddle
points. We also consider the heavy-ball proximal point algorithm and
prove that it, too, always escapes saddle points.
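For concreteness, one standard form of the heavy-ball (inertial) proximal point iteration is sketched below; this form is an assumption for illustration, and the paper's exact scheme may differ:
\[
x^{k+1} = \operatorname{prox}_{\alpha f}\!\big(x^k + \beta (x^k - x^{k-1})\big), \qquad
\operatorname{prox}_{\alpha f}(y) := \operatorname*{arg\,min}_{x} \Big\{ f(x) + \tfrac{1}{2\alpha} \|x - y\|^2 \Big\},
\]
which again fits the augmented-mapping framework by stacking $(x^k, x^{k-1})$ as above.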