Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

  • Zeyuan Allen-Zhu

|

Publication

Given a non-convex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depend on the smallest (negative) eigenvalue −σ of the Hessian. This parameter σ captures how strongly non-convex f(x) is, and is analogous to the strong convexity parameter for convex optimization.

Our methods outperform the best known results for a wide range of σ, and can also be used to find approximate local minima.
In particular, we find an interesting dichotomy: there exists a threshold σ0 so that the fastest methods for σ>σ0 and for σ<σ0 have drastically different behaviors: the former scales with n2/3 and the latter scales with n3/4.