Natasha: Faster Non-Convex Stochastic Optimization Via Strongly Non-Convex Parameter

Zeyuan Allen-Zhu

February 2017

Given a non-convex function f(x) that is an average of n smooth functions, we design stochastic first-order methods to find its approximate stationary points. The performance of our new methods depends on the smallest (negative) eigenvalue −σ of the Hessian. This parameter σ captures how strongly non-convex f(x) is, and is analogous to the strong convexity parameter for convex optimization.
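
The setting described above can be written out roughly as follows. This is only a sketch for orientation: the component names f_i, the accuracy parameter ε, and the choice of the Euclidean norm are notation assumed here for illustration and are not stated in the abstract.

```latex
% Finite-sum objective: f is the average of n smooth component functions f_i (names assumed).
f(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% Bounded non-convexity: every Hessian eigenvalue is at least -\sigma, with \sigma \ge 0.
\nabla^2 f(x) \succeq -\sigma I \quad \text{for all } x

% Goal (one common reading of "approximate stationary point", with assumed accuracy \varepsilon):
\| \nabla f(x) \| \le \varepsilon
```

With σ = 0 the second condition says f is convex, so a larger σ permits more negative curvature; this mirrors how a strong convexity parameter μ lower-bounds the Hessian by +μI in convex optimization.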

Our methods outperform the best known results for a wide range of σ, and can also be used to find approximate local minima.
In particular, we find an interesting dichotomy: there exists a threshold σ₀ so that the fastest methods for σ>σ₀ and for σ<σ₀ have drastically different behaviors: the former scales with n^{2/3} and the latter scales with n^{3/4}.