Tight Complexity Bounds for Composite Optimization

We provide tight upper and lower bounds on the complexity of minimizing the average of $m$ convex functions using gradient and prox information for the component functions. We show a significant gap between the complexity of deterministic and randomized optimization. For smooth functions, we show that accelerated gradient descent and an accelerated variant of SVRG are optimal in the deterministic and randomized settings respectively, and that a gradient oracle suffices to attain the optimal rate. For non-smooth functions, access to a prox oracle reduces the complexity, and we present optimal methods that combine smoothing with AGD and improve over methods using only gradient access.
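For concreteness, the setting described above can be written as the finite-sum problem
$$\min_{x} \; F(x) = \frac{1}{m} \sum_{i=1}^{m} f_i(x),$$
where each $f_i$ is convex, and an algorithm may query, for any chosen component $i$ and point $x$, the gradient (or a subgradient) $\nabla f_i(x)$ as well as the proximal mapping $\mathrm{prox}_{\lambda f_i}(x) = \arg\min_{u} \big\{ f_i(u) + \tfrac{1}{2\lambda}\|u - x\|^2 \big\}$. (This notation is a standard rendering of the setup sketched in the abstract, not notation taken from the talk itself.)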

Date:
Speaker:
Blake Woodworth
Affiliation:
Toyota Technological Institute at Chicago