Accelerated Stochastic Gradient-free and Projection-free Methods

Part of the Proceedings of the International Conference on Machine Learning (ICML 2020), pre-proceedings


Authors

Feihu Huang, Lue Tao, Songcan Chen

Abstract

In this paper, we propose a class of accelerated stochastic gradient-free and projection-free (i.e., zeroth-order Frank-Wolfe) methods for constrained nonconvex optimization in both the stochastic and finite-sum settings. Specifically, we propose an accelerated stochastic zeroth-order Frank-Wolfe (Acc-SZOFW) method based on a variance reduction technique and a novel momentum technique. Moreover, under some mild conditions, we prove that Acc-SZOFW has a function query complexity of $O(d\sqrt{n}\epsilon^{-2})$ for finding an $\epsilon$-stationary point in the finite-sum problem, which improves the existing best result by a factor of $O(\sqrt{n}\epsilon^{-2})$, and a function query complexity of $O(d\epsilon^{-3})$ in the stochastic problem, which improves the existing best result by a factor of $O(\epsilon^{-1})$. Further, we propose a novel accelerated stochastic zeroth-order Frank-Wolfe method (Acc-SZOFW*) that relaxes the large mini-batch size required by Acc-SZOFW. In particular, we prove that Acc-SZOFW* retains the function query complexity of $O(d\epsilon^{-3})$ in the stochastic problem. Finally, we conduct extensive experiments, including black-box adversarial attack and robust black-box classification, to demonstrate the efficiency of our algorithms.
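To make the two building blocks concrete, the following is a minimal sketch of a plain (non-accelerated) zeroth-order Frank-Wolfe loop: gradients are estimated from function queries via coordinate-wise finite differences, and the projection is replaced by a linear minimization oracle, here over an $\ell_1$ ball. This is an illustrative simplification, not the paper's Acc-SZOFW method (it omits the variance reduction and momentum steps); the function names and the $\ell_1$-ball constraint are our own assumptions for the example.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-4):
    """Zeroth-order gradient estimate via coordinate-wise central
    finite differences (2*d function queries per call)."""
    d = x.size
    g = np.zeros(d)
    for i in range(d):
        e = np.zeros(d)
        e[i] = 1.0
        g[i] = (f(x + mu * e) - f(x - mu * e)) / (2.0 * mu)
    return g

def lmo_l1(g, radius=1.0):
    """Linear minimization oracle over the l1 ball:
    argmin_{||s||_1 <= radius} <g, s> is a signed vertex."""
    i = np.argmax(np.abs(g))
    s = np.zeros_like(g)
    s[i] = -radius * np.sign(g[i])
    return s

def zo_frank_wolfe(f, x0, iters=200, radius=1.0):
    """Basic zeroth-order Frank-Wolfe: gradient-free and projection-free."""
    x = x0.copy()
    for t in range(iters):
        g = zo_gradient(f, x)
        s = lmo_l1(g, radius)           # feasible direction, no projection
        gamma = 2.0 / (t + 2)           # standard Frank-Wolfe step size
        x = (1.0 - gamma) * x + gamma * s
    return x
```

For example, minimizing $f(x) = \|x - c\|^2$ for a target $c$ inside the unit $\ell_1$ ball drives the iterate toward $c$ while every iterate stays feasible by construction, since it is a convex combination of feasible points.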