Abstract
In a Hilbert space \({\mathcal{H}}\), we analyze the convergence rate of inertial proximal algorithms for the fast minimization of a general convex, lower semicontinuous, and proper function \({\Phi }: {\mathcal{H}} \rightarrow \mathbb {R} \cup \{+\infty \}\). These algorithms involve both extrapolation coefficients (including those of the Nesterov acceleration method) and proximal coefficients in a general form. They can be interpreted as discrete-time versions of inertial continuous gradient systems with general damping and time scale coefficients. Under a proper tuning of these parameters, we show the fast convergence of values and the convergence of iterates. In doing so, we provide an overview of this class of algorithms. Our study complements the previous Attouch–Cabot paper (SIOPT, 2018) by introducing time scaling aspects into the algorithm, and sheds new light on Güler's seminal papers on the convergence rate of accelerated proximal methods for convex optimization.
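As a rough illustration of the kind of scheme studied here, the following Python sketch runs an inertial proximal iteration on the real line. It assumes the usual two-step form, an extrapolation \(y_k = x_k + \alpha_k (x_k - x_{k-1})\) followed by a proximal step \(x_{k+1} = \mathrm{prox}_{\beta_k {\Phi}}(y_k)\). The test function \({\Phi}(x) = |x|\), the helper names `prox_abs` and `inertial_proximal`, and the choices \(\alpha_k = (k-1)/(k+2)\), \(\beta_k = 0.5\) are illustrative assumptions, not taken from the paper.

```python
import math


def prox_abs(y, beta):
    """Proximal map of beta * |.| at y (soft thresholding); illustrative choice of Phi."""
    return math.copysign(max(abs(y) - beta, 0.0), y)


def inertial_proximal(prox, x0, alpha, beta, n_iter):
    """Run y_k = x_k + alpha(k) * (x_k - x_{k-1}), then x_{k+1} = prox(y_k, beta(k)).

    alpha and beta are callables giving the extrapolation and proximal
    coefficients at step k (hypothetical interface for this sketch).
    """
    x_prev, x = x0, x0
    history = [x0]
    for k in range(1, n_iter + 1):
        y = x + alpha(k) * (x - x_prev)      # extrapolation (inertial) step
        x_prev, x = x, prox(y, beta(k))      # proximal step
        history.append(x)
    return history


# Nesterov-like extrapolation with a constant proximal coefficient (assumed values).
iterates = inertial_proximal(
    prox_abs, x0=5.0, alpha=lambda k: (k - 1) / (k + 2), beta=lambda k: 0.5, n_iter=60
)
values = [abs(x) for x in iterates]  # Phi(x_k), to be compared with min Phi = 0
```

Printing `values` shows the function values decreasing to the minimum; other coefficient rules can be tried by changing the two lambdas.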
References
Alvarez, F., Attouch, H.: An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping. Set-Valued Anal. 9, 3–11 (2001)
Álvarez, F., Attouch, H., Bolte, J., Redont, P.: A second-order gradient-like dissipative dynamical system with Hessian-driven damping. Application to optimization and mechanics. J. Math. Pures Appl. 81, 747–779 (2002)
Apidopoulos, V., Aujol, J.-F., Dossal, Ch.: Convergence rate of inertial forward-backward algorithm beyond Nesterov’s rule. Math. Program. https://doi.org/10.1007/s10107-018-1350-9. HAL-01551873 (2018)
Attouch, H., Cabot, A.: Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity. J. Differ. Equ. 263, 5412–5458 (2017)
Attouch, H., Cabot, A.: Convergence rates of inertial forward-backward algorithms. SIAM J. Optim. 28, 849–874 (2018)
Attouch, H., Cabot, A., Chbani, Z., Riahi, H.: Inertial forward-backward algorithms with perturbations: application to Tikhonov regularization. J. Optim. Theory Appl. 179, 1–36 (2018)
Attouch, H., Chbani, Z., Riahi, H.: Fast proximal methods via time scaling of damped inertial dynamics. SIAM J. Optim. 29, 2227–2256 (2019)
Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. Ser. B 168, 123–175 (2018)
Attouch, H., Chbani, Z., Riahi, H.: Rate of convergence of the Nesterov accelerated gradient method in the subcritical case α ≤ 3. ESAIM-COCV, 25 (2019). https://doi.org/10.1051/cocv/2017083
Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than \(1/k^{2}\). SIAM J. Optim. 26, 1824–1834 (2016)
Aujol, J.-F., Dossal, Ch.: Stability of over-relaxations for the forward-backward algorithm, application to FISTA. SIAM J. Optim. 25, 2408–2433 (2015)
Aujol, J.-F., Dossal, Ch.: Optimal rate of convergence of an ODE associated to the Fast Gradient Descent schemes for b > 0. https://hal.inria.fr/hal-01547251v2 (2017)
Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces. CMS Books in Mathematics. Springer, Cham (2011)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2, 183–202 (2009)
Boţ, R. I., Csetnek, E.R., László, S. C.: A second-order dynamical approach with variable damping to nonconvex smooth minimization. Appl. Anal. (2018). https://doi.org/10.1080/00036811.2018.1495330
Bonettini, S., Porta, F., Ruggiero, V.: A variable metric forward-backward method with extrapolation. SIAM J. Sci. Comput. 38, A2558–A2584 (2016)
Burger, M., Sawatzky, A., Steidl, G.: First order algorithms in variational image processing. In: Glowinski, R., Osher, S., Yin, W. (eds.) Splitting Methods in Communication, Imaging, Science, and Engineering, pp. 345–407. Springer, Cham (2016)
Calatroni, L., Chambolle, A.: Backtracking strategies for accelerated descent methods with smooth composite objectives. SIAM J. Optim. 29, 1772–1798 (2019)
Chambolle, A., Dossal, Ch.: On the convergence of the iterates of the “Fast Iterative Shrinkage/Thresholding Algorithm”. J. Optim. Theory Appl. 166, 968–982 (2015)
Combettes, P.L., Glaudin, L.E.: Proximal activation of smooth functions in splitting algorithms for convex image recovery. SIAM J. Imaging Sci. (2019). To appear
Combettes, P.L., Wajs, V.R.: Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 1168–1200 (2005)
Güler, O.: On the convergence of the proximal point algorithm for convex optimization. SIAM J. Control Optim. 29, 403–419 (1991)
Güler, O.: New proximal point algorithms for convex minimization. SIAM J. Optim. 2, 649–664 (1992)
Kim, D., Fessler, J.A.: Optimized first-order methods for smooth convex minimization. Math. Program. 159, 81–107 (2016)
Liang, J., Fadili, J., Peyré, G.: Local linear convergence of forward-backward under partial smoothness. In: Ghahramani, Z., et al. (eds.) Advances in Neural Information Processing Systems 27, pp. 1970–1978. Curran Associates Inc. (2014)
Lorenz, D.A., Pock, Th.: An inertial forward-backward algorithm for monotone inclusions. J. Math. Imaging Vis. 51, 311–325 (2015)
May, R.: Asymptotic for a second-order evolution equation with convex potential and vanishing damping term. Turk. J. Math. 41, 681–685 (2017)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(O(1/k^{2})\). Soviet Math. Dokl. 27, 372–376 (1983)
Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Applied Optimization, vol. 87. Kluwer Academic Publishers, Boston (2004)
Opial, Z.: Weak convergence of the sequence of successive approximations for nonexpansive mappings. Bull. Am. Math. Soc. 73, 591–597 (1967)
Parikh, N., Boyd, S.: Proximal algorithms. Found. Trends Optim. 1, 127–239 (2013)
Peypouquet, J.: Convex Optimization in Normed Spaces: Theory, Methods and Examples. Springer, Cham (2015)
Peypouquet, J., Sorin, S.: Evolution equations for maximal monotone operators: asymptotic analysis in continuous and discrete time. J. Convex Anal. 17, 1113–1163 (2010)
Polyak, B.T.: Some methods of speeding up the convergence of iteration methods. USSR Comput. Math. Math. Phys. 4, 1–17 (1964)
Polyak, B.T.: Introduction to Optimization. Optimization Software, New York (1987)
Scheinberg, K., Goldfarb, D., Bai, X.: Fast first-order methods for composite convex optimization with backtracking. Found. Comput. Math. 14, 389–417 (2014)
Shi, B., Du, S.S., Jordan, M.I., Su, W.J.: Understanding the acceleration phenomenon via high-resolution differential equations. arXiv:1810.08907 (2018)
Schmidt, M., Le Roux, N., Bach, F.: Convergence rates of inexact proximal-gradient methods for convex optimization. NIPS’11 - 25th Annual Conference on Neural Information Processing Systems, Dec 2011, Granada, Spain. HAL-inria-00618152v3 (2011)
Su, W.J., Boyd, S., Candès, E.J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. In: Ghahramani, Z., et al. (eds.) Advances in Neural Information Processing Systems 27, pp. 2510–2518. Curran Associates Inc. (2014)
Villa, S., Salzo, S., Baldassarre, L., Verri, A.: Accelerated and inexact forward-backward algorithms. SIAM J. Optim. 23, 1607–1633 (2013)
Additional information
This paper is dedicated to Professor Marco A. López Cerdá on the occasion of his 70th birthday.
Appendix: Some Auxiliary Results
The following lemmas are used throughout the paper. To establish the weak convergence of the iterates of (IP)\(_{\alpha _{k}, \beta _{k}}\), we apply Opial’s Lemma [30], which we recall in its discrete form.
Lemma 4
Let \(S\) be a nonempty subset of \({\mathcal{H}}\), and \((x_{k})\) a sequence in \({\mathcal{H}}\). Assume that
- (i) every sequential weak cluster point of \((x_{k})\), as \(k\to +\infty \), belongs to \(S\);
- (ii) for every \(z \in S\), \(\lim _{k\to +\infty }\|x_{k}-z\|\) exists.
Then \((x_{k})\) converges weakly as \(k \to +\infty \) to a point in \(S\).
The next lemma allows us to estimate the rate of convergence of a sequence \((\varepsilon _{k})\) assumed to be nonincreasing and summable with respect to weight coefficients; see [5, Lemma 21] for the proof.
Lemma 5
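Let \((\tau _{k})\) be a nonnegative sequence such that \({\sum }_{k=1}^{+\infty } \tau _{k}=+\infty \). Assume that \((\varepsilon _{k})\) is a nonnegative and nonincreasing sequence satisfying \({\sum }_{k=1}^{+\infty } \tau _{k} \varepsilon _{k}<+\infty \). Then we have
\[\varepsilon _{k} = o\left (\frac {1}{{\sum }_{i=1}^{k} \tau _{i}}\right ) \quad \text {as } k\to +\infty .\]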
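The following numerical sketch illustrates Lemma 5, reading its conclusion as the little-o estimate \(\varepsilon _{k} = o\big (1/{\sum }_{i=1}^{k} \tau _{i}\big )\). The helper name `weighted_rate_ratios` and the sample sequences \(\tau _{k} = k\), \(\varepsilon _{k} = 1/k^{3}\) are assumptions chosen for illustration: they satisfy \({\sum } \tau _{k} = +\infty \) and \({\sum } \tau _{k} \varepsilon _{k} < +\infty \), so the product \(\varepsilon _{k} {\sum }_{i=1}^{k} \tau _{i}\) should tend to zero.

```python
def weighted_rate_ratios(tau, eps, n):
    """Return the list of eps(k) * sum_{i=1}^{k} tau(i) for k = 1, ..., n."""
    ratios = []
    partial = 0.0  # running value of sum_{i=1}^{k} tau(i)
    for k in range(1, n + 1):
        partial += tau(k)
        ratios.append(eps(k) * partial)
    return ratios


# Sample data (assumed): tau_k = k is not summable, tau_k * eps_k = 1/k^2 is.
ratios = weighted_rate_ratios(lambda k: float(k), lambda k: 1.0 / k**3, 10_000)
```

Here the ratio equals \((k+1)/(2k^{2})\), which indeed vanishes as \(k\) grows.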
The following result shows the summability of a sequence \((a_{k})\) satisfying a suitable inequality.
Lemma 6
Given a nonnegative sequence \((\alpha _{k})\) satisfying (K0), let \((t_{k})\) be the sequence defined by \(t_{k}=1+{\sum }_{i=k}^{+\infty }{\prod }_{j=k}^{i}\alpha _{j}\). Let \((a_{k})\) and \((\omega _{k})\) be two nonnegative sequences such that
\[a_{k+1} \leq \alpha _{k} a_{k} + \omega _{k} \tag{51}\]
for all \(k \geq 0\). If \({\sum }_{k=0}^{+\infty }t_{k+1}\omega _{k}<+\infty \), then \({\sum }_{k=0}^{+\infty }a_{k}<+\infty \).
Proof
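By Lemma 1, we have \(t_{k+1}\alpha _{k} = t_{k} - 1\). Multiplying inequality (51) by \(t_{k+1}\) gives
\[t_{k+1} a_{k+1} \leq (t_{k} - 1) a_{k} + t_{k+1}\omega _{k},\]
or equivalently \(a_{k} \leq (t_{k} a_{k} - t_{k+1} a_{k+1}) + t_{k+1}\omega _{k}\). By summing from \(k = 0\) to \(n\), we obtain
\[{\sum }_{k=0}^{n} a_{k} \leq t_{0} a_{0} - t_{n+1} a_{n+1} + {\sum }_{k=0}^{n} t_{k+1}\omega _{k} \leq t_{0} a_{0} + {\sum }_{k=0}^{+\infty } t_{k+1}\omega _{k}.\]
The conclusion follows by letting \(n\) tend to \(+\infty \). □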
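The telescoping argument can be checked numerically in a simple case. This sketch assumes that the inequality in Lemma 6 reads \(a_{k+1} \leq \alpha _{k} a_{k} + \omega _{k}\), and takes a constant coefficient \(\alpha _{k} \equiv a \in [0,1)\), for which the series defining \(t_{k}\) is geometric and \(t_{k} = 1/(1-a)\). The function name `check_summability_bound`, the constant coefficient, and the sample \(\omega _{k} = 1/(k+1)^{2}\) are illustrative assumptions.

```python
def check_summability_bound(a, a0, omega, n):
    """Compare a partial sum of (a_k) with the bound obtained by telescoping.

    Assumes alpha_k = a constant in [0, 1), hence t_k = 1 / (1 - a) for all k.
    The sequence is generated with equality a_{k+1} = a * a_k + omega(k),
    the extreme case allowed by the inequality.
    Returns (sum_{k=0}^{n} a_k, t_0 * a_0 + sum_{k=0}^{n} t_{k+1} * omega(k)).
    """
    t = 1.0 / (1.0 - a)
    a_k = a0
    partial = 0.0
    bound = t * a0
    for k in range(n + 1):
        partial += a_k
        bound += t * omega(k)
        a_k = a * a_k + omega(k)
    return partial, bound


partial, bound = check_summability_bound(
    a=0.5, a0=1.0, omega=lambda k: 1.0 / (k + 1) ** 2, n=1000
)
```

The partial sum stays below the bound, and the bound itself remains finite because \({\sum } t_{k+1}\omega _{k}\) converges for this choice of \(\omega _{k}\).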
Lemma 7
[8, Lemma 5.14] Let \((a_{k})\) be a sequence of nonnegative numbers such that, for all \(k\in \mathbb {N}\), \({a_{k}^{2}} \leq c^{2} + {\sum }_{j=1}^{k} b_{j} a_{j}\), where \((b_{j})\) is a summable sequence of nonnegative numbers, and \(c \geq 0\). Then, for all \(k\in \mathbb {N}\), \(a_{k} \leq c + {\sum }_{j=1}^{\infty } b_{j}\).
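Lemma 7 can be illustrated by building the largest sequence compatible with its hypothesis: at each step \(a_{k}\) is taken as the positive root of \(a_{k}^{2} = c^{2} + {\sum }_{j=1}^{k} b_{j} a_{j}\), which is a quadratic equation in \(a_{k}\). The function name `extremal_sequence` and the sample data \(b_{j} = 1/j^{2}\), \(c = 1\) are assumptions made for this sketch.

```python
import math


def extremal_sequence(b, c, n):
    """Build a_1, ..., a_n with a_k^2 = c^2 + sum_{j=1}^{k} b(j) * a_j.

    Each a_k is the positive root of a_k^2 - b(k) * a_k - (c^2 + acc) = 0,
    where acc = sum_{j<k} b(j) * a_j collects the earlier terms.
    """
    seq = []
    acc = 0.0
    for k in range(1, n + 1):
        b_k = b(k)
        a_k = 0.5 * (b_k + math.sqrt(b_k * b_k + 4.0 * (c * c + acc)))
        seq.append(a_k)
        acc += b_k * a_k
    return seq


n_terms = 2000
seq = extremal_sequence(lambda j: 1.0 / j**2, c=1.0, n=n_terms)
# Bound from the lemma, using the partial sum of (b_j) up to n_terms.
lemma_bound = 1.0 + sum(1.0 / j**2 for j in range(1, n_terms + 1))
```

Comparing `max(seq)` with `lemma_bound` shows the sequence staying below \(c + {\sum } b_{j}\), as the lemma predicts.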
Cite this article
Attouch, H., Chbani, Z. & Riahi, H. Convergence Rate of Inertial Proximal Algorithms with General Extrapolation and Proximal Coefficients. Vietnam J. Math. 48, 247–276 (2020). https://doi.org/10.1007/s10013-020-00399-y
DOI: https://doi.org/10.1007/s10013-020-00399-y
Keywords
- Inertial proximal algorithms
- General extrapolation coefficient
- Lyapunov analysis
- Nesterov accelerated gradient method
- Nonsmooth convex optimization
- Time rescaling