A parallelizable augmented Lagrangian method applied to large-scale non-convex-constrained optimization problems

Full Length Paper · Series A · Mathematical Programming

Abstract

We contribute improvements to a Lagrangian dual solution approach applied to large-scale optimization problems whose objective functions are convex, continuously differentiable and possibly nonlinear, while the non-relaxed constraint set is compact but not necessarily convex. Such problems arise, for example, in the split-variable deterministic reformulation of stochastic mixed-integer optimization problems. We adapt the augmented Lagrangian method framework to address the presence of nonconvexity in the non-relaxed constraint set and to enable efficient parallelization. The development of our approach is most naturally compared with that of proximal bundle methods, especially their use of serious step conditions; however, our approach deviates from these methods in ways that improve the efficiency with which parallelization can be exploited. Pivotal in our modification of the augmented Lagrangian method is an integration of the simplicial decomposition method and the nonlinear block Gauss–Seidel method. An adaptation of a serious step condition associated with proximal bundle methods allows the approximation tolerance to be adjusted automatically. Under mild conditions, optimal dual convergence is proven, and we report computational results on test instances from the stochastic optimization literature, demonstrating improved parallel speedup over a baseline parallel approach.

Notes

  1. “Implementable” is used here in the sense of the ability to implement the required functionality of an algorithm, not in the sense specific to stochastic optimization.

References

  1. Birge, J.R., Louveaux, F.: Introduction to Stochastic Programming. Springer, Berlin (2011)

  2. Shor, N.: Minimization Methods for Non-differentiable Functions. Springer, New York (1985)

  3. Bertsekas, D.: Nonlinear Programming. Athena Scientific, Belmont (1999)

  4. Ruszczyński, A.: Nonlinear Optimization. Princeton University Press, Princeton (2006)

  5. Carøe, C.C., Schultz, R.: Dual decomposition in stochastic integer programming. Oper. Res. Lett. 24(1), 37–45 (1999)

  6. Bertsekas, D.: Constrained Optimization and Lagrange Multiplier Methods. Academic Press, London (1982)

  7. Hestenes, M.R.: Multiplier and gradient methods. J. Optim. Theory Appl. 4, 303–320 (1969)

  8. Powell, M.J.D.: A method for nonlinear constraints in minimization problems. In: Fletcher, R. (ed.) Optimization. Academic Press, New York (1969)

  9. Rockafellar, R.: Monotone operators and the proximal point algorithm. SIAM J. Control Optim. 14(5), 877–898 (1976)

  10. Lemaréchal, C.: An extension of Davidon methods to nondifferentiable problems, pp. 95–109. Springer, Berlin (1975)

  11. de Oliveira, W., Sagastizábal, C.: Bundle methods in the XXIst century: a bird’s-eye view. Pesqui. Oper. 34, 647–670 (2014)

  12. Hare, W., Sagastizábal, C., Solodov, M.: A proximal bundle method for nonsmooth nonconvex functions with inexact information. Comput. Optim. Appl. 63, 1–28 (2016)

  13. de Oliveira, W., Sagastizábal, C., Lemaréchal, C.: Convex proximal bundle methods in depth: a unified analysis for inexact oracles. Math. Program. 148(1), 241–277 (2014)

  14. Amor, H.B., Desrosiers, J., Frangioni, A.: On the choice of explicit stabilizing terms in column generation. Discrete Appl. Math. 157(6), 1167–1184 (2009)

  15. Bertsekas, D.: Incremental aggregated proximal and augmented Lagrangian algorithms. arXiv preprint arXiv:1509.09257 (2015)

  16. Fischer, F., Helmberg, C.: A parallel bundle framework for asynchronous subspace optimization of nonsmooth convex functions. SIAM J. Optim. 24(2), 795–822 (2014)

  17. Lubin, M., Martin, K., Petra, C., Sandıkçı, B.: On parallelizing dual decomposition in stochastic integer programming. Oper. Res. Lett. 41(3), 252–258 (2013)

  18. Bertsekas, D.: Convex Optimization Algorithms. Athena Scientific, Belmont (2015)

  19. Holloway, C.: An extension of the Frank and Wolfe method of feasible directions. Math. Program. 6(1), 14–27 (1974)

  20. Von Hohenbalken, B.: Simplicial decomposition in nonlinear programming algorithms. Math. Program. 13(1), 49–68 (1977)

  21. Bonettini, S.: Inexact block coordinate descent methods with application to non-negative matrix factorization. IMA J. Numer. Anal. 31(4), 1431–1452 (2011)

  22. Grippo, L., Sciandrone, M.: On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26(3), 127–136 (2000)

  23. Hildreth, C.: A quadratic programming procedure. Nav. Res. Logist. Q. 4, 79–85 (1957); erratum, ibid. 4, 361 (1957)

  24. Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109, 475–494 (2001)

  25. Warga, J.: Minimizing certain convex functions. SIAM J. Appl. Math. 11, 588–593 (1963)

  26. Eckstein, J.: A practical general approximation criterion for methods of multipliers based on Bregman distances. Math. Program. 96(1), 61–86 (2003)

  27. Eckstein, J., Silva, P.: A practical relative error criterion for augmented Lagrangians. Math. Program. 141(1), 319–348 (2013)

  28. Hamdi, A., Mahey, P., Dussault, J.P.: A new decomposition method in nonconvex programming via a separable augmented Lagrangian. In: Recent Advances in Optimization: Proceedings of the 8th French–German Conference on Optimization, Trier, July 21–26, 1996, pp. 90–104. Springer, Berlin (1997)

  29. Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)

  30. Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite element approximation. Comput. Math. Appl. 2, 17–40 (1976)

  31. Glowinski, R., Marrocco, A.: Sur l’approximation, par éléments finis d’ordre un, et la résolution, par pénalisation-dualité, d’une classe de problèmes de Dirichlet non linéaires. Revue Française d’Automatique, Informatique et Recherche Opérationnelle 9, 41–76 (1975)

  32. Chatzipanagiotis, N., Dentcheva, D., Zavlanos, M.: An augmented Lagrangian method for distributed optimization. Math. Program. 152(1), 405–434 (2014)

  33. Tappenden, R., Richtárik, P., Büke, B.: Separable approximations and decomposition methods for the augmented Lagrangian. Optim. Methods Softw. 30(3), 643–668 (2015)

  34. Mulvey, J., Ruszczyński, A.: A diagonal quadratic approximation method for large scale linear programs. Oper. Res. Lett. 12(4), 205–215 (1992)

  35. Ruszczyński, A.: On convergence of an augmented Lagrangian decomposition method for sparse convex optimization. Math. Oper. Res. 20(3), 634–656 (1995)

  36. Kiwiel, K., Rosa, C., Ruszczyński, A.: Proximal decomposition via alternating linearization. SIAM J. Optim. 9(3), 668–689 (1999)

  37. Lin, X., Pham, M., Ruszczyński, A.: Alternating linearization for structured regularization problems. J. Mach. Learn. Res. 15, 3447–3481 (2014)

  38. Chen, G., Teboulle, M.: A proximal-based decomposition method for convex minimization problems. Math. Program. 64, 81–101 (1994)

  39. He, B., Liao, L.Z., Han, D., Yang, H.: A new inexact alternating directions method for monotone variational inequalities. Math. Program. 92, 103–118 (2002)

  40. Boland, N., Christiansen, J., Dandurand, B., Eberhard, A., Linderoth, J., Luedtke, J., Oliveira, F.: Progressive hedging with a Frank–Wolfe based method for computing stochastic mixed-integer programming Lagrangian dual bounds. Optimization Online (2016). http://www.optimization-online.org/DB_HTML/2016/03/5391.html

  41. Eckstein, J., Bertsekas, D.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55(1–3), 293–318 (1992)

  42. Eckstein, J., Yao, W.: Understanding the convergence of the alternating direction method of multipliers: theoretical and computational perspectives. Tech. rep., Rutgers University, New Brunswick (2014)

  43. Mahey, P., Oualibouch, S., Tao, P.D.: Proximal decomposition on the graph of a maximal monotone operator. SIAM J. Optim. 5(2), 454–466 (1995)

  44. Feizollahi, M.J., Costley, M., Ahmed, S., Grijalva, S.: Large-scale decentralized unit commitment. Int. J. Electr. Power Energy Syst. 73, 97–106 (2015)

  45. Gade, D., Hackebeil, G., Ryan, S.M., Watson, J.P., Wets, R.J.B., Woodruff, D.L.: Obtaining lower bounds from the progressive hedging algorithm for stochastic mixed-integer programs. Math. Program. 157(1), 47–67 (2016)

  46. Rockafellar, R.T.: Augmented Lagrangians and applications of the proximal point algorithm in convex programming. Math. Oper. Res. 1(2), 97–116 (1976)

  47. Rockafellar, R.: Convex Analysis. Princeton University Press, Princeton (1970)

  48. Hathaway, R.J., Bezdek, J.C.: Grouped coordinate minimization using Newton’s method for inexact minimization in one vector coordinate. J. Optim. Theory Appl. 71(3), 503–516 (1991)

  49. Kiwiel, K.C.: Approximations in proximal bundle methods and decomposition of convex programs. J. Optim. Theory Appl. 84(3), 529–548 (1995)

  50. Bodur, M., Dash, S., Günlük, O., Luedtke, J.: Strengthened Benders cuts for stochastic integer programs with continuous recourse (2014). http://www.optimization-online.org/DB_FILE/2014/03/4263.pdf. Last accessed 13 Jan 2015

  51. Ahmed, S., Garcia, R., Kong, N., Ntaimo, L., Parija, G., Qiu, F., Sen, S.: SIPLIB: a stochastic integer programming test problem library (2015). http://www.isye.gatech.edu/sahmed/siplib

  52. Ntaimo, L.: Decomposition algorithms for stochastic combinatorial optimization: computational experiments and extensions. Ph.D. thesis (2004)

  53. The MathWorks, Natick: MATLAB 2012b (2014)

  54. IBM Corporation: IBM ILOG CPLEX Optimization Studio CPLEX User’s Manual. http://www.ibm.com/support/knowledgecenter/en/SSSA5P_12.6.1/ilog.odms.studio.help/pdf/usrcplex.pdf. Last accessed 22 Aug 2016

  55. IBM Corporation: IBM ILOG CPLEX V12.5. http://www-01.ibm.com/software/commerce/optimization/cplex-optimizer/. Last accessed 28 Jan 2016

  56. COmputational INfrastructure for Operations Research (COIN-OR). http://www.coin-or.org/. Last accessed 28 Jan 2016

  57. National Computing Infrastructure (NCI): NCI website. http://www.nci.org.au. Last accessed 19 Nov 2016

  58. Gertz, E., Wright, S.: Object-oriented software for quadratic programming. ACM Trans. Math. Softw. 29(1), 58–81 (2003)

  59. Lubin, M., Petra, C., Anitescu, M., Zavala, V.: Scalable stochastic optimization of complex energy systems. In: Proceedings of the 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 64:1–64:10. ACM, Seattle, WA (2011)

  60. Clarke, F.: Optimization and Nonsmooth Analysis. Society for Industrial and Applied Mathematics, Philadelphia (1990)

Author information

Corresponding author

Correspondence to Andrew Eberhard.

Additional information

This work was supported by the Australian Research Council (ARC) Grant ARC DP140100985.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 87 KB)

A Technical lemmas for establishing optimal convergence of SDM-GS

Given initial \((x^0,z^0) \in X \times Z \subset \mathbb {R}^n \times \mathbb {R}^q\), we consider the generation of the sequence \(\left\{ (x^k,z^k) \right\} \) with iterations computed using Algorithm 4, whose target problem is given by

$$\begin{aligned} \min _{x,z} \left\{ F(x,z) : x \in X, z \in Z \right\} , \end{aligned}$$
(32)

where \((x,z) \mapsto F(x,z)\) is convex and continuously differentiable over \(X \times Z\), the sets \(X\) and \(Z\) are closed and convex, \(X\) is bounded, and \(z \mapsto F(x,z)\) is inf-compact for each \(x \in X\).

We define the directional derivative with respect to x as

$$\begin{aligned} F_x'(x,z;d) := \lim _{\alpha \downarrow 0} \frac{F(x+\alpha d,z) - F(x,z)}{\alpha }. \end{aligned}$$
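Because \(F\) is continuously differentiable, this one-sided limit reduces to the gradient inner product \(F_x'(x,z;d) = \nabla _x F(x,z)^\top d\), a fact used in Remark 7 below. The following minimal sketch checks this identity numerically for an illustrative quadratic \(F\) of our own choosing (not from the paper):

```python
import numpy as np

# Illustrative smooth convex choice: F(x, z) = 0.5 * ||x - z||^2.
def F(x, z):
    return 0.5 * np.sum((x - z) ** 2)

def grad_x_F(x, z):
    return x - z  # gradient of F with respect to the x-block

def dir_deriv_fd(x, z, d, alpha=1e-7):
    # Forward-difference approximation of the defining limit as alpha -> 0+.
    return (F(x + alpha * d, z) - F(x, z)) / alpha

x, z, d = np.array([1.0, 2.0]), np.zeros(2), np.array([-1.0, 0.0])
print(dir_deriv_fd(x, z, d))   # approximately -1.0
print(grad_x_F(x, z) @ d)      # exactly -1.0 = F_x'(x, z; d)
```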

Of interest is the satisfaction of the following local stationarity condition at \(x \in X\):

$$\begin{aligned}&F_x'({x},{z};d) \ge 0 \quad \text {for all}\; d \in X - \left\{ {x} \right\} \end{aligned}$$
(33)

for any limit point \((x,z)=(\bar{x},\bar{z})\) of some sequence \(\left\{ (x^k,z^k) \right\} \) of feasible solutions to problem (32). For the sake of nontriviality, we shall assume that the x-stationarity condition (33) never holds at \((x,z)=(x^k,z^k)\) for any \(k \ge 0\). Thus, for each \(x^k\), \(k \ge 0\), there always exists a \(d^k \in X-\left\{ x^k \right\} \) for which \(F_x'(x^k,z^k;d^k) < 0\).

Direction Assumptions (DAs) For each iteration \(k \ge 0\), given \(x^k \in X\) and \(z^k \in Z\), we have \(d^k\) chosen so that (1) \(x^k + d^k \in X\); and (2) \(F_x'(x^k,z^k;d^k) < 0\).

Gradient Related Assumption (GRA) Given a sequence \(\left\{ (x^k,z^k) \right\} \) with \(\lim _{k \rightarrow \infty } (x^k,z^k) = (\overline{x},\overline{z})\) and a bounded sequence \(\left\{ d^k \right\} \) of directions, the existence of a direction \(\overline{d} \in X - \left\{ \overline{x} \right\} \) such that \(F_{x}'(\overline{x},\overline{z};\overline{d}) < 0\) implies that

$$\begin{aligned} \limsup _{k \rightarrow \infty } F_{x}'\left( x^k,z^k;d^k\right) < 0. \end{aligned}$$
(34)

In this case, we say that \(\left\{ d^k \right\} \) is gradient related to \(\left\{ x^k \right\} \). This gradient related condition is similar to the one defined in [3]. The sequence of directions \(\left\{ d^k \right\} \) is typically gradient related to \(\left\{ x^k \right\} \) by construction (see Lemma 7).

To state the last assumption, we require the notion of an Armijo rule step length \(\alpha ^k \in (0,1]\) given \((x^k,z^k,d^k)\) and parameters \(\beta ,\sigma \in (0,1)\).

[Algorithm 4: Armijo rule step length computation; rendered as a figure in the original article]
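
Since the algorithm itself is not reproduced here, the following is a minimal backtracking sketch of an Armijo rule consistent with the surrounding description and with inequality (35) below; the function names and loop structure are our illustrative assumptions, not the paper's exact Algorithm 4.

```python
def armijo_step(F, grad_x_F, x, z, d, beta=0.5, sigma=1e-4):
    """Backtracking Armijo rule: shrink alpha by the factor beta until the
    sufficient-decrease test (cf. (35)) holds. Assumes F_x'(x,z;d) < 0 (DA)."""
    slope = grad_x_F(x, z) @ d   # equals F_x'(x, z; d) for smooth F
    alpha = 1.0
    # Terminates after finitely many shrinkages when slope < 0 (cf. Remark 6).
    while F(x + alpha * d, z) - F(x, z) > sigma * alpha * slope:
        alpha *= beta
    return alpha   # alpha in (0, 1], satisfying (35)
```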

Remark 6

Under mild assumptions on F, such as continuity, that guarantee the existence of a finite \(F_x'(x,z;d)\) for all \((x,z,d) \in \left\{ (x,z,d) : x \in X, d \in X-\left\{ x \right\} , z \in Z \right\} \), we may assume that the while loop of Lines 3–5 terminates after a finite number of iterations. Thus, we have \(\alpha ^k \in (0,1]\) for each \(k \ge 1\).

The last significant assumption is stated as follows.

Sufficient Decrease Assumption (SDA) For sequences \(\left\{ (x^k,z^k,d^k) \right\} \) and step lengths \(\left\{ \alpha ^k \right\} \) computed according to Algorithm 4, we assume, for each \(k \ge 0\), that \((x^{k+1},z^{k+1})\) satisfies

$$\begin{aligned} F\left( x^{k+1},z^{k+1}\right) \le F\left( x^k + \alpha ^k d^k,z^k\right) . \end{aligned}$$
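
One illustrative way to satisfy this assumption (our reading, not the paper's SDM-GS iteration verbatim) is to accept the Armijo point in the x-block and then minimize exactly over the z-block, which can only decrease \(F\) further. A sketch reusing `armijo_step` from above, with the direction \(d\) supplied, e.g., by (40) below:

```python
def gauss_seidel_pass(F, grad_x_F, argmin_z, x, z, d):
    """One block pass: Armijo x-update, then exact z-update over Z."""
    alpha = armijo_step(F, grad_x_F, x, z, d)
    x_new = x + alpha * d
    z_new = argmin_z(x_new)   # a solver for argmin of F(x_new, .) over Z
    # Since z_new minimizes F(x_new, .), F(x_new, z_new) <= F(x_new, z),
    # which is the SDA inequality with x_new = x + alpha * d.
    return x_new, z_new
```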

Lemma 6

For problem (32), let \(F : \mathbb {R}^{n_x} \times \mathbb {R}^{n_z} \mapsto \mathbb {R}\) be convex and continuously differentiable, \(X \subset \mathbb {R}^{n_x}\) convex and compact, and \(Z \subseteq \mathbb {R}^{n_z}\) closed and convex. Furthermore, assume for each \(x \in X\) that \(z \mapsto F(x,z)\) is inf-compact. If a sequence \(\left\{ (x^k,z^k,d^k) \right\} \) satisfies the DA, the GRA, and the SDA for some fixed \(\beta ,\sigma \in (0,1)\), then the sequence \(\left\{ (x^k,z^k) \right\} \) has limit points \((\overline{x},\overline{z})\), each of which satisfies the stationarity condition (33).

Proof

The existence of limit points \((\overline{x},\overline{z})\) follows from the compactness of X, the inf-compactness of \(z \mapsto F(x,z)\) for each \(x \in X\), and the SDA. In generating \(\left\{ \alpha ^k \right\} \) according to the Armijo rule as implemented in Lines 2–5 of Algorithm 4, we have

$$\begin{aligned} \frac{F\left( x^k + \alpha ^k d^k,z^k\right) - F\left( x^k,z^k\right) }{\alpha ^k} \le \sigma F_x'\left( x^k,z^k ; d^k\right) . \end{aligned}$$
(35)

By the DA, \(F_x'(x^k,z^k ; d^k) < 0\), and since \(\alpha ^k > 0\) for each \(k \ge 1\) by Remark 6, we infer from (35) that \(F(x^k + \alpha ^k d^k,z^k) < F(x^k,z^k)\). By construction, we have \(F(x^{k+1},z^{k+1}) \le F(x^k + \alpha ^k d^k,z^k) < F(x^k,z^k)\). By this monotonicity and the boundedness of F from below on \(X \times Z\), we have \(\lim _{k \rightarrow \infty } F(x^k,z^k) = \bar{F} > -\infty \). Therefore,

$$\begin{aligned} \lim _{k \rightarrow \infty } F\left( x^{k+1},z^{k+1}\right) - F\left( x^k,z^k\right) = 0, \end{aligned}$$

which implies

$$\begin{aligned} \lim _{k \rightarrow \infty } F\left( x^k + \alpha ^k d^k,z^k\right) - F\left( x^k,z^k\right) = 0. \end{aligned}$$
(36)

We assume, for the sake of contradiction, that \(\lim _{k \rightarrow \infty } (x^k,z^k) = (\overline{x},\overline{z})\) does not satisfy the stationarity condition (33). By the GRA, \(\left\{ d^k \right\} \) is gradient related to \(\left\{ x^k \right\} \); that is,

$$\begin{aligned} \limsup _{k \rightarrow \infty } F_x'\left( x^k,z^k ; d^k\right) < 0. \end{aligned}$$
(37)

Thus, it follows from (35)–(37) that \(\lim _{k \rightarrow \infty } \alpha ^k = 0\).

Consequently, there exists \(\bar{k}\) such that \(\bar{\alpha }^k := \alpha ^k / \beta \le 1\) for all \(k \ge \bar{k}\). For such k, the trial step length \(\bar{\alpha }^k\) was rejected by the Armijo test in Algorithm 4 (otherwise the backtracking would have stopped earlier), and so we have

$$\begin{aligned} \sigma F_x'\left( x^k,z^k ; d^k\right) < \frac{F\left( x^k + \bar{\alpha }^k d^k,z^k\right) - F\left( x^k,z^k\right) }{\bar{\alpha }^k}. \end{aligned}$$
(38)

Since F is continuously differentiable, the mean value theorem may be applied to the right-hand side of (38) to get

$$\begin{aligned} \sigma F_x'\left( x^k,z^k ; d^k\right) < F_x'\left( x^k +\widetilde{\alpha }^k d^k, z^k ; d^k\right) , \end{aligned}$$
(39)

for some \(\widetilde{\alpha }^k \in [0,\overline{\alpha }^k]\).

Again using (37) and the compactness of \(X - X\), we may take a limit point \(\overline{d}\) of \(\left\{ d^k \right\} \), with associated subsequence index set \(\mathcal {K}\), such that \(F_x'(\overline{x},\overline{z};\overline{d}) < 0\). Taking limits over the subsequence indexed by \(\mathcal {K}\), we have \(\lim _{k \rightarrow \infty , k \in \mathcal {K}} F_x'(x^k,z^k ; d^k) = F_x'(\overline{x},\overline{z} ; \overline{d})\) and \(\lim _{k \rightarrow \infty , k \in \mathcal {K}} F_x'(x^k +\widetilde{\alpha }^k d^k, z^k ; d^k) = F_x'(\overline{x},\overline{z} ; \overline{d})\), where the second limit also uses \(\widetilde{\alpha }^k \in [0,\overline{\alpha }^k]\) and \(\overline{\alpha }^k \rightarrow 0\). These two limits hold since (1) \((x,z) \mapsto F_x'(x,z;d)\) is continuous for each \(d \in X - X\) and (2) \(d \mapsto F_x'(x,z;d)\) is locally Lipschitz continuous for each \((x,z) \in X \times Z\) (e.g., Proposition 2.1.1 of [60]); these two facts together imply that \((x,z,d) \mapsto F_x'(x,z;d)\) is continuous. In the limit as \(k \rightarrow \infty \), \(k \in \mathcal {K}\), inequality (39) then becomes

$$\begin{aligned} \sigma F_x'(\overline{x},\overline{z} ; \overline{d}) \le F_x'(\overline{x},\overline{z} ; \overline{d}) \quad \Longrightarrow \quad 0 \le (1-\sigma ) F_x'(\overline{x},\overline{z} ; \overline{d}). \end{aligned}$$

Since \((1-\sigma ) > 0\) and \(F_x'(\overline{x},\overline{z} ; \overline{d})<0\), the right-hand side is negative, a contradiction. Thus, \((\overline{x},\overline{z})\) must satisfy the stationarity condition (33). \(\square \)

Remark 7

Noting that \(F_x'(x^k,z^k;d^k) = \nabla _x F(x^k,z^k) ^\top d^k\) under the assumption of continuous differentiability of F, one means of constructing \(\left\{ d^k \right\} \) is as follows:

$$\begin{aligned} d^k \leftarrow {{\mathrm{argmin}}}_d \left\{ \nabla _x F(x^k,z^k)^\top d : d \in X-\left\{ x^k \right\} \right\} . \end{aligned}$$
(40)
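
When \(X\) is a polytope \(\left\{ y : Ay \le b \right\} \), (40) is a linear program: the linear minimization oracle familiar from Frank–Wolfe-type methods [19, 20]. A minimal sketch under that polyhedral assumption (the data `A`, `b` and the use of `scipy` are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def fw_direction(grad_x, x, A, b):
    """Solve (40): minimize grad_x . d over d in X - {x}, by minimizing
    grad_x . y over y in X = {y : A y <= b} and setting d = y - x."""
    res = linprog(c=grad_x, A_ub=A, b_ub=b,
                  bounds=[(None, None)] * len(x))  # linprog defaults to y >= 0
    if not res.success:
        raise RuntimeError(res.message)
    return res.x - x   # x + d = y lies in X, as the DA requires
```

By construction, \(\nabla _x F(x^k,z^k)^\top d^k \le 0\) (the choice \(d = 0\) is feasible), with strict inequality whenever \(x^k\) is not x-stationary, so the DA holds; Lemma 7 below shows that the GRA holds as well.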

Lemma 7

Given a sequence \(\left\{ (x^k,z^k) \right\} \) with \(\lim _{k \rightarrow \infty } (x^k,z^k) = (\overline{x},\overline{z})\), let each \(d^k\), \(k \ge 1\), be generated as in (40). Then \(\left\{ d^k \right\} \) is gradient related to \(\left\{ x^k \right\} \).

Proof

By the construction of \(d^k\), \(k \ge 1\), we have

$$\begin{aligned} F_x'(x^k,z^k ; d^k) \le F_x'(x^k,z^k ; d) \quad \forall \; d \in X-\left\{ x^k \right\} . \end{aligned}$$

Taking the limit, we have

$$\begin{aligned} \limsup _{k \rightarrow \infty } F_x'(x^k, z^k ; d^k) \le \limsup _{k \rightarrow \infty } F_x'(x^k, z^k ; d) \le F_x'(\overline{x}, \overline{z}; d) \quad \forall \; d \in X-\left\{ \overline{x} \right\} , \end{aligned}$$

where the last inequality follows from the upper semicontinuity of the function \((x,z,d) \mapsto F_x'(x,z;d)\), which holds in our setting due primarily to Proposition 2.1.1(b) of [60], given that F is assumed to be convex and continuous on \(\mathbb {R}^n\). Taking

$$\begin{aligned} \overline{d} \in {{\mathrm{argmin}}}_d \left\{ F_x'(\overline{x},\overline{z} ; d) : d \in X - \left\{ \overline{x} \right\} \right\} , \end{aligned}$$

we have by the assumed nonstationarity that \(F_x'(\overline{x},\overline{z};\overline{d}) < 0\). Thus, \(\limsup _{k \rightarrow \infty } F_x'(x^k, z^k ; d^k) < 0\), and so GRA holds. \(\square \)

Cite this article

Boland, N., Christiansen, J., Dandurand, B. et al. A parallelizable augmented Lagrangian method applied to large-scale non-convex-constrained optimization problems. Math. Program. 175, 503–536 (2019). https://doi.org/10.1007/s10107-018-1253-9
