Large-scale quasi-Newton trust-region methods with low-dimensional linear equality constraints

Brust, Johannes J.; Marcia, Roummel F.; Petra, Cosmin G.

doi:10.1007/s10589-019-00127-4

Published: 05 September 2019

Computational Optimization and Applications, Volume 74, pages 669–701 (2019)


Abstract

We propose two limited-memory BFGS (L-BFGS) trust-region methods for large-scale optimization with linear equality constraints. The methods are intended for problems where the number of equality constraints is small. By exploiting the structure of the quasi-Newton compact representation, both proposed methods solve the trust-region subproblems nearly exactly, even for large problems. We derive theoretical global convergence results of the proposed algorithms, and compare their numerical effectiveness and performance on a variety of large-scale problems.

Author information

Authors and Affiliations

Argonne National Laboratory, Lemont, IL, USA
Johannes J. Brust
University of California Merced, Merced, CA, USA
Roummel F. Marcia
Lawrence Livermore National Laboratory, Livermore, CA, USA
Cosmin G. Petra

Corresponding author

Correspondence to Johannes J. Brust.

Additional information

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344. R. Marcia’s research is partially supported by NSF Grant IIS 1741490. C. Petra also acknowledges support from the LDRD Program of Lawrence Livermore National Laboratory under Projects 16-ERD-025 and 17-SI-005.

J. J. Brust was formerly at University of California Merced, Merced, CA.

Appendix A

Notation

Section 2: Background

\(\mathbf{s}_{k-1} = \mathbf{x}_k - \mathbf{x}_{k-1}\)
\(\mathbf{S}_k = [\, \mathbf{s}_{k-l} \ \cdots \ \mathbf{s}_{k-1} \,]\)
\(\mathbf{y}_{k-1} = \nabla f(\mathbf{x}_k) - \nabla f(\mathbf{x}_{k-1})\)
\(\mathbf{Y}_k = [\, \mathbf{y}_{k-l} \ \cdots \ \mathbf{y}_{k-1} \,]\)
\(\mathbf{S}_k^T \mathbf{Y}_k = \mathbf{L}_k + \mathbf{T}_k\)
\(\mathbf{D}_k = \text{diag}(\mathbf{S}_k^T \mathbf{Y}_k)\)
\(\mathbf{B}_0^{(k)} = \gamma_k \mathbf{I}_n\)
\(\mathbf{H}_k = \mathbf{B}_k^{-1}\)
\(\gamma_k = \mathbf{y}_{k-1}^T \mathbf{y}_{k-1} / \mathbf{y}_{k-1}^T \mathbf{s}_{k-1}\)
\(\delta_k = 1/\gamma_k\)
\(\mathbf{B}_k = \gamma_k \mathbf{I}_n + \widehat{\boldsymbol{\Psi}}_k \widehat{\boldsymbol{\Xi}}_k \widehat{\boldsymbol{\Psi}}_k^T\)
\(\widehat{\boldsymbol{\Psi}}_k = [\, \mathbf{S}_k \ \ \mathbf{Y}_k \,]\)
\(\mathbf{H}_k = \delta_k \mathbf{I}_n + \widehat{\boldsymbol{\Psi}}_k \widehat{\mathbf{M}}_k \widehat{\boldsymbol{\Psi}}_k^T\)
\(\widehat{\boldsymbol{\Xi}}_k = \gamma_k \left[ \begin{array}{cc} -\mathbf{S}_k^T \mathbf{S}_k & -\mathbf{L}_k \\ -\mathbf{L}_k^T & \gamma_k \mathbf{D}_k \end{array} \right]^{-1}\)
\(\widehat{\mathbf{M}}_k = -(\gamma_k^2 \widehat{\boldsymbol{\Xi}}_k^{-1} + \gamma_k \widehat{\boldsymbol{\Psi}}_k^T \widehat{\boldsymbol{\Psi}}_k)^{-1}\)
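
The Section 2 notation can be checked numerically. The following is a minimal NumPy sketch, not the authors' implementation: it assumes that L_k is the strictly lower triangular part of S_k^T Y_k (as in the standard compact representation) and that the columns of S_k and Y_k are ordered oldest to newest. The helper names `lbfgs_compact` and `demo_pairs` and the quadratic test data are hypothetical. Dense matrices are formed only to verify the identities on a small problem.

```python
import numpy as np

def lbfgs_compact(S, Y):
    """Return gamma, Psi_hat, Xi_hat, M_hat for the compact L-BFGS forms."""
    s, y = S[:, -1], Y[:, -1]                # most recent pair (s_{k-1}, y_{k-1})
    gamma = (y @ y) / (y @ s)
    SY = S.T @ Y
    L = np.tril(SY, k=-1)                    # assumption: strictly lower part of S^T Y
    D = np.diag(np.diag(SY))                 # D_k = diag(S^T Y)
    Psi = np.hstack([S, Y])                  # Psi_hat = [S  Y]
    Xi_inv = np.block([[-S.T @ S, -L], [-L.T, gamma * D]]) / gamma
    Xi = np.linalg.inv(Xi_inv)
    M = -np.linalg.inv(gamma**2 * Xi_inv + gamma * (Psi.T @ Psi))
    return gamma, Psi, Xi, M

def demo_pairs(n=40, l=5, seed=0):
    """Hypothetical data: pairs from a convex quadratic, so s^T y > 0."""
    rng = np.random.default_rng(seed)
    G = rng.standard_normal((n, n))
    Q = G @ G.T + n * np.eye(n)              # symmetric positive definite Hessian
    S = rng.standard_normal((n, l))
    return S, Q @ S

S, Y = demo_pairs()
gamma, Psi, Xi, M = lbfgs_compact(S, Y)
n = S.shape[0]
B = gamma * np.eye(n) + Psi @ Xi @ Psi.T     # B_k in compact form
H = np.eye(n) / gamma + Psi @ M @ Psi.T      # H_k in compact form
print("||B H - I||      =", np.linalg.norm(B @ H - np.eye(n)))
print("||B s - y|| (last) =", np.linalg.norm(B @ S[:, -1] - Y[:, -1]))
```

In a large-scale setting only the small 2l-by-2l matrices would be formed, and products with B_k or H_k would be applied through Psi_hat rather than through dense n-by-n arrays.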

Section 3: Trust-Region Subproblem Solution without an Inequality Constraint

\(\mathbf{K} = \left[ \begin{array}{cc} \mathbf{B}_k & \mathbf{A}^T \\ \mathbf{A} & \mathbf{0} \end{array} \right]\)
\(\boldsymbol{\Omega}_k = (\mathbf{A} \mathbf{B}_k^{-1} \mathbf{A}^T)^{-1}\)
\(\boldsymbol{\Psi}_k = [\, \mathbf{A}^T \ \ \widehat{\boldsymbol{\Psi}}_k \,]\)
\(\mathbf{K}^{-1} = \left[ \begin{array}{cc} \mathbf{B}_k^{-1} - \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k \mathbf{A} \mathbf{B}_k^{-1} & \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k \\ (\mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k)^T & -\boldsymbol{\Omega}_k \end{array} \right]\)
\(\mathbf{V}_k = \mathbf{B}_k^{-1} - \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k \mathbf{A} \mathbf{B}_k^{-1}\)
\(\mathbf{V}_k = \delta_k \mathbf{I}_n + \boldsymbol{\Psi}_k \mathbf{M}_k \boldsymbol{\Psi}_k^T\)
\(\mathbf{W}_k = \mathbf{B}_k^{-1} \mathbf{A}^T \boldsymbol{\Omega}_k\)
\(\mathbf{M}_k = \left[ \begin{array}{cc} -\delta_k^2 \boldsymbol{\Omega}_k & -\delta_k \boldsymbol{\Omega}_k \mathbf{C}_k \\ -\delta_k \mathbf{C}_k^T \boldsymbol{\Omega}_k & \widehat{\mathbf{M}}_k - \mathbf{C}_k^T \boldsymbol{\Omega}_k \mathbf{C}_k \end{array} \right]\)
\(\mathbf{C}_k = \mathbf{A} \widehat{\boldsymbol{\Psi}}_k \widehat{\mathbf{M}}_k\)
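
The block formula for K^{-1} and the compact form of V_k can be compared against a dense inverse on a small problem. This is a sketch under stated assumptions rather than the paper's code: `compact_kkt_blocks` is a hypothetical helper, and Psi_hat and M_hat below are random stand-ins (with M_hat symmetric) chosen so that H_k = delta*I + Psi_hat M_hat Psi_hat^T is positive definite. They are not actual L-BFGS quantities.

```python
import numpy as np

def compact_kkt_blocks(A, Psi_hat, M_hat, delta):
    """Return (Psi, M, Omega, W) with V = delta*I + Psi @ M @ Psi.T."""
    APsi = A @ Psi_hat
    C = APsi @ M_hat                                   # C_k = A Psi_hat M_hat
    AHAt = delta * (A @ A.T) + APsi @ M_hat @ APsi.T   # A H_k A^T, an m x m matrix
    Omega = np.linalg.inv(AHAt)                        # Omega_k
    M = np.block([
        [-delta**2 * Omega, -delta * Omega @ C],
        [-delta * C.T @ Omega, M_hat - C.T @ Omega @ C],
    ])
    Psi = np.hstack([A.T, Psi_hat])                    # Psi_k = [A^T  Psi_hat]
    W = (delta * A.T + Psi_hat @ C.T) @ Omega          # W_k = H_k A^T Omega_k
    return Psi, M, Omega, W

# Hypothetical small test problem (stand-in data, not L-BFGS pairs).
rng = np.random.default_rng(1)
n, m, l = 30, 3, 4
A = rng.standard_normal((m, n))
Psi_hat = rng.standard_normal((n, 2 * l))
R = rng.standard_normal((2 * l, 2 * l))
M_hat = 0.01 * (R @ R.T)                               # symmetric, keeps H_k positive definite
delta = 0.5

Psi, M, Omega, W = compact_kkt_blocks(A, Psi_hat, M_hat, delta)
V = delta * np.eye(n) + Psi @ M @ Psi.T

# Dense reference: invert the full KKT matrix K directly.
H = delta * np.eye(n) + Psi_hat @ M_hat @ Psi_hat.T
B = np.linalg.inv(H)
K = np.block([[B, A.T], [A, np.zeros((m, m))]])
K_inv = np.linalg.inv(K)
print("||V - K_inv[11]|| =", np.linalg.norm(V - K_inv[:n, :n]))
print("||W - K_inv[12]|| =", np.linalg.norm(W - K_inv[:n, n:]))
```

Only m-by-m and 2l-by-2l systems are solved inside the helper, which is where a small number of equality constraints pays off.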

Section 4: Trust-Region Subproblem Solution with an \(\ell _2\)-Norm Inequality Constraint

\(\mathbf{H}_k(\sigma) = (\mathbf{B}_k + \sigma \mathbf{I})^{-1}\)
\(\mathbf{H}_k = \mathbf{H}_k(0)\)
\(\boldsymbol{\Phi}_k(\sigma) = \mathbf{I}_n - \mathbf{A}^T \boldsymbol{\Omega}_k(\sigma) \mathbf{A} \mathbf{H}_k(\sigma)\)
\(\boldsymbol{\Phi}_k = \boldsymbol{\Phi}_k(0)\)
\(\mathbf{H}_k(\sigma) = \frac{1}{\gamma_k + \sigma} \mathbf{I}_n + \widehat{\boldsymbol{\Psi}}_k \widehat{\mathbf{M}}_k(\sigma) \widehat{\boldsymbol{\Psi}}_k^T\)
\(\boldsymbol{\Omega}_k(\sigma) = (\mathbf{A} \mathbf{H}_k(\sigma) \mathbf{A}^T)^{-1}\)
\(\widehat{\mathbf{M}}_k(\sigma) = -\big( (\gamma_k + \sigma)^2 \widehat{\boldsymbol{\Xi}}_k^{-1} + (\gamma_k + \sigma) \widehat{\boldsymbol{\Psi}}_k^T \widehat{\boldsymbol{\Psi}}_k \big)^{-1}\)
\(\mathbf{V}_k(\sigma) = \mathbf{H}_k(\sigma) - \mathbf{H}_k(\sigma) \mathbf{A}^T \boldsymbol{\Omega}_k(\sigma) \mathbf{A} \mathbf{H}_k(\sigma)\)
\(\mathbf{V}_k(\sigma) = \mathbf{H}_k(\sigma) \boldsymbol{\Phi}_k(\sigma)\)
\(\mathbf{s}(\sigma) = -\mathbf{H}_k(\sigma) \boldsymbol{\Phi}_k(\sigma) \mathbf{g}_k\)
\(\mathbf{s}'(\sigma) = -\mathbf{H}_k(\sigma) \boldsymbol{\Phi}_k(\sigma) \mathbf{s}(\sigma)\)
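
The expressions for s(sigma) and s'(sigma) can be sanity-checked with a finite difference. The sketch below is illustrative only: `step_and_derivative` is a hypothetical helper, B is a small dense positive definite stand-in for B_k, and H_k(sigma) is formed by a dense inverse instead of the compact representation listed above.

```python
import numpy as np

def step_and_derivative(B, A, g, sigma):
    """Return s(sigma) and s'(sigma) using the Section 4 notation."""
    n = B.shape[0]
    H = np.linalg.inv(B + sigma * np.eye(n))      # H_k(sigma), dense stand-in
    Omega = np.linalg.inv(A @ H @ A.T)            # Omega_k(sigma)
    Phi = np.eye(n) - A.T @ Omega @ A @ H         # Phi_k(sigma)
    s = -H @ Phi @ g                              # s(sigma)
    ds = -H @ Phi @ s                             # s'(sigma)
    return s, ds

# Hypothetical small test problem.
rng = np.random.default_rng(2)
n, m = 20, 2
G = rng.standard_normal((n, n))
B = G @ G.T / n + np.eye(n)                       # symmetric positive definite
A = rng.standard_normal((m, n))
g = rng.standard_normal(n)
sigma, h = 0.7, 1e-6

s, ds = step_and_derivative(B, A, g, sigma)
s_plus, _ = step_and_derivative(B, A, g, sigma + h)
s_minus, _ = step_and_derivative(B, A, g, sigma - h)
ds_fd = (s_plus - s_minus) / (2 * h)              # central finite difference

print("||A s||          =", np.linalg.norm(A @ s))
print("||s' - s'_fd||   =", np.linalg.norm(ds - ds_fd))
print("d/dsigma ||s||^2 =", 2 * s @ ds)
```

Because s'(sigma) = -V_k(sigma) s(sigma) with V_k(sigma) symmetric positive semidefinite, the last printed quantity is nonpositive, so the step length does not grow as sigma increases. That monotonicity is what a one-dimensional root-finding iteration in sigma would rely on.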

Section 5: Trust-Region Subproblem Solution with a Shape-Changing Norm Inequality Constraint

\(\mathbf{U}_k = -\boldsymbol{\Psi}_k \mathbf{M}_k \boldsymbol{\Psi}_k^T\)
\(\mathbf{A}^T = \mathbf{Q}_1 \mathbf{R}_1\)
\(\mathbf{Q}_1 \mathbf{Q}_1^T = \mathbf{A}^T (\mathbf{A} \mathbf{A}^T)^{-1} \mathbf{A}\)
\(\mathbf{P} = \mathbf{I}_n - \mathbf{A}^T (\mathbf{A} \mathbf{A}^T)^{-1} \mathbf{A}\)
\(\mathbf{P} \widehat{\boldsymbol{\Psi}}_k = \widehat{\mathbf{Q}}_2 \widehat{\mathbf{R}}_2\)
\(\widehat{\mathbf{V}}_2 \widehat{\boldsymbol{\Lambda}}_k \widehat{\mathbf{V}}_2^T = \widehat{\mathbf{R}}_2 (\widehat{\mathbf{M}}_k - \mathbf{C}_k^T \boldsymbol{\Omega}_k \mathbf{C}_k) \widehat{\mathbf{R}}_2^T\)
\(\mathbf{Q}_2 = \widehat{\mathbf{Q}}_2 \widehat{\mathbf{V}}_2\)
\(\mathbf{Q} = [\, \mathbf{Q}_1 \ \ \mathbf{Q}_2 \ \ \mathbf{Q}_3 \,]\)
\(\mathbf{Q}_{\parallel} = [\, \mathbf{Q}_1 \ \ \mathbf{Q}_2 \,]\)
\(\mathbf{Q}_{\perp} = \mathbf{Q}_3\)
\(\mathbf{z} = \left[ \begin{array}{c} \mathbf{z}_1 \\ \mathbf{z}_2 \\ \mathbf{z}_3 \end{array} \right]\)
\(\mathbf{s} = \mathbf{Q} \mathbf{z}\)
\(\mathbf{z}_{\parallel} = \mathbf{z}_2 = \mathbf{Q}_2^T \mathbf{s}\)
\(\mathbf{z}_{\perp} = \mathbf{z}_3 = \mathbf{Q}_3^T \mathbf{s}\)
\(\mathbf{g}_{\parallel} = \mathbf{Q}_2^T \mathbf{g}_k\)
\(\mathbf{g}_{\perp} = \mathbf{Q}_{\perp}^T \mathbf{g}_k\)
\(\mathbf{V}_k = \mathbf{Q} \boldsymbol{\Lambda} \mathbf{Q}^T = [\, \mathbf{Q}_1 \ \ \mathbf{Q}_2 \ \ \mathbf{Q}_3 \,] \left[ \begin{array}{ccc} \mathbf{0} & & \\ & \delta_k \mathbf{I} - \widehat{\boldsymbol{\Lambda}}_k & \\ & & \delta_k \mathbf{I} \end{array} \right] \left[ \begin{array}{c} \mathbf{Q}_1^T \\ \mathbf{Q}_2^T \\ \mathbf{Q}_3^T \end{array} \right]\)
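
Two structural facts in the Section 5 notation are easy to confirm numerically: the thin QR factor Q_1 of A^T gives the orthogonal projector onto the range of A^T, and V_k vanishes on that range, which matches the zero block in the factorization of V_k. The sketch below is a rough check under assumptions: H is a random positive definite stand-in for H_k, and the remaining blocks (Q_2, Q_3 and the eigenvalues) are not computed here.

```python
import numpy as np

# Hypothetical small test problem.
rng = np.random.default_rng(3)
n, m = 12, 3
A = rng.standard_normal((m, n))

# Thin QR of A^T: Q1 is n x m with orthonormal columns, R1 is m x m.
Q1, R1 = np.linalg.qr(A.T)
proj_range = A.T @ np.linalg.solve(A @ A.T, A)    # A^T (A A^T)^{-1} A
P = np.eye(n) - proj_range                        # projector onto null(A)

# Stand-in for H_k: the inverse of a symmetric positive definite matrix.
G = rng.standard_normal((n, n))
H = np.linalg.inv(G @ G.T / n + np.eye(n))
Omega = np.linalg.inv(A @ H @ A.T)
V = H - H @ A.T @ Omega @ A @ H                   # V_k from Section 3

print("||Q1 Q1^T - A^T (A A^T)^{-1} A|| =", np.linalg.norm(Q1 @ Q1.T - proj_range))
print("||V Q1||                         =", np.linalg.norm(V @ Q1))
print("||P V P - V||                    =", np.linalg.norm(P @ V @ P - V))
```

Since V_k Q_1 = 0, any step of the form s = -V_k g_k has z_1 = Q_1^T s = 0, so it stays in the null space of A.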

About this article

Cite this article

Brust, J.J., Marcia, R.F. & Petra, C.G. Large-scale quasi-Newton trust-region methods with low-dimensional linear equality constraints. Comput Optim Appl 74, 669–701 (2019). https://doi.org/10.1007/s10589-019-00127-4

Received: 31 July 2018
Published: 05 September 2019
Issue Date: December 2019
DOI: https://doi.org/10.1007/s10589-019-00127-4
