Parallelized hybrid optimization methods for nonsmooth problems using NOMAD and linesearch

Abstract

Two parallelized hybrid methods are presented for single-function optimization problems with side constraints. The problems are difficult not only because of the possible existence of local minima and the nonsmoothness of functions, but also because objective function and constraint values for a solution vector can only be obtained by querying a black box whose execution requires considerable computational effort. Examples are optimization problems in engineering where objective function and constraint values are computed via complex simulation programs, where local minima exist, and where smoothness of functions is not assured. The hybrid methods combine the well-known method NOMAD with two new methods, called DENCON and DENPAR, that are based on the linesearch scheme CS-DFN. For each query, the hybrid methods compute a set of solution vectors that are evaluated in parallel. The hybrid methods have been tested on a set of difficult optimization problems produced by a certain seeding scheme for multiobjective optimization. We compare computational results with those obtained by NOMAD, DENCON, and DENPAR as stand-alone methods. It turns out that among the stand-alone methods, NOMAD is significantly better than DENCON and DENPAR. However, the hybrid methods are definitely better than NOMAD.

References

  • Abramson MA, Audet C, Couture G, Dennis Jr JE, Le Digabel S, Tribes C (2014) The NOMAD project. http://www.gerad.ca/nomad

  • Abramson MA, Audet C, Dennis JE Jr, Le Digabel S (2009) OrthoMADS: a deterministic MADS instance with orthogonal directions. SIAM J Optim 20(2):948–966

  • Audet C, Dennis JE Jr (2006) Mesh adaptive direct search algorithms for constrained optimization. SIAM J Optim 17(1):188–217

  • Audet C, Dennis JE Jr, Le Digabel S (2008) Parallel space decomposition of the mesh adaptive direct search algorithm. SIAM J Optim 19(3):1150–1170

  • Bratley P, Fox B (1988) Algorithm 659: implementing Sobol’s quasirandom sequence generator. ACM Trans Math Softw 14(1):88–100

  • Clarke FH (1983) Optimization and nonsmooth analysis. Wiley, New York

  • Dennis JE Jr, Torczon V (1991) Direct search methods on parallel machines. SIAM J Optim 1(4):448–474

  • Di Pillo G, Grippo L, Lucidi S (1993) A smooth method for the finite minimax problem. Math Program 60:187–214

  • Fasano G, Liuzzi G, Lucidi S, Rinaldi F (2014) A linesearch-based derivative-free approach for nonsmooth constrained optimization. SIAM J Optim 24(3):959–992

  • García-Palomares UM, Rodríguez JF (2002) New sequential and parallel derivative-free algorithms for unconstrained minimization. SIAM J Optim 13(1):79–96

  • García-Palomares UM, García-Urrea IJ, Rodríguez-Hernández PS (2013) On sequential and parallel non-monotone derivative-free algorithms for box constrained optimization. Optim Methods Softw 28(6):1233–1261

  • Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, Boston

  • Gray GA, Kolda TG (2006) Algorithm 856: APPSPACK 4.0: asynchronous parallel pattern search for derivative-free optimization. ACM Trans Math Softw 32(3):485–507

  • Griffin JD, Kolda TG, Lewis RM (2008) Asynchronous parallel generating set search for linearly constrained optimization. SIAM J Sci Comput 30(4):1892–1924

  • Grippo L, Lampariello F, Lucidi S (1986) A nonmonotone line search technique for Newton’s method. SIAM J Numer Anal 23(4):707–716

  • Halton JH (1960) On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numer Math 2:84–90

  • Hough PD, Kolda TG, Torczon VJ (2001) Asynchronous parallel pattern search for nonlinear optimization. SIAM J Sci Comput 23(1):134–156

  • Kolda TG (2005) Revisiting asynchronous parallel pattern search for nonlinear optimization. SIAM J Optim 16(2):563–586

  • Kolda TG, Torczon V (2004) On the convergence of asynchronous parallel pattern search. SIAM J Optim 14(4):939–964

  • Laguna M, Molina J, Pérez F, Caballero R, Hernández-Díaz AG (2009) The challenge of optimizing expensive black boxes: a scatter search/rough set theory approach. J Oper Res Soc 61:53–67

  • Le Digabel S (2011) Algorithm 909: NOMAD: nonlinear optimization with the MADS algorithm. ACM Trans Math Softw 37(4):1–15

  • Meza JC, Oliva RA, Hough PD, Williams PJ (2007) OPT++: an object-oriented toolkit for nonlinear optimization. ACM Trans Math Softw 33(2):12

  • Moré JJ, Wild SM (2009) Benchmarking derivative-free optimization algorithms. SIAM J Optim 20(1):172–191

  • Ponstein J (1967) Seven kinds of convexity. SIAM Rev 9(1):115–119

  • Shetty CM, Bazaraa MS (1979) Nonlinear programming: theory and algorithms. Wiley, New York

  • Sobol I (1977) Uniformly distributed sequences with an additional uniform property. USSR Comput Math Math Phys 16:236–242

  • Truemper K (in review) Simple seeding of evolutionary algorithms for hard multiobjective minimization problems

Acknowledgements

We are thankful to an anonymous reviewer for helpful comments and suggestions.

Author information

Corresponding author

Correspondence to G. Liuzzi.

Additional information

Communicated by Ernesto G. Birgin.

Appendices

Appendix A: More comparisons

We show that an iteration budget of 2000 evaluations generally is a good choice.

To start, we define six candidate iteration budgets, of sizes 500, 1000, 2000, 3000, 4000, and 5000. That choice is based on earlier runs involving hard engineering problems, where 1000 and 2000 turned out to be good choices.

We construct the graphs showing the data and performance profiles for NOMAD, NOMAD–DENCON, and NOMAD–DENPAR for the selected six iteration budgets. Space restrictions prevent inclusion of the graphs; they are available at http://www.iasi.cnr.it/~liuzzi/hybridDF/.

Table 6 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 500

From the graphs, we derive tables structured like Table 1, which summarize the relative sizes of the areas under the profile curves of the various graphs. The tables are included in the appendix as Tables 6, 7, 8, 9, 10 and 11. The percentages in the tables show that NOMAD always produces the smallest area, and thus has the worst performance, except for two cases in Table 6 where the problems with \(n=20\) variables are solved on 64 processors, and where NOMAD dominates NOMAD–DENCON but still is dominated by NOMAD–DENPAR.
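
The relative areas referenced above are obtained from the profile curves themselves. As a rough illustration (not the authors' post-processing code), the following Python sketch builds a simplified data profile in the spirit of Moré and Wild (2009) from the best objective values recorded during a run and computes the normalized area under it; the convergence tolerance and the array layout are illustrative assumptions.

```python
import numpy as np

def data_profile(history, f_best, tol=1e-3, max_evals=2000):
    """Fraction of problems solved within each evaluation budget
    (a simplified data profile in the spirit of More and Wild 2009)."""
    n_prob = len(history)
    solved_at = np.full(n_prob, np.inf)
    for p, h in enumerate(history):
        h = np.asarray(h)
        f0 = h[0]                                    # value at the starting point
        target = f_best[p] + tol * (f0 - f_best[p])  # convergence test
        hit = np.nonzero(h <= target)[0]
        if hit.size:
            solved_at[p] = hit[0] + 1                # evaluations needed to solve p
    budgets = np.arange(1, max_evals + 1)
    fraction = np.array([(solved_at <= b).mean() for b in budgets])
    return budgets, fraction

def profile_area(fraction):
    """Normalized area under the profile curve (unit-spaced budgets)."""
    return float(fraction.mean())
```

Comparing such areas across NOMAD, NOMAD–DENCON, and NOMAD–DENPAR yields relative sizes of the kind summarized in the tables.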

The results of Tables 6, 7, 8, 9, 10 and 11 are reassuring in the sense that the dominance of NOMAD–DENCON and NOMAD–DENPAR over NOMAD does not depend on a critical choice of the iteration budget. There is an intuitive explanation for these results. During each iteration, the selected method may terminate due to convergence conditions and thus may not reach the iteration budget. In such a case, a larger value of the iteration budget induces the same behavior. A corollary of this argument is that we should always prefer component methods that have well-justified convergence conditions, i.e., conditions based on a sound convergence analysis, as is the case for the three methods NOMAD, DENCON, and DENPAR selected here.

We interrupt the analysis of Tables 6, 7, 8, 9, 10 and 11 and look at the efficiency of the hybrid methods under parallelization, again considering the six iteration budgets. The relevant results are compiled in Table 3 for NOMAD–DENCON and in Table 4 for NOMAD–DENPAR. The interpretation is analogous to that for Table 2. For each problem subset, the efficiency ratios \(s/(64\cdot c)\) are very similar regardless of the iteration budget. For example, when the entire problem set is solved, the ratio ranges from 18 to 25% for NOMAD–DENCON and from 21 to 27% for NOMAD–DENPAR. These results indicate that the efficiency under parallelization is not very sensitive to the iteration budget.

After this general investigation of the impact of the iteration budget, we turn to the selection of the best iteration budget. For the selection, we compute the profile graphs and tables evaluating the performance of NOMAD–DENCON and NOMAD–DENPAR under the six iteration budgets. The graphs are omitted here; they are available at http://www.iasi.cnr.it/~liuzzi/hybridDF/. The graphs are summarized in Tables 12 and 13 for NOMAD–DENCON and in Tables 14 and 15 for NOMAD–DENPAR.

We analyze the tables. Table 12 contains the percentages for NOMAD–DENCON running on a single processor. The bold numbers, indicating maxima as before, occur mostly in the rows for an iteration budget of 2000. Thus, that budget is a good choice. Table 13, which covers NOMAD–DENCON on 64 processors, leads to the same conclusion.

Tables 14 and 15 do not lead to such clear-cut choices. Nevertheless, in Table 14 half of the bold entries occur in rows for the iteration budget of 2000, while Table 15 provides no significant insight. The reason becomes clear when we look at the corresponding graph, provided at http://www.iasi.cnr.it/~liuzzi/hybridDF/: the profile curves are bunched together. We therefore accept an iteration budget of 2000 as a reasonable choice, in line with the decisions deduced from the other tables.

We conclude that an iteration budget of 2000 is a reasonable choice for both NOMAD–DENCON and NOMAD–DENPAR. For that choice, Table 8 indicates improvement percentages of NOMAD–DENCON over NOMAD ranging from 32 to 68% when the entire problem set is solved on a single processor, and from 33 to 65% on 64 processors. The corresponding percentages for NOMAD–DENPAR are 12 to 33% on a single processor and 29 to 44% on 64 processors. Thus, NOMAD–DENCON is clearly better than NOMAD–DENPAR on a single processor as well as on 64 processors.

Table 7 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 1000
Table 8 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 2000
Table 9 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 3000
Table 10 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 4000
Table 11 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 5000
Table 12 NOMAD–DENCON on single processor, with various iter. budgets
Table 13 NOMAD–DENCON on 64 processors, with various iter. budgets
Table 14 NOMAD–DENPAR on single processor, with various iter. budgets
Table 15 NOMAD–DENPAR on 64 processors, with various iter. budgets

Appendix B: Convergence analysis for DENPAR

This appendix is devoted to the convergence analysis of Algorithm DENPAR. We consider problem (3), where the objective function is assumed to be quasi-convex (Shetty and Bazaraa 1979), i.e., for each \(x,y\in \mathbb {R}^n\),

$$\begin{aligned} f(\lambda x + (1-\lambda )y) \le \max \{f(x),f(y)\}, \qquad \text{ for } \text{ each }\ \lambda \in [0,1]. \end{aligned}$$
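
As a quick numerical illustration of this inequality (not taken from the paper), the following Python snippet checks the quasi-convexity condition along randomly sampled segments for a simple nonsmooth function; the choice of function and sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # nonsmooth but quasi-convex (in fact convex): the max-norm
    return np.max(np.abs(x))

# check f(lam*x + (1-lam)*y) <= max(f(x), f(y)) on random segments
for _ in range(1000):
    x, y = rng.normal(size=5), rng.normal(size=5)
    lam = rng.uniform()
    assert f(lam * x + (1 - lam) * y) <= max(f(x), f(y)) + 1e-12
print("quasi-convexity inequality held on all sampled segments")
```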

Algorithm DENPAR is based on DFN\(_{simple}\) from Fasano et al. (2014). We review the latter scheme, beginning with the definition of Clarke stationarity (see, e.g., Clarke 1983).

Definition B.1

(Clarke Stationarity) Given the unconstrained problem \(\min _{x\in \mathbb {R}^n} f(x)\), a point \(\bar{x}\) is a Clarke stationary point if \(0\in \partial f(\bar{x})\), where \(\partial f( x)=\{s\in \mathbb {R}^n : f^{Cl}( x; d)\ge d^Ts,\ \forall d \in \mathbb {R}^n \}\) is the generalized gradient of f at x, and

$$\begin{aligned} f^{Cl}(x; d) = \limsup _{\begin{subarray}{l}y\rightarrow x, t\downarrow 0\end{subarray}} \frac{f(y+t d) - f(y)}{t}. \end{aligned}$$
(7)
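
The limsup in (7) can be estimated numerically by sampling points y near x and small step sizes t > 0. The sketch below is only an illustration (the sampling radius and sample count are arbitrary assumptions); for f(x) = |x| at x = 0 with d = 1 the exact value is 1.

```python
import numpy as np

def clarke_dd(f, x, d, radius=1e-4, n_samples=2000, seed=0):
    """Crude Monte-Carlo estimate of the Clarke directional derivative:
    sup of (f(y + t*d) - f(y)) / t over y near x and small t > 0."""
    rng = np.random.default_rng(seed)
    best = -np.inf
    for _ in range(n_samples):
        y = x + radius * rng.uniform(-1.0, 1.0, size=np.shape(x))
        t = radius * rng.uniform(1e-3, 1.0)
        best = max(best, (f(y + t * d) - f(y)) / t)
    return best

f = lambda x: abs(float(x))
print(clarke_dd(f, 0.0, 1.0))   # close to the exact value 1
```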

We also need the definition of dense subsequences.

Definition B.2

(Dense subsequence) Let K be an infinite subset of indices (possibly \(K=\{0,1,\dots \}\)). The subsequence of normalized directions \(\{d_k\}_K\) is said to be dense in the unit sphere S(0, 1), if for any \(\bar{D}\in S(0,1)\) and for any \(\epsilon > 0\) there exists an index \(k\in K\) such that \(\Vert d_k-\bar{D}\Vert \le \epsilon \).
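
A standard way to obtain such a dense subsequence is to map a low-discrepancy sequence (e.g., Halton 1960 or Sobol 1977, both cited above) onto S(0, 1). The Python sketch below shows one such construction using SciPy; it is an illustrative assumption, not necessarily the direction generator used by DENCON or DENPAR.

```python
import numpy as np
from scipy.stats import norm, qmc

def dense_directions(n, how_many, seed=0):
    """Directions on the unit sphere S(0,1) in R^n obtained by mapping a
    Halton sequence through the inverse normal CDF and normalizing."""
    halton = qmc.Halton(d=n, scramble=True, seed=seed)
    u = halton.random(how_many)                     # points in (0,1)^n
    z = norm.ppf(u)                                 # map to R^n
    return z / np.linalg.norm(z, axis=1, keepdims=True)

dirs = dense_directions(n=20, how_many=5)
print(np.linalg.norm(dirs, axis=1))                 # all equal to 1.0
```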

Here is a summary of DFN\(_{simple}\) for the solution of Problem (3).

Algorithm DFN\(_{simple}\) (pseudocode figure)

In Algorithm DFN\(_{simple}\), a predefined sequence of search directions \(\{d_k\}\) is used. At iteration k, the behavior of the function f(x) along the direction \(d_k\) is investigated. If the direction (or its opposite) is deemed a good direction, in the sense that sufficient decrease can be obtained along it, then a sufficiently large step size is computed by means of the Expansion Step procedure. On the other hand, if neither \(d_k\) nor \(-d_k\) is a good direction, then the tentative step size is reduced by a constant factor.

Note that, in Algorithm DFN\(_{simple}\), considerable freedom is left for the selection of the next iterate \(x_{k+1}\) once the new point \(\tilde{x}_k\) has been computed. More specifically, the next iterate \(x_{k+1}\) is only required to satisfy inequality \(f(x_{k+1})\le f(\tilde{x}_k)\). This can trivially be satisfied by setting \(x_{k+1} \leftarrow \tilde{x}_k\). However, more sophisticated selection strategies can be implemented. For instance, \(x_{k+1}\) might be defined by minimizing suitable approximating models of the objective function, thus possibly improving the efficiency of the overall scheme. As we shall see, this freedom offered by DFN\(_{simple}\) is particularly useful for our purposes.
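
To make this structure concrete, here is a heavily simplified Python sketch of one such derivative-free linesearch along \(\pm d_k\): probe both directions for sufficient decrease, expand the step while the decrease persists, and shrink the tentative step otherwise. The parameter names (gamma, delta) and the doubling rule are illustrative assumptions, not the exact Expansion Step of DFN\(_{simple}\).

```python
import numpy as np

def linesearch_iteration(f, x, d, alpha, gamma=1e-6, delta=0.5, max_step=1e6):
    """One simplified derivative-free linesearch along +d and -d.
    Returns a trial point x_tilde and an updated tentative step size."""
    fx = f(x)
    for direction in (d, -d):
        if f(x + alpha * direction) <= fx - gamma * alpha**2:
            # sufficient decrease found: expand the step while it keeps paying off
            step = alpha
            while (step < max_step and
                   f(x + 2 * step * direction) <= fx - gamma * (2 * step)**2):
                step *= 2
            return x + step * direction, step
    # neither d nor -d yields sufficient decrease: shrink the tentative step
    return x, delta * alpha
```

In the full scheme, the trial point returned here plays the role of \(\tilde{x}_k\), and any \(x_{k+1}\) with \(f(x_{k+1})\le f(\tilde{x}_k)\) may then be accepted.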

Next we describe a parallelized version of DFN\(_{simple}\) called DEN\(_{check}\).

Algorithm DEN\(_{check}\)

Here is a summary of the algorithm.

Algorithm DEN\(_{check}\) (pseudocode figure)

At every iteration of DEN\(_{check}\), an orthonormal basis is formed starting from the given direction \(\hat{d}_k\). First, the behavior of the objective function along the directions \(d_k^1,\ldots ,d_k^n\) is investigated starting from the same point \(x_k\). This produces step sizes \(\alpha _k^i\ge 0\) and \(\tilde{\alpha }_k^i > 0\), \(i=1,\ldots ,n\). Provided that \(\alpha _k^i > 0\) for at least one index i, the index \(j_M\) is computed and \(\tilde{x}_k \leftarrow x_k + \alpha _k^{j_M}d_k^{j_M}\); that is, \(\tilde{x}_k\) is the trial point that produces the worst improvement of the objective function.

Additional computation is carried out if \(\sum _{i=1}^n\alpha _k^i >0\). In particular, the step sizes obtained by the n linesearches are combined to define the convex combination point \(x_c\). Then \(f(x_c)\) is compared with \(f(\tilde{x}_k)\). If \(f(x_c)\) improves upon the latter value, then \(x_{k+1}\) is set equal to \(x_c\); otherwise, \(x_{k+1}\) is set equal to the previously computed \(\tilde{x}_k\). The reader may wonder about the choice of \(\tilde{x}_k\). We specify it here to obtain a theoretical scheme that can readily be converted into the more efficient DENPAR.
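
The following Python sketch mirrors this description of one DEN\(_{check}\) iteration: build an orthonormal basis from the given direction, run the n linesearches from \(x_k\) (written as a loop here, but these are the evaluations that lend themselves to parallel execution), take the worst-improvement point \(\tilde{x}_k\), form a convex combination point \(x_c\), and keep whichever of the two is better. It reuses the hypothetical linesearch_iteration from the previous sketch, and the particular choice of \(x_c\) (the average of the trial points) is an illustrative assumption; the authoritative statement is the algorithm figure above.

```python
import numpy as np

def den_check_iteration(f, x_k, d_hat, alpha, seed=0):
    """One simplified DEN_check-style iteration (illustrative sketch)."""
    n = x_k.size
    rng = np.random.default_rng(seed)
    # orthonormal basis d_k^1, ..., d_k^n whose first column is d_hat (up to sign)
    M = rng.normal(size=(n, n))
    M[:, 0] = d_hat / np.linalg.norm(d_hat)
    Q, _ = np.linalg.qr(M)

    trial_points, moved = [], []
    for i in range(n):
        x_trial, _ = linesearch_iteration(f, x_k, Q[:, i], alpha)
        trial_points.append(x_trial)
        moved.append(not np.allclose(x_trial, x_k))

    if not any(moved):                      # no direction gave sufficient decrease
        return x_k
    x_tilde = max(trial_points, key=f)      # worst improvement among the trial points
    x_c = np.mean(trial_points, axis=0)     # one possible convex combination point
    return x_c if f(x_c) <= f(x_tilde) else x_tilde
```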

In the following proposition, we show that Algorithm DEN\(_{check}\) inherits the convergence properties of the sequential code DFN\(_{simple}\) by showing that DEN\(_{check}\) is a particular case of the latter method.

Proposition B.3

Let \(\{x_k\}\) be the sequence produced by Algorithm DEN\(_{check}\). Let \(\bar{x}\) be any limit point of \(\{x_k\}\) and K be the subset of indices such that

$$\begin{aligned} \lim _{k\rightarrow \infty ,k \in K}x_k = \bar{x}. \end{aligned}$$

If the subsequence \(\{d_k\}_K\) is dense in the unit sphere (see Definition B.2), then \(\bar{x}\) is Clarke stationary for problem (3) (see Definition B.1).

Proof

We prove the proposition by showing that DEN\(_{check}\) is an instance of DFN\(_{simple}\). To this end, consider the last step of Algorithm DFN\(_{simple}\), namely the one where the next iterate \(x_{k+1}\) is defined. As can be seen, in Algorithm DFN\(_{simple}\), \(x_{k+1}\) is required to satisfy the condition \(f(x_{k+1})\le f(\tilde{x}_k)\). Note that \(\tilde{x}_k\) of DFN\(_{simple}\) corresponds to the point \(\tilde{x}_k\) of DEN\(_{check}\). Indeed, if \(\sum _{i=1}^n\alpha _k^i > 0\), then \(\tilde{x}_k= x_k + \alpha _k^{j_M} d_k^{j_M}\) and \(d_k = d_k^{j_M}\). Otherwise, \(\tilde{x}_k = x_k\) and \(d_k = \hat{d}_k\).

Now, let us consider an iteration k of Algorithm DEN\(_{check}\). By the instructions of the algorithm, one of the following cases occurs.

  1. (i)

    \(\sum _{i=1}^n\alpha _k^i > 0\) and \(f(x_c) \le f(\tilde{x}_k)\);

  2. (ii)

    \(\sum _{i=1}^n\alpha _k^i > 0\) and \(f(x_c) > f(\tilde{x}_k)\);

  3. (iii)

    \(\sum _{i=1}^n\alpha _k^i = 0\).

In case (i), \(x_{k+1}\leftarrow x_c\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = d_k^{j_M}\) and where \(x_{k+1}\) is chosen as point \(x_c\).

In case (ii), \(x_{k+1}\leftarrow \tilde{x}_k\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = d_k^{j_M}\) and where \(x_{k+1}\) is set equal to \(\tilde{x}_k\).

Finally, in case (iii), \(x_{k+1}\leftarrow x_k\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = \hat{d}_k\) and where sufficient improvement cannot be obtained along either \(d_k\) or \(-d_k\).

Hence, any iteration of DEN\(_{check}\) can be viewed as a particular iteration of DFN\(_{simple}\). This establishes the proposition. \(\square \)

DENPAR is derived from DEN\(_{check}\) by replacing lines 20–23 with \(x_{k+1}\leftarrow x_c\). Effectively, that replacement assumes that the inequality \(f(x_c) \le f(\tilde{x}_k)\) of line 20 is always satisfied. This is indeed the case, since f(x) is assumed to be quasi-convex.
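
Spelled out in terms of the quantities above, and assuming, as described for DEN\(_{check}\), that \(x_c\) is a convex combination of the n trial points \(x_k+\alpha _k^i d_k^i\) and that \(j_M\) marks the trial point with the largest objective value, repeated application of the two-point quasi-convexity inequality gives

$$\begin{aligned} f(x_c) \le \max _{i=1,\ldots ,n} f(x_k+\alpha _k^i d_k^i) = f(x_k+\alpha _k^{j_M} d_k^{j_M}) = f(\tilde{x}_k), \end{aligned}$$

so the test of line 20 holds automatically and lines 20–23 can be dropped.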

About this article

Cite this article

Liuzzi, G., Truemper, K. Parallelized hybrid optimization methods for nonsmooth problems using NOMAD and linesearch. Comp. Appl. Math. 37, 3172–3207 (2018). https://doi.org/10.1007/s40314-017-0505-2
