Parallelized hybrid optimization methods for nonsmooth problems using NOMAD and linesearch

Abstract

Two parallelized hybrid methods are presented for single-function optimization problems with side constraints. The problems are difficult not only because of the possible existence of local minima and the nonsmoothness of functions, but also because objective function and constraint values for a solution vector can only be obtained by querying a black box whose execution requires considerable computational effort. Examples are optimization problems in engineering where objective function and constraint values are computed via complex simulation programs, where local minima exist, and where smoothness of functions is not assured. The hybrid methods combine the well-known method NOMAD with two new methods, called DENCON and DENPAR, that are based on the linesearch scheme CS-DFN. For each query, the hybrid methods compute a set of solution vectors that are evaluated in parallel. The hybrid methods have been tested on a set of difficult optimization problems produced by a certain seeding scheme for multiobjective optimization. We compare computational results with those obtained by NOMAD, DENCON, and DENPAR as stand-alone methods. It turns out that among the stand-alone methods, NOMAD is significantly better than DENCON and DENPAR. However, the hybrid methods are definitely better than NOMAD.

References

  • Abramson MA, Audet C, Couture G, Dennis Jr JE, Le Digabel S, Tribes C (2014) The NOMAD project. http://www.gerad.ca/nomad

  • Abramson MA, Audet C, Dennis JE Jr, Le Digabel S (2009) OrthoMADS: a deterministic MADS instance with orthogonal directions. SIAM J Optim 20(2):948–966

  • Audet C, Dennis JE Jr (2006) Mesh adaptive direct search algorithms for constrained optimization. SIAM J Optim 17(1):188–217

  • Audet C, Dennis JE Jr, Le Digabel S (2008) Parallel space decomposition of the mesh adaptive direct search algorithm. SIAM J Optim 19(3):1150–1170

  • Bratley P, Fox B (1988) Algorithm 659: implementing Sobol’s quasirandom sequence generator. ACM Trans Math Softw 14(1):88–100

  • Clarke FH (1983) Optimization and nonsmooth analysis. Wiley, New York

  • Dennis JE Jr, Torczon V (1991) Direct search methods on parallel machines. SIAM J Optim 1(4):448–474

  • Di Pillo G, Grippo L, Lucidi S (1993) A smooth method for the finite minimax problem. Math Program 60:187–214

  • Fasano G, Liuzzi G, Lucidi S, Rinaldi F (2014) A linesearch-based derivative-free approach for nonsmooth constrained optimization. SIAM J Optim 24(3):959–992

  • García-Palomares UM, Rodríguez JF (2002) New sequential and parallel derivative-free algorithms for unconstrained minimization. SIAM J Optim 13(1):79–96

  • García-Palomares UM, García-Urrea IJ, Rodríguez-Hernández PS (2013) On sequential and parallel non-monotone derivative-free algorithms for box constrained optimization. Optim Methods Softw 28(6):1233–1261

  • Goldberg DE (1989) Genetic algorithms in search, optimization and machine learning. Addison-Wesley, Boston

  • Gray GA, Kolda TG (2006) Algorithm 856: APPSPACK 4.0: asynchronous parallel pattern search for derivative-free optimization. ACM Trans Math Softw 32(3):485–507

  • Griffin JD, Kolda TG, Lewis RM (2008) Asynchronous parallel generating set search for linearly constrained optimization. SIAM J Sci Comput 30(4):1892–1924

  • Grippo L, Lampariello F, Lucidi S (1986) A nonmonotone line search technique for Newton’s method. SIAM J Numer Anal 23(4):707–716

  • Halton JH (1960) On the efficiency of certain quasi-random sequences of points in evaluating multi-dimensional integrals. Numer Math 2:84–90

  • Hough PD, Kolda TG, Torczon VJ (2001) Asynchronous parallel pattern search for nonlinear optimization. SIAM J Sci Comput 23(1):134–156

  • Kolda TG (2005) Revisiting asynchronous parallel pattern search for nonlinear optimization. SIAM J Optim 16(2):563–586

  • Kolda TG, Torczon V (2004) On the convergence of asynchronous parallel pattern search. SIAM J Optim 14(4):939–964

  • Laguna M, Molina J, Pérez F, Caballero R, Hernández-Díaz AG (2009) The challenge of optimizing expensive black boxes: a scatter search/rough set theory approach. J Oper Res Soc 61:53–67

  • Le Digabel S (2011) Algorithm 909: NOMAD: nonlinear optimization with the MADS algorithm. ACM Trans Math Softw 37(4):1–15

  • Meza JC, Oliva RA, Hough PD, Williams PJ (2007) OPT++: an object-oriented toolkit for nonlinear optimization. ACM Trans Math Softw 33(2):12

  • Moré JJ, Wild SM (2009) Benchmarking derivative-free optimization algorithms. SIAM J Optim 20(1):172–191

  • Ponstein J (1967) Seven kinds of convexity. SIAM Rev 9(1):115–119

  • Shetty CM, Bazaraa MS (1979) Nonlinear programming: theory and algorithms. Wiley, New York

  • Sobol I (1977) Uniformly distributed sequences with an additional uniform property. USSR Comput Math Math Phys 16:236–242

  • Truemper K (in review) Simple seeding of evolutionary algorithms for hard multiobjective minimization problems

Acknowledgements

We are thankful to an anonymous reviewer for helpful comments and suggestions.

Author information

Corresponding author

Correspondence to G. Liuzzi.

Additional information

Communicated by Ernesto G. Birgin.

Appendices

Appendix A: More comparisons

We show that an iteration budget of 2000 evaluations generally is a good choice.

To start, we define six candidate iteration budgets, of sizes 500, 1000, 2000, 3000, 4000, and 5000. That choice is based on earlier runs involving hard engineering problems, where 1000 and 2000 turned out to be good choices.

We construct the graphs showing the data and performance profiles for NOMAD, NOMAD–DENCON, and NOMAD–DENPAR for the selected six iteration budgets. Space restrictions prevent inclusion of the graphs; they are available at http://www.iasi.cnr.it/~liuzzi/hybridDF/.

Table 6 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 500

From the graphs, we derive tables structured like Table 1, which summarize the relative sizes of the areas under the profile curves of the various graphs. The tables are included in the appendix as Tables 6, 7, 8, 9, 10 and 11. The percentages in the tables show that NOMAD always produces the smallest area, and thus has the worst performance, except for two cases in Table 6 where the problems with \(n=20\) variables are solved on 64 processors, and where NOMAD dominates NOMAD–DENCON but still is dominated by NOMAD–DENPAR.
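
The relative areas referenced above are obtained from the profile curves themselves. As a rough illustration (not the authors' post-processing code), the following Python sketch builds a simplified data profile in the spirit of Moré and Wild (2009) from the best objective values recorded during a run and computes the normalized area under it; the convergence tolerance and the array layout are illustrative assumptions.

```python
import numpy as np

def data_profile(history, f_best, tol=1e-3, max_evals=2000):
    """Fraction of problems solved within each evaluation budget
    (a simplified data profile in the spirit of More and Wild 2009)."""
    n_prob = len(history)
    solved_at = np.full(n_prob, np.inf)
    for p, h in enumerate(history):
        h = np.asarray(h)
        f0 = h[0]                                    # value at the starting point
        target = f_best[p] + tol * (f0 - f_best[p])  # convergence test
        hit = np.nonzero(h <= target)[0]
        if hit.size:
            solved_at[p] = hit[0] + 1                # evaluations needed to solve p
    budgets = np.arange(1, max_evals + 1)
    fraction = np.array([(solved_at <= b).mean() for b in budgets])
    return budgets, fraction

def profile_area(fraction):
    """Normalized area under the profile curve (unit-spaced budgets)."""
    return float(fraction.mean())
```

Comparing such areas across NOMAD, NOMAD–DENCON, and NOMAD–DENPAR yields relative sizes of the kind summarized in the tables.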

The results of Tables 6, 7, 8, 9, 10 and 11 are reassuring in the sense that the dominance of NOMAD–DENCON and NOMAD–DENPAR over NOMAD does not depend on a critical choice of the iteration budget. There is an intuitive explanation for these results. During each iteration, the selected method may terminate due to convergence conditions and thus may not reach the iteration budget. In such a case, a larger value of the iteration budget induces the same behavior. A corollary of this argument is that we should always prefer component methods that have well-justified convergence conditions, i.e., conditions based on a sound convergence analysis, as is the case for the three methods NOMAD, DENCON, and DENPAR selected here.

We interrupt the analysis of Tables 6, 7, 8, 9, 10 and 11 and look at the efficiency of the hybrid methods under parallelization, again considering the six iteration budgets. The relevant results are compiled in Table 3 for NOMAD–DENCON and in Table 4 for NOMAD–DENPAR. The interpretation is analogous to that for Table 2. For each problem subset, the efficiency ratios \(s/(64\cdot c)\) are very similar regardless of the iteration budget. For example, when the entire problem set is solved, the ratio ranges from 18 to 25% for NOMAD–DENCON and from 21 to 27% for NOMAD–DENPAR. These results indicate that the efficiency under parallelization is not very sensitive to the iteration budget.

After this general investigation of the impact of the iteration budget, we turn to the selection of the best iteration budget. For the selection, we compute the profile graphs and tables evaluating the performance of NOMAD–DENCON and NOMAD–DENPAR under the six iteration budgets. The graphs are omitted here; they are available at http://www.iasi.cnr.it/~liuzzi/hybridDF/. The graphs are summarized in Tables 12 and 13 for NOMAD–DENCON and in Tables 14 and 15 for NOMAD–DENPAR.

We analyze the tables. Table 12 contains the percentages for NOMAD–DENCON running on a single processor. The bold numbers, indicating maxima as before, occur mostly in the rows for an iteration budget of 2000. Thus, that budget is a good choice. Table 13, which covers NOMAD–DENCON on 64 processors, leads to the same conclusion.

Tables 14 and 15 do not lead to such clear-cut choices. Nevertheless, in Table 14 half of the bold entries occur in rows for the iteration budget of 2000, while Table 15 provides no significant insight. The reason becomes clear when we look at the corresponding graph, provided at http://www.iasi.cnr.it/~liuzzi/hybridDF/: the profile curves are bunched together. We therefore accept an iteration budget of 2000 as a reasonable choice, in line with the decisions deduced from the other tables.

We conclude that an iteration budget of 2000 is a reasonable choice for both NOMAD–DENCON and NOMAD–DENPAR. For that choice, Table 8 indicates improvement percentages of NOMAD–DENCON over NOMAD ranging from 32 to 68% when the entire problem set is solved on a single processor, and from 33 to 65% on 64 processors. The corresponding percentages for NOMAD–DENPAR are 12 to 33% on a single processor and 29 to 44% on 64 processors. Thus, NOMAD–DENCON is clearly better than NOMAD–DENPAR on a single processor as well as on 64 processors.

Table 7 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 1000
Table 8 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 2000
Table 9 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 3000
Table 10 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 4000
Table 11 NOMAD, NOMAD–DENCON, and NOMAD–DENPAR, iter. budget = 5000
Table 12 NOMAD–DENCON on single processor, with various iter. budgets
Table 13 NOMAD–DENCON on 64 processors, with various iter. budgets
Table 14 NOMAD–DENPAR on single processor, with various iter. budgets
Table 15 NOMAD–DENPAR on 64 processors, with various iter. budgets

Appendix B: Convergence analysis for DENPAR

This appendix is devoted to the convergence analysis of Algorithm DENPAR. We consider problem (3), where the objective function is assumed to be quasi-convex (Shetty and Bazaraa 1979), i.e., for each \(x,y\in \mathbb {R}^n\),

$$\begin{aligned} f(\lambda x + (1-\lambda )y) \le \max \{f(x),f(y)\}, \qquad \text{ for } \text{ each }\ \lambda \in [0,1]. \end{aligned}$$
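
As a quick numerical illustration of this inequality (not taken from the paper), the following Python snippet checks the quasi-convexity condition along randomly sampled segments for a simple nonsmooth function; the choice of function and sample size is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    # nonsmooth but quasi-convex (in fact convex): the max-norm
    return np.max(np.abs(x))

# check f(lam*x + (1-lam)*y) <= max(f(x), f(y)) on random segments
for _ in range(1000):
    x, y = rng.normal(size=5), rng.normal(size=5)
    lam = rng.uniform()
    assert f(lam * x + (1 - lam) * y) <= max(f(x), f(y)) + 1e-12
print("quasi-convexity inequality held on all sampled segments")
```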

Algorithm DENPAR is based on DFN\(_{simple}\) from Fasano et al. (2014). We review the latter scheme, beginning with the definition of Clarke stationarity (see, e.g., Clarke 1983).

Definition B.1

(Clarke Stationarity) Given the unconstrained problem \(\min _{x\in \mathbb {R}^n} f(x)\), a point \(\bar{x}\) is a Clarke stationary point if \(0\in \partial f(\bar{x})\), where \(\partial f( x)=\{s\in \mathbb {R}^n : f^{Cl}( x; d)\ge d^Ts,\ \forall d \in \mathbb {R}^n \}\) is the generalized gradient of f at x, and

$$\begin{aligned} f^{Cl}(x; d) = \limsup _{\begin{subarray}{l}y\rightarrow x, t\downarrow 0\end{subarray}} \frac{f(y+t d) - f(y)}{t}. \end{aligned}$$
(7)
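
The limsup in (7) can be estimated numerically by sampling points y near x and small step sizes t > 0. The sketch below is only an illustration (the sampling radius and sample count are arbitrary assumptions); for f(x) = |x| at x = 0 with d = 1 the exact value is 1.

```python
import numpy as np

def clarke_dd(f, x, d, radius=1e-4, n_samples=2000, seed=0):
    """Crude Monte-Carlo estimate of the Clarke directional derivative:
    sup of (f(y + t*d) - f(y)) / t over y near x and small t > 0."""
    rng = np.random.default_rng(seed)
    best = -np.inf
    for _ in range(n_samples):
        y = x + radius * rng.uniform(-1.0, 1.0, size=np.shape(x))
        t = radius * rng.uniform(1e-3, 1.0)
        best = max(best, (f(y + t * d) - f(y)) / t)
    return best

f = lambda x: abs(float(x))
print(clarke_dd(f, 0.0, 1.0))   # close to the exact value 1
```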

We also need the definition of dense subsequences.

Definition B.2

(Dense subsequence) Let K be an infinite subset of indices (possibly \(K=\{0,1,\dots \}\)). The subsequence of normalized directions \(\{d_k\}_K\) is said to be dense in the unit sphere S(0, 1), if for any \(\bar{D}\in S(0,1)\) and for any \(\epsilon > 0\) there exists an index \(k\in K\) such that \(\Vert d_k-\bar{D}\Vert \le \epsilon \).
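
A standard way to obtain such a dense subsequence is to map a low-discrepancy sequence (e.g., Halton 1960 or Sobol 1977, both cited above) onto S(0, 1). The Python sketch below shows one such construction using SciPy; it is an illustrative assumption, not necessarily the direction generator used by DENCON or DENPAR.

```python
import numpy as np
from scipy.stats import norm, qmc

def dense_directions(n, how_many, seed=0):
    """Directions on the unit sphere S(0,1) in R^n obtained by mapping a
    Halton sequence through the inverse normal CDF and normalizing."""
    halton = qmc.Halton(d=n, scramble=True, seed=seed)
    u = halton.random(how_many)                     # points in (0,1)^n
    z = norm.ppf(u)                                 # map to R^n
    return z / np.linalg.norm(z, axis=1, keepdims=True)

dirs = dense_directions(n=20, how_many=5)
print(np.linalg.norm(dirs, axis=1))                 # all equal to 1.0
```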

Here is a summary of DFN\(_{simple}\) for the solution of Problem (3).

Algorithm DFN\(_{simple}\) (pseudocode figure)

In Algorithm DFN\(_{simple}\), a predefined sequence of search directions \(\{d_k\}\) is used. At iteration k, the behavior of the function f(x) along the direction \(d_k\) is investigated. If the direction (or its opposite) is deemed a good direction, in the sense that sufficient decrease can be obtained along it, then a sufficiently large step size is computed by means of the Expansion Step procedure. On the other hand, if neither \(d_k\) nor \(-d_k\) is a good direction, then the tentative step size is reduced by a constant factor.

Note that, in Algorithm DFN\(_{simple}\), considerable freedom is left for the selection of the next iterate \(x_{k+1}\) once the new point \(\tilde{x}_k\) has been computed. More specifically, the next iterate \(x_{k+1}\) is only required to satisfy inequality \(f(x_{k+1})\le f(\tilde{x}_k)\). This can trivially be satisfied by setting \(x_{k+1} \leftarrow \tilde{x}_k\). However, more sophisticated selection strategies can be implemented. For instance, \(x_{k+1}\) might be defined by minimizing suitable approximating models of the objective function, thus possibly improving the efficiency of the overall scheme. As we shall see, this freedom offered by DFN\(_{simple}\) is particularly useful for our purposes.
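
To make this structure concrete, here is a heavily simplified Python sketch of one such derivative-free linesearch along \(\pm d_k\): probe both directions for sufficient decrease, expand the step while the decrease persists, and shrink the tentative step otherwise. The parameter names (gamma, delta) and the doubling rule are illustrative assumptions, not the exact Expansion Step of DFN\(_{simple}\).

```python
import numpy as np

def linesearch_iteration(f, x, d, alpha, gamma=1e-6, delta=0.5, max_step=1e6):
    """One simplified derivative-free linesearch along +d and -d.
    Returns a trial point x_tilde and an updated tentative step size."""
    fx = f(x)
    for direction in (d, -d):
        if f(x + alpha * direction) <= fx - gamma * alpha**2:
            # sufficient decrease found: expand the step while it keeps paying off
            step = alpha
            while (step < max_step and
                   f(x + 2 * step * direction) <= fx - gamma * (2 * step)**2):
                step *= 2
            return x + step * direction, step
    # neither d nor -d yields sufficient decrease: shrink the tentative step
    return x, delta * alpha
```

In the full scheme, the trial point returned here plays the role of \(\tilde{x}_k\), and any \(x_{k+1}\) with \(f(x_{k+1})\le f(\tilde{x}_k)\) may then be accepted.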

Next we describe a parallelized version of DFN\(_{simple}\) called DEN\(_{check}\).

Algorithm DEN\(_{check}\)

Here is a summary of the algorithm.

Algorithm DEN\(_{check}\) (pseudocode figure)

At every iteration of DEN\(_{check}\), an orthonormal basis is formed starting from the given direction \(\hat{d}_k\). First, the behavior of the objective function along the directions \(d_k^1,\ldots ,d_k^n\) is investigated starting from the same point \(x_k\). This produces step sizes \(\alpha _k^i\ge 0\) and \(\tilde{\alpha }_k^i > 0\), \(i=1,\ldots ,n\). Provided that \(\alpha _k^i > 0\) for at least one index i, the index \(j_M\) is computed and \(\tilde{x}_k \leftarrow x_k + \alpha _k^{j_M}d_k^{j_M}\); that is, \(\tilde{x}_k\) is the trial point that produces the worst improvement of the objective function.

Additional computation is carried out if \(\sum _{i=1}^n\alpha _k^i >0\). In particular, the step sizes obtained by the n linesearches are combined to define the convex combination point \(x_c\). Then \(f(x_c)\) is compared with \(f(\tilde{x}_k)\). If \(f(x_c)\) improves upon the latter value, then \(x_{k+1}\) is set equal to \(x_c\); otherwise, \(x_{k+1}\) is set equal to the previously computed \(\tilde{x}_k\). The reader may wonder about the choice of \(\tilde{x}_k\). We specify it here to obtain a theoretical scheme that can readily be converted into the more efficient DENPAR.
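
The following Python sketch mirrors this description of one DEN\(_{check}\) iteration: build an orthonormal basis from the given direction, run the n linesearches from \(x_k\) (written as a loop here, but these are the evaluations that lend themselves to parallel execution), take the worst-improvement point \(\tilde{x}_k\), form a convex combination point \(x_c\), and keep whichever of the two is better. It reuses the hypothetical linesearch_iteration from the previous sketch, and the particular choice of \(x_c\) (the average of the trial points) is an illustrative assumption; the authoritative statement is the algorithm figure above.

```python
import numpy as np

def den_check_iteration(f, x_k, d_hat, alpha, seed=0):
    """One simplified DEN_check-style iteration (illustrative sketch)."""
    n = x_k.size
    rng = np.random.default_rng(seed)
    # orthonormal basis d_k^1, ..., d_k^n whose first column is d_hat (up to sign)
    M = rng.normal(size=(n, n))
    M[:, 0] = d_hat / np.linalg.norm(d_hat)
    Q, _ = np.linalg.qr(M)

    trial_points, moved = [], []
    for i in range(n):
        x_trial, _ = linesearch_iteration(f, x_k, Q[:, i], alpha)
        trial_points.append(x_trial)
        moved.append(not np.allclose(x_trial, x_k))

    if not any(moved):                      # no direction gave sufficient decrease
        return x_k
    x_tilde = max(trial_points, key=f)      # worst improvement among the trial points
    x_c = np.mean(trial_points, axis=0)     # one possible convex combination point
    return x_c if f(x_c) <= f(x_tilde) else x_tilde
```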

In the following proposition, we show that Algorithm DEN\(_{check}\) inherits the convergence properties of the sequential code DFN\(_{simple}\) by showing that DEN\(_{check}\) is a particular case of the latter method.

Proposition B.3

Let \(\{x_k\}\) be the sequence produced by Algorithm DEN\(_{check}\). Let \(\bar{x}\) be any limit point of \(\{x_k\}\) and K be the subset of indices such that

$$\begin{aligned} \lim _{k\rightarrow \infty ,k \in K}x_k = \bar{x}. \end{aligned}$$

If the subsequence \(\{d_k\}_K\) is dense in the unit sphere (see Definition B.2), then \(\bar{x}\) is Clarke stationary for problem (3) (see Definition B.1).

Proof

We prove the proposition by showing that DEN\(_{check}\) is an instance of DFN\(_{simple}\). To this end, consider the last step of Algorithm DFN\(_{simple}\), namely the one where the next iterate \(x_{k+1}\) is defined. As can be seen, in Algorithm DFN\(_{simple}\), \(x_{k+1}\) is required to satisfy the condition \(f(x_{k+1})\le f(\tilde{x}_k)\). Note that \(\tilde{x}_k\) of DFN\(_{simple}\) corresponds to the point \(\tilde{x}_k\) of DEN\(_{check}\). Indeed, if \(\sum _{i=1}^n\alpha _k^i > 0\), then \(\tilde{x}_k= x_k + \alpha _k^{j_M} d_k^{j_M}\) and \(d_k = d_k^{j_M}\). Otherwise, \(\tilde{x}_k = x_k\) and \(d_k = \hat{d}_k\).

Now, let us consider an iteration k of Algorithm DEN\(_{check}\). By the instructions of the algorithm, one of the following cases occurs.

  1. (i)

    \(\sum _{i=1}^n\alpha _k^i > 0\) and \(f(x_c) \le f(\tilde{x}_k)\);

  2. (ii)

    \(\sum _{i=1}^n\alpha _k^i > 0\) and \(f(x_c) > f(\tilde{x}_k)\);

  3. (iii)

    \(\sum _{i=1}^n\alpha _k^i = 0\).

In case (i), \(x_{k+1}\leftarrow x_c\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = d_k^{j_M}\) and where \(x_{k+1}\) is chosen as point \(x_c\).

In case (ii), \(x_{k+1}\leftarrow \tilde{x}_k\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = d_k^{j_M}\) and where \(x_{k+1}\) is set equal to \(\tilde{x}_k\).

Finally, in case (iii), \(x_{k+1}\leftarrow x_k\) and the iteration of DEN\(_{check}\) is like an iteration of DFN\(_{simple}\) with \(d_k = \hat{d}_k\) and where sufficient improvement cannot be obtained along either \(d_k\) or \(-d_k\).

Hence, any iteration of DEN\(_{check}\) can be viewed as a particular iteration of DFN\(_{simple}\). This establishes the proposition. \(\square \)

DENPAR is derived from DEN\(_{check}\) by replacing lines 20–23 with \(x_{k+1}\leftarrow x_c\). Effectively, that replacement assumes that the inequality \(f(x_c) \le f(\tilde{x}_k)\) of line 20 is always satisfied. This is indeed the case, since f(x) is assumed to be quasi-convex.
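
Spelled out in terms of the quantities above, and assuming, as described for DEN\(_{check}\), that \(x_c\) is a convex combination of the n trial points \(x_k+\alpha _k^i d_k^i\) and that \(j_M\) marks the trial point with the largest objective value, repeated application of the two-point quasi-convexity inequality gives

$$\begin{aligned} f(x_c) \le \max _{i=1,\ldots ,n} f(x_k+\alpha _k^i d_k^i) = f(x_k+\alpha _k^{j_M} d_k^{j_M}) = f(\tilde{x}_k), \end{aligned}$$

so the test of line 20 holds automatically and lines 20–23 can be dropped.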

About this article

Cite this article

Liuzzi, G., Truemper, K. Parallelized hybrid optimization methods for nonsmooth problems using NOMAD and linesearch. Comp. Appl. Math. 37, 3172–3207 (2018). https://doi.org/10.1007/s40314-017-0505-2
