
Total Memory Optimiser: Proof of Concept and Compromises

  • Conference paper
Swarm Intelligence Based Optimization (ICSIBO 2016)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10103)


Abstract

For most usual optimisation problems, the Nearer is Better assumption is true (in probability). Classical iterative algorithms take this property into account, either explicitly or implicitly, by forgetting some of the information collected during the process, assuming it is no longer useful. However, when the property is not globally true, i.e. for deceptive problems, it may be necessary to keep all the sampled points and their values, and to exploit this increasing amount of information. Such a basic Total Memory Optimiser is presented here. We experimentally show that this technique can outperform classical methods on small deceptive problems. As it becomes very expensive in computing time when the dimension of the problem increases, a few compromises are suggested to speed it up.
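
To make the idea concrete, here is a minimal, hypothetical sketch in Python (not the optimiser studied in the paper): it merely illustrates the principle of archiving every sampled point and reusing the whole archive, here through a crude, local Nearer-is-Better rule, instead of forgetting information.

    # Hypothetical sketch only: archive every evaluation and use the full archive
    # to choose the next sample. This is NOT the paper's Total Memory Optimiser.
    import random

    def total_memory_search(f, bounds, budget=200, n_candidates=20):
        lo, hi = bounds
        archive = []                 # all (x, f(x)) pairs ever evaluated
        x = (lo + hi) / 2            # start from the centre (cf. Appendix A.2)
        for _ in range(budget):
            archive.append((x, f(x)))
            # Among random candidates, keep the one whose nearest archived
            # neighbour has the lowest value: a crude, local use of the
            # "Nearer is Better" assumption over the whole archive.
            candidates = [random.uniform(lo, hi) for _ in range(n_candidates)]
            x = min(candidates,
                    key=lambda c: min(archive, key=lambda p: abs(p[0] - c))[1])
        return min(archive, key=lambda p: p[1])

    # Example: total_memory_search(lambda x: (x - 3) ** 2, (0, 10))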


Notes

  1. This is an open question: is it possible to define a Lipschitzian function without any plateau but with a negative NisB correlation?

  2. This may seem contradictory with the fact that we want to cope with problems for which the NisB correlation is globally negative. Even in such a case, however, it is sometimes locally positive, and increasingly so as the number of points grows.

  3. We would like to thank Dr. Saber Elsayed for providing the MATLAB\(^{\copyright }\) code of GA-MPC.

  4. In fact, we used a more recent and improved version, 3.62, downloaded from https://www.lri.fr/~hansen/cmaes_inmatlab.html.

References

  1. Beyhaghi, P., Cavaglieri, D., Bewley, T.: Delaunay-based derivative-free optimization via global surrogates, part I: linear constraints. J. Glob. Optim., 1–52 (2015)


  2. Clerc, M.: When Nearer is Better, p. 19 (2007). https://hal.archives-ouvertes.fr/hal-00137320

  3. Clerc, M.: Guided Randomness in Optimization. ISTE (International Scientific and Technical Encyclopedia). Wiley (2015)


  4. de Berg, M., Cheong, O., van Kreveld, M., Overmars, M.: Computational Geometry. Springer, Heidelberg (2008)


  5. Elsayed, S.M., Sarker, R.A., Essam, D.L.: GA with a New Multi-Parent Crossover for Solving IEEE-CEC2011 Competition Problems (2011)


  6. Glover, F., Laguna, M.: Tabu Search. Kluwer Academic Publishers (1997)


  7. Han, Z.-H., Zhang, K.-S.: Surrogate-based optimization. INTECH Open Access Publisher (2012)


  8. Hansen, N.: The CMA Evolution Strategy: A Tutorial. Technical report (2009)


  9. Omran, M.G.H., Clerc, M.: An adaptive population-based simplex method for continuous optimization. Int. J. Swarm Intell. Res. 7(4), 22–49 (2016)


  10. Weise, T., Zapf, M., Chiong, R., Nebro, A.J.: Why is optimization difficult? In: Kacprzyk, J., Chiong, R. (eds.) Nature-Inspired Algorithms for Optimisation. SCI, vol. 193, pp. 1–50. Springer, Heidelberg (2009)



Author information


Correspondence to Maurice Clerc.


A Appendix

A.1 Problem Definitions

Alpine. For dimension D, the search space is \(\left[ 0,4D\right] ^{D}\). Function f is defined as:

$$\begin{aligned} f\left( x_{1},\ldots ,x_{D}\right) =\sum _{d=1}^{D}\left( \left| x_{d,\delta }\sin \left( x_{d,\delta }\right) \right| +0.1\left| x_{d,\delta }\right| \right) \end{aligned}$$
(2)

with \(x_{d,\delta }=x_{d}-\delta d\). In this case, we have simply chosen \(\delta =1\). This parameter serves to ensure that the minimum is not at the centre of the search space or on a diagonal. The problem is multimodal and non-separable.
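
For reference, an illustrative Python transcription of Eq. (2) as reconstructed above (the 0.1 term is taken inside the sum, since it depends on d); with \(\delta =1\) the minimum, of value 0, is reached at \(x_{d}=d\).

    import math

    def alpine_shifted(x, delta=1.0):
        """Shifted Alpine function of Eq. (2); the search space is [0, 4D]^D."""
        total = 0.0
        for d, xd in enumerate(x, start=1):
            xs = xd - delta * d               # x_{d,delta} = x_d - delta * d
            total += abs(xs * math.sin(xs)) + 0.1 * abs(xs)
        return total

    # alpine_shifted([1.0, 2.0]) == 0.0  (D = 2, minimum at x = (1, 2))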

Deceptive 1 (Flash). The search space is \(\left[ 0,1\right] \). Function f is defined as:

$$\begin{aligned} \left\{ \begin{array}{ccl} x\le 2c_{1} & \rightarrow & f\left( x\right) =c_{2}\\ 2c_{1}<x\le 3c_{1} & \rightarrow & f\left( x\right) =c_{2}-\frac{c_{2}}{c_{1}}\left( x-2c_{1}\right) \\ 3c_{1}<x\le 4c_{1} & \rightarrow & f\left( x\right) =\frac{2c_{2}}{c_{1}}\left( x-3c_{1}\right) \\ 4c_{1}<x\le 5c_{1} & \rightarrow & f\left( x\right) =2c_{2}-\frac{c_{2}}{c_{1}}\left( x-4c_{1}\right) \\ x\ge 5c_{1} & \rightarrow & f\left( x\right) =c_{2} \end{array}\right. \end{aligned}$$
(3)

with, in this case, \(c_{1}=0.1\) and \(c_{2}=0.5\). The problem is unimodal, but with plateaus.
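
An illustrative Python transcription of Eq. (3); with \(c_{1}=0.1\) and \(c_{2}=0.5\), the single minimum \(f=0\) is reached at \(x=3c_{1}=0.3\), between two plateaus.

    def flash(x, c1=0.1, c2=0.5):
        """Deceptive 1 (Flash), Eq. (3), defined on [0, 1]."""
        if x <= 2 * c1:
            return c2
        if x <= 3 * c1:
            return c2 - (c2 / c1) * (x - 2 * c1)
        if x <= 4 * c1:
            return (2 * c2 / c1) * (x - 3 * c1)
        if x <= 5 * c1:
            return 2 * c2 - (c2 / c1) * (x - 4 * c1)
        return c2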

Deceptive 2 (Comb). The search space is \(\left[ 0,10\right] \). Function f is defined as:

$$\begin{aligned} f(x)=\min \left( c_{2},1+\sin \left( c_{1}x\right) +\frac{x}{c_{1}}\right) \end{aligned}$$
(4)

with, in this case, \(c_{1}=10\) and \(c_{2}=1\). The problem is multimodal, but with plateaus.
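
Eq. (4) translates directly into Python; the min with \(c_{2}\) caps the rising sine wave and creates the plateaus.

    import math

    def comb(x, c1=10.0, c2=1.0):
        """Deceptive 2 (Comb), Eq. (4), defined on [0, 10]."""
        return min(c2, 1 + math.sin(c1 * x) + x / c1)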

Deceptive 3 (Brush). The search space is \(\left[ 0,10\right] ^{2}\). Function f is defined as:

$$\begin{aligned} f\left( x_{1},x_{2}\right) =\min \left( c_{2},\sum _{d=1}^{2}\left( \left| x_{d}\sin \left( x_{d}\right) \right| +\frac{x_{d}}{c_{1}}\right) \right) \end{aligned}$$
(5)

with, in this case, \(c_{1}=10\) and \(c_{2}=1\). The problem is multimodal and non-separable.
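
Similarly for Eq. (5), again assuming the \(x_{d}/c_{1}\) term lies inside the sum, as reconstructed above.

    import math

    def brush(x, c1=10.0, c2=1.0):
        """Deceptive 3 (Brush), Eq. (5), defined on [0, 10]^2."""
        return min(c2, sum(abs(xd * math.sin(xd)) + xd / c1 for xd in x))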

A.2 When We Know Nothing, the Middle Is the Best Choice

On the Search Space. Let \(x^{*}\) be the solution point (we suppose here that it is unique). If we sample x, the error is \(\left\| x-x^{*}\right\| \). At the very beginning, as we know nothing, the probability distribution of \(x^{*}\) is uniform on the search space. Roughly speaking, it can be anywhere with the same probability. So, we have to choose the sample x in order to minimise the risk given by

$$\begin{aligned} r=\intop _{x^{*}\in S}\left\| x-x^{*}\right\| \end{aligned}$$
(6)

Let us solve it for \(D=1\), and \(S=\left[ x_{min},x_{max}\right] \). We have

$$ \begin{array}{rcl} r & = & \int _{u=x_{min}}^{x}\left( x-u\right) du+\int _{u=x}^{x_{max}}\left( u-x\right) du\\ & = & \left[ xu-\frac{u^{2}}{2}\right] _{u=x_{min}}^{x}+\left[ \frac{u^{2}}{2}-xu\right] _{u=x}^{x_{max}}\\ & = & x^{2}-\left( x_{max}+x_{min}\right) x+\frac{x_{max}^{2}+x_{min}^{2}}{2} \end{array} $$

And the minimum of this parabola is given by

$$ x=\frac{x_{max}+x_{min}}{2} $$

For \(D>1\) the proof is technically more complicated (a possible way is to use induction and projections), but the result is the same: the least risky first point is the centre of the search space.
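
As a quick numerical check of the \(D=1\) result (an illustrative sketch, not part of the paper), approximating the risk of Eq. (6) on a fine grid of possible solutions confirms that it is minimised at the centre of the interval.

    def risk(x, x_min=0.0, x_max=1.0, n=10001):
        """Approximate Eq. (6) for D = 1 by averaging |x - x*| over a grid of x*."""
        grid = [x_min + i * (x_max - x_min) / (n - 1) for i in range(n)]
        return sum(abs(x - u) for u in grid) / n

    # The minimum over candidate samples is reached at the middle of [0, 1]:
    # min((risk(i / 100), i / 100) for i in range(101))  ->  (~0.25, 0.5)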

On the Value Space. The same reasoning can be applied to the value space, when we do not make any hypothesis (such as a positive local NisB correlation) and when we know the lower and upper bounds of the values, respectively \(y_{low}\) and \(y_{up}\). At any unknown position of the search space, the distribution of the possible values on \(\left[ y_{low},y_{up}\right] \) is uniform, and therefore the least risky estimate is, again, the middle, i.e. \(\frac{y_{low}+y_{up}}{2}\).

A.3 Variability of a Landscape

We use here a specific definition, which is different from the definition of variance in probability theory. Let f be a numerical function on the search space S. What we call variability on a subspace s of S is the quantity

$$\begin{aligned} v=\intop _{s^{3}}\left| \frac{f(x_{2})-f(x_{1})}{\left\| x_{2}-x_{1}\right\| }-\frac{f(x_{3})-f(x_{1})}{\left\| x_{3}-x_{1}\right\| }\right| \end{aligned}$$
(7)

where \(\left\{ x_{1},x_{2},x_{3}\right\} \) is an element of \(s^{3}=s\otimes s\otimes s\) (Cartesian product), under the constraint \(x_{3}=x_{1}+\lambda \left( x_{2}-x_{1}\right) \) or, equivalently, \(\left( x_{2}-x_{1}\right) \times \left( x_{3}-x_{2}\right) =0\) (cross product). The definition may seem complicated, but it simply measures how far the landscape is from having, in any given direction, a constant slope: the variability of such a landscape is zero.
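
As an illustration (a Monte Carlo sketch under the constraint above, not the exact integral), the variability can be estimated in one dimension by sampling collinear triples and averaging the absolute difference between the two slopes; a linear landscape gives a value close to zero, a rugged one does not.

    import math
    import random

    def variability_1d(f, lo, hi, samples=10000):
        """Monte Carlo estimate (up to a constant factor) of Eq. (7) on [lo, hi].
        x3 is drawn on the segment [x1, x2], so x3 = x1 + lambda * (x2 - x1)."""
        total, used = 0.0, 0
        for _ in range(samples):
            x1, x2 = random.uniform(lo, hi), random.uniform(lo, hi)
            x3 = x1 + random.uniform(0.0, 1.0) * (x2 - x1)
            if x1 in (x2, x3):                    # avoid zero-length segments
                continue
            slope12 = (f(x2) - f(x1)) / abs(x2 - x1)
            slope13 = (f(x3) - f(x1)) / abs(x3 - x1)
            total += abs(slope12 - slope13)
            used += 1
        return total / used

    # variability_1d(lambda x: 2 * x, 0, 10)                   # ~ 0
    # variability_1d(lambda x: abs(x * math.sin(x)), 0, 10)    # clearly > 0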


Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Clerc, M. (2016). Total Memory Optimiser: Proof of Concept and Compromises. In: Siarry, P., Idoumghar, L., Lepagnot, J. (eds) Swarm Intelligence Based Optimization. ICSIBO 2016. Lecture Notes in Computer Science, vol 10103. Springer, Cham. https://doi.org/10.1007/978-3-319-50307-3_1


  • DOI: https://doi.org/10.1007/978-3-319-50307-3_1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50306-6

  • Online ISBN: 978-3-319-50307-3

  • eBook Packages: Computer Science (R0)
