Abstract
Microarrays have become a key technology in experimental molecular biology. They allow the expression of more than ten thousand genes to be monitored in parallel, producing huge amounts of data. In the exploration of transcriptional regulatory networks, an important task is to cluster gene expression data in order to identify groups of genes with similar expression patterns.
In this paper, memetic algorithms (MAs) — genetic algorithms incorporating local search — are proposed for minimum sum-of-squares clustering. Two new mutation and recombination operators are studied within the memetic framework for clustering gene expression data. The memetic algorithms using a sophisticated recombination operator are shown to converge very quickly to (near-)optimum solutions. Furthermore, the MAs are shown to be superior to multi-start k-means clustering algorithms in both computation time and solution quality.
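The abstract does not spell out the operators, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a solution is encoded as a list of k centroids, takes the objective to be the sum of squared distances from each expression profile to its nearest centroid, and uses plain k-means as the local search. The names `memetic_cluster`, `recombine`, and `mutate`, the uniform crossover over centroid slots, the random-point mutation, and all parameter defaults are illustrative assumptions.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two profiles."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def sse(data, centroids):
    """Sum over all points of the squared distance to the nearest centroid."""
    return sum(min(sq_dist(x, c) for c in centroids) for x in data)


def kmeans(data, centroids, max_iter=100):
    """Local search: k-means iterations started from the given centroids."""
    centroids = [list(c) for c in centroids]
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for x in data:
            j = min(range(len(centroids)), key=lambda i: sq_dist(x, centroids[i]))
            groups[j].append(x)
        # An empty cluster keeps its previous centroid.
        updated = [
            [sum(col) / len(g) for col in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def recombine(a, b, rng):
    """Assumed operator: uniform crossover over centroid slots."""
    return [list(rng.choice(pair)) for pair in zip(a, b)]


def mutate(centroids, data, rng):
    """Assumed operator: replace one centroid by a random data point."""
    child = [list(c) for c in centroids]
    child[rng.randrange(len(child))] = list(rng.choice(data))
    return child


def memetic_cluster(data, k, pop_size=6, generations=10, seed=0):
    """Evolve a population of k-means local optima; return (centroids, sse)."""
    rng = random.Random(seed)
    # Every individual is a local optimum of the k-means local search.
    pop = [kmeans(data, rng.sample(data, k)) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for _ in range(pop_size):
            a, b = rng.sample(pop, 2)
            child = recombine(a, b, rng)
            if rng.random() < 0.5:
                child = mutate(child, data, rng)
            offspring.append(kmeans(data, child))
        # Elitist selection: keep the best individuals of parents and offspring.
        pop = sorted(pop + offspring, key=lambda c: sse(data, c))[:pop_size]
    return pop[0], sse(data, pop[0])
```

Calling `memetic_cluster(profiles, k)` on a list of equal-length numeric lists returns the best centroid set found and its objective value. Because selection is elitist, the result is never worse than the best of the initial k-means runs, which is the multi-start baseline the abstract compares against.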
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Merz, P., Zell, A. (2002). Clustering Gene Expression Profiles with Memetic Algorithms. In: Guervós, J.J.M., Adamidis, P., Beyer, HG., Schwefel, HP., Fernández-Villacañas, JL. (eds) Parallel Problem Solving from Nature — PPSN VII. PPSN 2002. Lecture Notes in Computer Science, vol 2439. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45712-7_78
DOI: https://doi.org/10.1007/3-540-45712-7_78
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44139-7
Online ISBN: 978-3-540-45712-1
eBook Packages: Springer Book Archive