Abstract
The Self-Organizing Map (SOM), developed by Kohonen’s group, is an effective tool for clustering and visualizing high-dimensional complex data on a two-dimensional map. We previously adapted the conventional SOM for genome informatics, making the learning process and the resulting map independent of the order of data input. This batch-learning SOM (BLSOM) is well suited to parallel computing on high-performance supercomputers. BLSOM revealed phylotype-specific characteristics of the oligonucleotide frequencies in genome sequences and thus permitted clustering (self-organization) of genome fragments (e.g., 10 kb) according to phylotype, without any phylogenetic information being supplied during BLSOM learning. Using the high-performance supercomputer “the Earth Simulator”, almost all prokaryotic, eukaryotic, and viral sequences currently available could be classified according to phylotype on a single map. Using this large-scale BLSOM, the phylotypes of a large number of genomic fragments obtained by metagenome analyses of environmental samples could be predicted.
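The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' BLSOM implementation. It assumes a generic batch SOM with a Gaussian neighbourhood, a seeded random initialization, and a toy map size; the function names (`kmer_frequencies`, `batch_som`, `random_fragment`) and all parameter values are hypothetical, and the published BLSOM may differ in initialization, neighbourhood schedule, and scale. The sketch shows the two ingredients the abstract names: oligonucleotide frequency vectors computed from sequence fragments, and a batch update in which every input is assigned to its best-matching node before any weight changes, which is what makes the result independent of input order.

```python
import math
import random
from itertools import product


def kmer_frequencies(seq, k=4):
    """Relative frequency of each of the 4**k oligonucleotides in seq."""
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(words, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # windows containing N or other symbols are skipped
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in words]


def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x."""
    dists = [sum((wi - xi) ** 2 for wi, xi in zip(w, x)) for w in weights]
    return dists.index(min(dists))


def batch_som(data, rows=4, cols=4, epochs=10,
              sigma_start=2.0, sigma_end=0.5, seed=0):
    """Train a small SOM with batch updates; returns (grid positions, weights)."""
    dim = len(data[0])
    rng = random.Random(seed)  # assumption: seeded init that ignores input order
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    weights = [[rng.random() / dim for _ in range(dim)] for _ in grid]
    for epoch in range(epochs):
        frac = epoch / max(epochs - 1, 1)
        sigma = sigma_start + (sigma_end - sigma_start) * frac
        # Step 1: assign every input to its best-matching node using the old weights.
        bmus = [best_matching_unit(weights, x) for x in data]
        # Step 2: replace each weight by a neighbourhood-weighted mean of all inputs.
        new_weights = []
        for (r, c) in grid:
            num = [0.0] * dim
            den = 0.0
            for x, b in zip(data, bmus):
                br, bc = grid[b]
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2 * sigma ** 2))
                den += h
                for d in range(dim):
                    num[d] += h * x[d]
            new_weights.append([v / den for v in num])
        weights = new_weights
    return grid, weights


def random_fragment(rng, length, gc):
    """Synthetic fragment with a given G+C fraction (toy stand-in for real data)."""
    return "".join(
        rng.choice("GC") if rng.random() < gc else rng.choice("AT")
        for _ in range(length)
    )


if __name__ == "__main__":
    rng = random.Random(1)
    # Two toy "genomes" that differ only in base composition.
    frags = [random_fragment(rng, 1000, gc) for gc in (0.3, 0.7) for _ in range(10)]
    data = [kmer_frequencies(f, k=2) for f in frags]
    grid, weights = batch_som(data)
    print([grid[best_matching_unit(weights, x)] for x in data])
```

Because each epoch sums over all inputs before updating, presenting the fragments in a different order changes the weights only by floating-point rounding. The same property lets the per-input work in steps 1 and 2 be split across processors and combined by summation, which is one plausible reading of why a batch formulation suits parallel machines.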
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Abe, T., Hamano, Y., Kanaya, S., Wada, K., Ikemura, T. (2009). A Large-Scale Genomics Studies Conducted with Batch-Learning SOM Utilizing High-Performance Supercomputers. In: Cabestany, J., Sandoval, F., Prieto, A., Corchado, J.M. (eds) Bio-Inspired Systems: Computational and Ambient Intelligence. IWANN 2009. Lecture Notes in Computer Science, vol 5517. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02478-8_104
DOI: https://doi.org/10.1007/978-3-642-02478-8_104
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02477-1
Online ISBN: 978-3-642-02478-8
eBook Packages: Computer Science (R0)