Abstract
Current microarray technology makes it possible to obtain time series expression data for studying a wide range of biological systems. However, expression data tend to contain considerable noise, which can degrade clustering quality. We propose a web-knowledge-based clustering method that incorporates knowledge of gene-gene relations into the clustering procedure. Our method first obtains the biological roles of each gene through a web mining process, then groups genes based on their biological roles and the Gene Ontology, and finally applies a semi-supervised clustering model in which the supervision is provided by the detected gene groups. Guided by this knowledge, the clustering procedure is better able to cope with noise in the data. We evaluate our method on a publicly available data set of the human fibroblast response to serum. The experimental results demonstrate improved clustering quality compared with clustering methods that use no prior knowledge.
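The last step of the pipeline, clustering under supervision from detected gene groups, can be illustrated with a small sketch. The abstract does not specify the semi-supervised model, so the code below swaps in a simple stand-in: a k-means variant in which all genes of one knowledge group are treated as must-linked and assigned to a cluster as a single unit. The function name `group_constrained_kmeans`, its parameters, and the toy data are hypothetical and chosen only for illustration; groups are assumed to be disjoint.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of profiles."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def group_constrained_kmeans(profiles, groups, k, iters=20, seed=0):
    """Illustrative stand-in, not the chapter's actual model.

    profiles: list of time series (lists of floats), one per gene.
    groups:   disjoint lists of gene indices assumed to share a
              biological role (the "knowledge").
    Returns one cluster label per gene; genes in the same group
    always receive the same label.
    """
    rng = random.Random(seed)
    grouped = {i for g in groups for i in g}
    # Each gene belongs to exactly one unit: a knowledge group or a singleton.
    units = [list(g) for g in groups]
    units += [[i] for i in range(len(profiles)) if i not in grouped]
    if k > len(units):
        raise ValueError("k must not exceed the number of units")

    # Initialise centroids from the means of k randomly chosen units.
    centroids = [mean([profiles[i] for i in u]) for u in rng.sample(units, k)]
    labels = [0] * len(profiles)

    for _ in range(iters):
        # Assignment: a whole unit goes to the centroid that minimises
        # the summed squared distance over its members.
        for u in units:
            best = min(
                range(k),
                key=lambda c: sum(sq_dist(profiles[i], centroids[c]) for i in u),
            )
            for i in u:
                labels[i] = best
        # Update: recompute each non-empty cluster's centroid.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels


# Toy data: gene 4 is a noisy profile that lies nearer to genes 2 and 3,
# but the (hypothetical) knowledge step grouped it with genes 0 and 1.
toy_profiles = [
    [0.0, 0.0, 0.0],
    [0.1, 0.0, 0.1],
    [5.0, 5.0, 5.0],
    [5.1, 5.0, 4.9],
    [3.5, 3.5, 3.5],
]
toy_groups = [[0, 1, 4]]
toy_labels = group_constrained_kmeans(toy_profiles, toy_groups, k=2)
print(toy_labels)
```

In this sketch the noisy gene cannot be pulled away from its group by distance alone, which is one simple way prior knowledge can counteract noise. A softer formulation would penalise constraint violations instead of forbidding them.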
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Tang, N., Rao Vemuri, V. (2006). A Web-knowledge-based Clustering Model for Gene Expression Data Analysis. In: Last, M., Szczepaniak, P.S., Volkovich, Z., Kandel, A. (eds) Advances in Web Intelligence and Data Mining. Studies in Computational Intelligence, vol 23. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-33880-2_24
DOI: https://doi.org/10.1007/3-540-33880-2_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-33879-6
Online ISBN: 978-3-540-33880-2
eBook Packages: Engineering, Engineering (R0)