Abstract
This paper proposes a novel ranking function, RFHOS, that incorporates higher-order cumulants to find differentially expressed genes. Traditional ranking functions assume a data distribution (e.g., normal) and use only the first two cumulants for statistical significance analysis. Ranking functions based on second-order statistics are often inadequate for ranking data with few samples (e.g., microarray data). In addition, the relatively small number of samples makes it hard to estimate the distribution parameters accurately, which causes inaccuracies in the ranking of the genes. The proposed ranking function based on higher-order statistics (RFHOS) accounts for both amplitude and phase information. Incorporating HOS departs from the implicit symmetry assumed under a Gaussian distribution. The performance of RFHOS is compared against other well-known ranking functions designed for ranking genes in two-sample microarray experiments.
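The abstract does not state the RFHOS formula, so the following Python sketch is not the authors' method. It only illustrates the distinction the abstract draws: a conventional two-sample statistic (here a Welch t-statistic, chosen as an example) uses just the mean and variance, while the third and fourth cumulants capture asymmetry and tail weight that such a statistic ignores. The function names and the expression values are hypothetical.

```python
import math
from statistics import mean, variance


def sample_cumulants(values):
    """Return moment-based (biased) estimates of the first four cumulants."""
    n = len(values)
    mu = sum(values) / n
    centered = [v - mu for v in values]
    m2 = sum(c ** 2 for c in centered) / n   # second cumulant (variance)
    m3 = sum(c ** 3 for c in centered) / n   # third cumulant (asymmetry)
    m4 = sum(c ** 4 for c in centered) / n
    k4 = m4 - 3.0 * m2 ** 2                  # fourth cumulant (tail weight)
    return mu, m2, m3, k4


def welch_t(group_a, group_b):
    """Welch t-statistic: depends only on the first two cumulants."""
    se2 = variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
    return (mean(group_a) - mean(group_b)) / math.sqrt(se2)


# Hypothetical expression values for one gene under two conditions.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [4.0, 4.5, 5.0, 5.5, 12.0]  # one large value makes this group skewed

print("t-statistic:", welch_t(control, treated))
print("control cumulants:", sample_cumulants(control))
print("treated cumulants:", sample_cumulants(treated))
```

For a Gaussian sample the third and fourth cumulants are close to zero, so nonzero values such as those of the skewed group signal a departure from the normality that second-order ranking functions assume.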
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Shaik, J., Yeasin, M. (2007). Ranking Function Based on Higher Order Statistics (RF-HOS) for Two-Sample Microarray Experiments. In: Măndoiu, I., Zelikovsky, A. (eds) Bioinformatics Research and Applications. ISBRA 2007. Lecture Notes in Computer Science, vol 4463. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72031-7_9
DOI: https://doi.org/10.1007/978-3-540-72031-7_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72030-0
Online ISBN: 978-3-540-72031-7
eBook Packages: Computer Science, Computer Science (R0)