Abstract
Here we consider the problem of mining gene expression data in order to single out interesting features that characterize the healthy and unhealthy samples of an input dataset. The presented approach is based on a network model of the input gene expression data, in which each sample is represented by a labeled graph. This is the first attempt to build a different graph for each sample and, thus, to represent a sample set as a database of graphs. The main goal is to single out interesting differences between healthy and unhealthy samples by extracting discriminative patterns among the graphs belonging to the two sample sets. Unlike other approaches presented in the literature, this technique is able to take into account important local similarities as well as collaborative effects involving interactions between multiple genes. In particular, edge-labeled graphs are employed, and the discriminative power of a pattern is measured on the basis of edge weights, which represent how relevant the co-expression between two genes is.
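To make the idea of one edge-labeled graph per sample more concrete, the following is a minimal sketch under stated assumptions, not the chapter's actual method. The weight function (1 minus the absolute difference of two scaled expression levels), the "high"/"low" edge labels, the 0.8 threshold, the support-difference score, and all function names and toy values are hypothetical choices made only for illustration.

```python
from itertools import combinations


def sample_graph(expr, threshold=0.8):
    """Build one edge-labeled graph for a single sample.

    expr maps a gene name to an expression level scaled to [0, 1].
    ASSUMPTION: the co-expression weight is 1 - |x_g - x_h| and edges
    are labeled "high" or "low" by a fixed threshold.
    """
    graph = {}
    for g, h in combinations(sorted(expr), 2):
        weight = 1.0 - abs(expr[g] - expr[h])
        graph[(g, h)] = "high" if weight >= threshold else "low"
    return graph


def support(pattern, graphs):
    """Fraction of graphs containing every labeled edge of the pattern."""
    hits = sum(
        all(graph.get(edge) == label for edge, label in pattern.items())
        for graph in graphs
    )
    return hits / len(graphs)


def discriminative_score(pattern, healthy, unhealthy):
    """ASSUMPTION: score a pattern by its absolute support difference."""
    return abs(support(pattern, healthy) - support(pattern, unhealthy))


# Toy data: two healthy and two unhealthy samples over genes A, B, C.
healthy = [sample_graph(s) for s in (
    {"A": 0.9, "B": 0.85, "C": 0.2},
    {"A": 0.7, "B": 0.75, "C": 0.1},
)]
unhealthy = [sample_graph(s) for s in (
    {"A": 0.9, "B": 0.3, "C": 0.25},
    {"A": 0.8, "B": 0.2, "C": 0.6},
)]

# A one-edge pattern: genes A and B are strongly co-expressed.
pattern = {("A", "B"): "high"}
score = discriminative_score(pattern, healthy, unhealthy)
print(score)
```

In this toy setting the pattern occurs in every healthy graph and in no unhealthy graph, so it separates the two sample sets completely. Patterns with several edges would be scored in the same way, which is how collaborative effects among multiple genes could enter such a measure.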
Notes
- 1. Since there is a one-to-one correspondence between an individual and its representing tuple, for the sake of simplicity, we employ the same symbol t to denote both the individual and its corresponding tuple in the dataset.
- 2. The reader is referred to Sect. 4.4.2 for the details.
Copyright information
© 2017 The Author(s)
About this chapter
Cite this chapter
Fassetti, F., Rombo, S.E., Serrao, C. (2017). Discriminating Graph Pattern Mining from Gene Expression Data. In: Discriminative Pattern Discovery on Biological Networks. SpringerBriefs in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-319-63477-7_4
DOI: https://doi.org/10.1007/978-3-319-63477-7_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63476-0
Online ISBN: 978-3-319-63477-7
eBook Packages: Computer Science (R0)