Skip to main content

A Deep Learning Network for Exploiting Positional Information in Nucleosome Related Sequences

  • Conference paper
  • First Online:
Bioinformatics and Biomedical Engineering (IWBBIO 2017)

Part of the book series: Lecture Notes in Computer Science ((LNBI,volume 10209))

Included in the following conference series:

Abstract

A nucleosome is a DNA-histone complex, wrapping about 150 pairs of double-stranded DNA. The role of nucleosomes is to pack the DNA into the nucleus of the Eukaryote cells to form the Chromatin. Nucleosome positioning genome wide play an important role in the regulation of cell type-specific gene activities. Several biological studies have shown sequence specificity of nucleosome presence, clearly underlined by the organization of precise nucleotides substrings. Taking into consideration such advances, the identification of nucleosomes on a genomic scale has been successfully performed by DNA sequence features representation and classical supervised classification methods such as Support Vector Machines and Logistic regression. The goal of this work is to propose a classification method for nucleosome positioning that, differently from the proposed method so far, does not make any use of a sequence feature extraction step. Deep neural networks (DNN) or deep learning models, were proved to be able to extract automatically useful features from input patterns. Under this framework, Long Short-Term Memory (LSTM) is a recurrent unit that reads a sequence one step at a time and can exploit long range relations. In this work, we propose a DNN model for nucleosome identification on sequences from three different species. Our experiments show that it outperforms classical methods in two of the three data sets and give promising results also for the other.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Svaren, J., Horz, W.: Transcription factors vs. nucleosomes: regulation of the PHO5 promoter in yeast. Trends Biochem. Sci. 22, 93–97 (1997)

    Article  Google Scholar 

  2. Struhl, K., Segal, E.: Determinants of nucleosome positioning. Nat. Struct. Mol. Biol. 20(3), 267–273 (2013)

    Article  Google Scholar 

  3. Yuan, G.C.: Linking genome to epigenome. Wiley Interdisc. Rev.: Syst. Biol. Med. 4(3), 297–309 (2012)

    Google Scholar 

  4. Pinello, L., Lo Bosco, G., Yuan, G.-C.: Applications of alignment-free methods in epigenomics. Briefings Bioinform. 15(3), 419–430 (2014)

    Article  Google Scholar 

  5. Kuksa, P., Pavlovic, V.: Efficient alignment-free DNA barcode analytics. BMC Bioinform. 10(Suppl. 14), S9 (2009)

    Article  Google Scholar 

  6. Pinello, L., Lo Bosco, G., Hanlon, B., Yuan, G-C.: A motif-independent metric for DNA sequence specificity. BMC Bioinform. 12, Article No. 408 (2011)

    Google Scholar 

  7. Giosué, L.B., Luca, P.: A new feature selection methodology for k-mers representation of DNA sequences. In: Serio, C., Liò, P., Nonis, A., Tagliaferri, R. (eds.) CIBB 2014. LNCS, vol. 8623, pp. 99–108. Springer, Heidelberg (2015). doi:10.1007/978-3-319-24462-4_9

    Chapter  Google Scholar 

  8. Rizzo, R., Fiannaca, A., Rosa, M., Urso, A.: The general regression neural network to classify barcode and mini-barcode DNA. In: Serio, C., Liò, P., Nonis, A., Tagliaferri, R. (eds.) CIBB 2014. LNCS, vol. 8623, pp. 142–155. Springer, Heidelberg (2015). doi:10.1007/978-3-319-24462-4_13

    Chapter  Google Scholar 

  9. Lo Bosco, G.: Alignment free dissimilarities for nucleosome classification. In: Angelini, C., Rancoita, P.M.V., Rovetta, S. (eds.) CIBB 2015. LNCS, vol. 9874, pp. 114–128. Springer, Heidelberg (2016). doi:10.1007/978-3-319-44332-4_9

    Chapter  Google Scholar 

  10. Fiannaca, A., La Rosa, M., Rizzo, R., Urso, A.: Analysis of DNA barcode sequences using neural gas and spectral representation. In: Iliadis, L., Papadopoulos, H., Jayne, C. (eds.) EANN 2013. CCIS, vol. 384, pp. 212–221. Springer, Heidelberg (2013). doi:10.1007/978-3-642-41016-1_23

    Chapter  Google Scholar 

  11. Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

    MATH  MathSciNet  Google Scholar 

  12. Hochreiter, S., Bengio, Y., Frasconi, P., Schmidhuber, J.: Gradient flow in recurrent nets: the difficulty of learning long-term dependencies. In: Kremer, S.C., Kolen, J.F. (eds.) A Field Guide to Dynamical Recurrent Neural Networks. IEEE Press, New York (2001)

    Google Scholar 

  13. Fiannaca, A., Rosa, M., Rizzo, R., Urso, A.: A k-mer-based barcode DNA classification methodology based on spectral representation and a neural gas network. Artif. Intell. Med. 64(3), 173–184 (2015)

    Article  MATH  Google Scholar 

  14. Bengio, Y.: Learning deep architectures for AI. Found. Trends Mach. Learn. 2(1), 1–127 (2009)

    Article  MATH  MathSciNet  Google Scholar 

  15. Farabet, C., Couprie, C., Najman, L., et al.: Learning hierarchical features for scene labeling. IEEE Trans. Pattern Anal. Mach. Intell. 35(8), 1915–1929 (2013)

    Article  Google Scholar 

  16. Tompson, J.J., Jain, A., LeCun, Y., et al.: Joint training of a convolutional network and a graphical model for human pose estimation. In: Advances in Neural Information Processing Systems, pp. 1799–1807 (2014)

    Google Scholar 

  17. Kiros, R., Zhu, Y., Salakhutdinov, R.R., et al.: Skip-thought vectors. In: Advances in Neural Information Processing Systems, pp. 3276–3284 (2015)

    Google Scholar 

  18. Li, J., Luong, M-T., Jurafsky, D.: A hierarchical neural autoencoder for paragraphs and documents. In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, pp. 1106–1115 (2015)

    Google Scholar 

  19. Luong, M-T., Pham, H., Manning, C.D.: Effective approaches attention-based neural machine translation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing, pp. 1412–1421 (2015)

    Google Scholar 

  20. Cho, K., Van Merriënboer, B., Gulcehre, C., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734 (2014)

    Google Scholar 

  21. Chatterjee, R., Farajian, M.A., Conforti, C., Jalalvand, S., Balaraman, V., Di Gangi, M.A., Ataman, D., Turchi, M., Negri, M., Federico, M.: FBK’s neural machine translation systems for IWSLT. In: Proceedings of 13th International Workshop on Spoken Language Translation (IWSLT 2016) (2016)

    Google Scholar 

  22. LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)

    Article  Google Scholar 

  23. Collobert, R., Weston, J., Bottou, L., Karlen, M., Kavukcuoglu, K., Kuksa, P.: Natural language processing (almost) from scratch. J. Mach. Learn. Res. 12, 2493–2537 (2011)

    MATH  Google Scholar 

  24. Rizzo, R., Fiannaca, A., La Rosa, M., Urso, A.: A deep learning approach to DNA sequence classification. In: Angelini, C., Rancoita, P.M.V., Rovetta, S. (eds.) CIBB 2015. LNCS, vol. 9874, pp. 129–140. Springer, Heidelberg (2016). doi:10.1007/978-3-319-44332-4_10

    Chapter  Google Scholar 

  25. Lo Bosco, G., Di Gangi, M.A.: Deep learning architectures for DNA sequence classification. In: Petrosino, A., Loia, V., Pedrycz, W. (eds.) WILF 2016. LNCS (LNAI), vol. 10147, pp. 162–171. Springer, Cham (2017). doi:10.1007/978-3-319-52962-2_14

    Chapter  Google Scholar 

  26. Lo Bosco, G., Rizzo, R., Fiannaca, A., La Rosa, M., Urso, A.: A deep learning model for epigenomic studies. In: SITIS The 12th International Conference on Signal Image Technology & Internet Systems, pp. 688–692 (2016, to appear)

    Google Scholar 

  27. Hochreiter, S., Schmidhuber, J.: Long short-term memory. Neural Comput. 9(8), 1735–1780 (1997)

    Article  Google Scholar 

  28. Bridle, J.S.: Probabilistic interpretation of feedforward classification network outputs, with relationships to statistical pattern recognition. In: Soulié, F.F., Hérault, J. (eds.) Neurocomputing, pp. 227–236. Springer, Heidelberg (1990)

    Chapter  Google Scholar 

  29. Guo, S.-H., Deng, E.-Z., Xu, L.-Q., Ding, H., Lin, H., Chen, W., Chou, K.-C.: iNuc-PseKNC: a sequence-based predictor for predicting nucleosome positioning in genomes with pseudo k-tuple nucleotide composition. Bioinformatics 30(11), 1522–1529 (2014)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Giosué Lo Bosco .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Di Gangi, M.A., Gaglio, S., La Bua, C., Lo Bosco, G., Rizzo, R. (2017). A Deep Learning Network for Exploiting Positional Information in Nucleosome Related Sequences. In: Rojas, I., Ortuño, F. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2017. Lecture Notes in Computer Science(), vol 10209. Springer, Cham. https://doi.org/10.1007/978-3-319-56154-7_47

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-56154-7_47

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-56153-0

  • Online ISBN: 978-3-319-56154-7

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics