MultiNNProm: A Multi-Classifier System for Finding Genes

  • Conference paper
Applied Soft Computing Technologies: The Challenge of Complexity

Part of the book series: Advances in Soft Computing (AINSC, volume 34)

Abstract

The computational identification of genes in DNA sequences has become crucially important because of the large number of DNA molecules currently being sequenced. We present a novel neural-network-based multi-classifier system, MultiNNProm, for the identification of promoter regions in E. coli DNA sequences. The DNA sequences were encoded using four different encoding methods, and each encoding was used to train a separate neural network. The classification results of these four networks were then aggregated using a variation of the logarithmic opinion pool (LOP) method. The aggregating weights used within the modified LOP algorithm were obtained through a genetic algorithm. We show that different neural networks trained on the same data set can produce slightly different results when the data are encoded differently. We also show that combining several neural classifiers yields better accuracy than any of the individual networks.
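The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes that LOP denotes the standard weighted logarithmic opinion pool, in which the combined probability of a class is proportional to the product of each classifier's probability raised to that classifier's weight. This is not the paper's modified LOP variant, and the function name `lop_combine`, the network outputs, and the weights below are all hypothetical (the paper obtains its weights with a genetic algorithm).

```python
import math


def lop_combine(probs, weights, eps=1e-12):
    """Standard weighted logarithmic opinion pool (illustrative only).

    probs:   list of per-classifier probability vectors, one entry per class
    weights: one non-negative weight per classifier (hypothetical values;
             the paper derives its weights with a genetic algorithm)
    """
    n_classes = len(probs[0])
    # Weighted sum of log-probabilities per class; eps guards against log(0).
    log_scores = [
        sum(w * math.log(max(p[c], eps)) for p, w in zip(probs, weights))
        for c in range(n_classes)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalised = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


# Four hypothetical networks, each returning [P(promoter), P(non-promoter)].
outputs = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.4, 0.6]]
weights = [0.4, 0.3, 0.2, 0.1]

combined = lop_combine(outputs, weights)
label = "promoter" if combined[0] > combined[1] else "non-promoter"
print(label, combined)
```

Working in log space keeps the product of small probabilities from underflowing, and the final normalisation makes the combined scores sum to one, so they can be compared directly across classes.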

Copyright information

© 2006 Springer

About this paper

Cite this paper

Ranawana, R., Palade, V. (2006). MultiNNProm: A Multi-Classifier System for Finding Genes. In: Abraham, A., de Baets, B., Köppen, M., Nickolay, B. (eds) Applied Soft Computing Technologies: The Challenge of Complexity. Advances in Soft Computing, vol 34. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-31662-0_35

  • DOI: https://doi.org/10.1007/3-540-31662-0_35

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-31649-7

  • Online ISBN: 978-3-540-31662-6

  • eBook Packages: Engineering, Engineering (R0)
