A Numerical Representation Method for a DNA Sequence Using Gray Code Method

  • Conference paper
Soft Computing for Problem Solving

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1057)

Abstract

The rapid growth of genomic data in public databases calls for advanced computational tools that can analyze genes quickly. Such tools can be devised with the aid of genomic signal processing, whose pivotal task is numerical mapping: the string of nucleotides is transformed into a discrete numerical sequence by assigning an optimum mathematical descriptor to each nucleotide. The descriptor must be compatible with the later stages of the genomic application in order to achieve high efficiency. In this work, a simple numerical mapping method is proposed in which the optimum descriptor value is obtained by applying the Gray code concept. The proposed method is evaluated on the benchmark datasets HRM195 and ASP67 for the identification of protein-coding regions. It exhibits improved exon prediction efficiency, in terms of accuracy and equal error rate, when compared with similar methods.
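
To make the idea of Gray-code-based numerical mapping concrete, the following is a minimal sketch, not the authors' method. It uses the standard binary-reflected Gray code and assumes, purely for illustration, that the four nucleotides are indexed in alphabetical order (A, C, G, T). The abstract does not state the actual descriptor assignment used in the paper, so the names `NUCLEOTIDE_ORDER`, `GRAY_MAP`, and `map_sequence` are hypothetical.

```python
def gray_code(n: int) -> int:
    """Return the binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


# ASSUMPTION: alphabetical indexing of the nucleotides. The paper's real
# assignment is not given in the abstract and may differ.
NUCLEOTIDE_ORDER = "ACGT"

# Each nucleotide gets the Gray code of its index: A=0, C=1, G=3, T=2.
GRAY_MAP = {base: gray_code(i) for i, base in enumerate(NUCLEOTIDE_ORDER)}


def map_sequence(seq: str) -> list[int]:
    """Turn a nucleotide string into a discrete numerical sequence."""
    return [GRAY_MAP[base] for base in seq.upper()]


if __name__ == "__main__":
    print(map_sequence("ATGCGT"))
```

The defining property of a Gray code is that consecutive code words differ in exactly one bit. Whether and how the paper exploits that property to choose descriptor values is not described in the abstract.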

Author information

Corresponding author

Correspondence to Vaegae Naveen Kumar.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Raman Kumar, M., Naveen Kumar, V. (2020). A Numerical Representation Method for a DNA Sequence Using Gray Code Method. In: Das, K., Bansal, J., Deep, K., Nagar, A., Pathipooranam, P., Naidu, R. (eds) Soft Computing for Problem Solving. Advances in Intelligent Systems and Computing, vol 1057. Springer, Singapore. https://doi.org/10.1007/978-981-15-0184-5_55
