Deep Learning Framework for Cyber Threat Situational Awareness Based on Email and URL Data Analysis

Cybersecurity and Secure Information Systems

Abstract

Spam and Phishing attacks are among the most common security challenges in today's cyber world. Existing methods for Spam and Phishing detection rely on blacklisting and heuristic techniques. These methods need human intervention to be updated whenever a new Spam or Phishing activity appears, they are ineffective at detecting previously unseen activities, and they can flag malicious activity only after an attack has occurred. Machine learning can detect new Spam and Phishing activities, but it requires extensive domain knowledge for feature engineering and feature representation. Deep learning is a branch of machine learning that can extract an optimal feature representation by itself from samples of benign, Spam and Phishing activities. To leverage this capability, this work applies various deep learning architectures to both Spam and Phishing detection using electronic mail (Email) and uniform resource locator (URL) data sources, because in recent years Email and URLs have been the resources most commonly used by attackers to spread malware. Experiments with the deep learning architectures are conducted on various datasets collected from public and private data sources. All experiments are run for up to 1,000 epochs with learning rates varied in the range 0.01–0.5. For comparison, various classical machine learning classifiers are used with domain-level feature extraction. For both the deep learning architectures and the classical machine learning algorithms, various natural language processing text representation methods are used to convert text data into a numeric representation. To the best of our knowledge, this is the first attempt at a framework that can analyze and correlate Spam and Phishing events from Email and URL sources at scale to provide cyber threat situational awareness. The developed framework is highly scalable and capable of detecting malicious activities in near real time. In addition, the framework can be easily extended to handle large volumes of other cyber security events by adding resources. These characteristics make the proposed framework stand out from other systems of a similar kind.
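The abstract does not name the specific text representation methods used. As a minimal sketch of the general idea only, the Python below shows two common ways to turn text into numbers: a bag-of-words count vector for an Email body and a character-level integer encoding for a URL. All function names, the sample strings, and the `max_len` parameter are illustrative assumptions, not the chapter's implementation.

```python
from collections import Counter


def build_vocab(texts):
    """Map each distinct lowercase token to a column index, in first-seen order."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def bag_of_words(text, vocab):
    """Count vector for one email body; tokens outside the vocabulary are ignored."""
    counts = Counter(text.lower().split())
    # dicts preserve insertion order, so position i corresponds to vocab index i
    return [counts.get(token, 0) for token in vocab]


def build_char_index(urls):
    """Assign each distinct character an integer id starting at 1 (0 is reserved)."""
    chars = sorted(set("".join(urls)))
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode_url(url, char_index, max_len=20):
    """Fixed-length character encoding; 0 marks padding or an unknown character."""
    ids = [char_index.get(ch, 0) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))


if __name__ == "__main__":
    # Hypothetical toy samples, not taken from the chapter's datasets.
    emails = ["win a free prize now", "meeting at noon"]
    vocab = build_vocab(emails)
    print(bag_of_words("free free prize", vocab))

    char_index = build_char_index(["http://a.io"])
    print(encode_url("http://a.io", char_index, max_len=12))
```

Vectors of the first kind are a typical input for classical classifiers, while fixed-length integer sequences of the second kind are a typical input for an embedding layer in a deep network; which representations the chapter actually pairs with which models is not stated in the abstract.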

Notes

  1. http://theconversation.com/four-email-problems-that-even-titans-of-tech-havent-resolved-37389
  2. https://digitalguardian.com/blog/2017-data-breach-report-finds-phishing-emailattacks-still-potent
  3. http://www.aueb.gr/users/ion/data/lingspam_public.tar.gz
  4. http://www.aueb.gr/users/ion/data/PU123ACorpora.tar.gz
  5. www.csmining.org/
  6. https://plg.uwaterloo.ca/~gvcormac/spam/
  7. http://www.cs.bgu.ac.il/~elhadad/nlp16.html
  8. http://www.malwaredomains.com/
  9. https://www.malwaredomainlist.com/
  10. http://www.joewein.de/sw/blacklist.htm
  11. https://www.malwareurl.com/
  12. https://www.phishtank.com/
  13. https://openphish.com/
  14. https://www.alexa.com/siteinfo
  15. http://www.dmoz.org/
  16. https://github.com/rlilojr/Detecting-Malicious-URL-Machine-Learning
  17. https://spark.apache.org/
  18. https://scikit-learn.org/
  19. https://www.tensorflow.org/
  20. https://keras.io/

Acknowledgements

This research was supported in part by Paramount Computer Systems and Lakhshya Cyber Security Labs. We are grateful to NVIDIA India for the GPU hardware support given to the research grant. We are also grateful to the Computational Engineering and Networking (CEN) department for encouraging the research.

Corresponding author

Correspondence to R. Vinayakumar.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Vinayakumar, R., Soman, K.P., Prabaharan Poornachandran, Akarsh, S., Elhoseny, M. (2019). Deep Learning Framework for Cyber Threat Situational Awareness Based on Email and URL Data Analysis. In: Hassanien, A., Elhoseny, M. (eds) Cybersecurity and Secure Information Systems. Advanced Sciences and Technologies for Security Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-16837-7_6

  • DOI: https://doi.org/10.1007/978-3-030-16837-7_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-16836-0

  • Online ISBN: 978-3-030-16837-7

  • eBook Packages: Computer Science, Computer Science (R0)
