Deep Learning Framework for Cyber Threat Situational Awareness Based on Email and URL Data Analysis

Vinayakumar, R.; Soman, K. P.; Prabaharan Poornachandran; Akarsh, S.; Elhoseny, Mohamed

doi:10.1007/978-3-030-16837-7_6

Part of the book series: Advanced Sciences and Technologies for Security Applications (ASTSA)

Abstract

Spam and phishing attacks are the most common security challenges in today's cyber world. Existing methods for spam and phishing detection are based on blacklisting and heuristic techniques. These methods require human intervention to update whenever new spam or phishing activity appears, they are ineffective at detecting previously unseen activity, and they can flag malicious activity only after an attack has occurred. Machine learning can detect new spam and phishing activities, but it requires extensive domain knowledge for feature engineering and feature representation. Deep learning is a branch of machine learning that can learn an optimal feature representation by itself from samples of benign, spam, and phishing activity. To leverage this capability, this work applies various deep learning architectures to spam and phishing detection using electronic mail (Email) and uniform resource locator (URL) data sources, because in recent years Email and URLs have been the resources most commonly used by attackers to spread malware. Experiments are conducted on several datasets collected from public and private data sources. All experiments are run for up to 1,000 epochs with learning rates varied in the range 0.01–0.5. For comparison, various classical machine learning classifiers are evaluated with domain-level feature extraction. For both the deep learning architectures and the classical machine learning algorithms, various natural language processing text representation methods are used to convert text data into a numeric representation. To the best of our knowledge, this is the first attempt at a framework that can analyze and correlate spam and phishing events from Email and URL sources at scale to provide cyber threat situational awareness. The developed framework is highly scalable and capable of detecting malicious activities in near real time. In addition, it can easily be extended to handle large volumes of other cyber security events by adding extra resources. These characteristics distinguish the proposed framework from other systems of a similar kind.
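
The abstract does not say which text representation methods are used, so the following is only a minimal sketch of two common options, assuming character-level index encoding for URLs and a bag-of-words count vector for email text. The function names, the toy inputs, and the `max_len` parameter are hypothetical and are not taken from the chapter.

```python
from collections import Counter


def build_char_vocab(urls):
    """Map each character seen in the given URLs to a positive integer index.

    Index 0 is reserved for padding and for characters not in the vocabulary.
    """
    chars = sorted({ch for url in urls for ch in url})
    return {ch: i + 1 for i, ch in enumerate(chars)}


def encode_url(url, vocab, max_len=20):
    """Turn a URL into a fixed-length list of indices (truncate or zero-pad)."""
    ids = [vocab.get(ch, 0) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))


def bag_of_words(text, vocabulary):
    """Count occurrences of each vocabulary word in a lower-cased,
    whitespace-split email body."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Hypothetical toy data, not drawn from the chapter's datasets.
urls = ["http://example.com/login", "http://example.org/a"]
vocab = build_char_vocab(urls)
encoded = encode_url("http://example.com", vocab, max_len=24)

email_vocab = ["account", "verify", "meeting"]
features = bag_of_words("Please verify your account , verify now", email_vocab)
```

Either numeric form could then be fed to a classical classifier or to an embedding layer in a deep network; which representation works best is an empirical question that the chapter's experiments address.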

Acknowledgements

This research was supported in part by Paramount Computer Systems and Lakhshya Cyber Security Labs. We are grateful to NVIDIA India for the GPU hardware support provided through its research grant. We are also grateful to the Computational Engineering and Networking (CEN) department for encouraging the research.

Author information

Authors and Affiliations

Center for Computational Engineering and Networking (CEN), Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Coimbatore, India
R. Vinayakumar, K. P. Soman & S. Akarsh
Centre for Cyber Security Systems and Networks, Amrita School of Engineering, Amrita Vishwa Vidyapeetham, Amritapuri, India
Prabaharan Poornachandran
Department of Information Systems, Faculty of Computer and Information, Mansoura University, Mansoura, Egypt
Mohamed Elhoseny

Corresponding author

Correspondence to R. Vinayakumar.

Editor information

Editors and Affiliations

Faculty of Computers and Information, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Faculty of Computer and Information Sciences, Mansoura University, Mansoura, Egypt
Mohamed Elhoseny

About this chapter

Cite this chapter

Vinayakumar, R., Soman, K.P., Prabaharan Poornachandran, Akarsh, S., Elhoseny, M. (2019). Deep Learning Framework for Cyber Threat Situational Awareness Based on Email and URL Data Analysis. In: Hassanien, A., Elhoseny, M. (eds) Cybersecurity and Secure Information Systems. Advanced Sciences and Technologies for Security Applications. Springer, Cham. https://doi.org/10.1007/978-3-030-16837-7_6

DOI: https://doi.org/10.1007/978-3-030-16837-7_6
Published: 20 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-16836-0
Online ISBN: 978-3-030-16837-7
eBook Packages: Computer Science, Computer Science (R0)
