Collaborative Learning Based Effective Malware Detection System

Singh, Narendra; Kasyap, Harsh; Tripathy, Somanath

doi:10.1007/978-3-030-65965-3_13

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1323)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Malware is growing rapidly and causing severe losses to institutions. Existing techniques such as static and dynamic analysis fail to mitigate newly generated malware, and signature-, behavior-, and anomaly-based defense mechanisms are susceptible to obfuscation and polymorphism attacks. With machine learning now in practical use, several authors have proposed classification and visualization techniques for malware detection. Image representations have proved useful for analyzing malware behavior, and deep neural networks can extract much information from them without requiring expert domain knowledge. On the other hand, the scarcity of diverse malware data available to individual clients, together with their privacy concerns about sharing data with a centralized curator, makes it challenging to build a more reliable model. This paper proposes a lightweight Convolutional Neural Network (CNN) based model that extracts relevant features using call graph, n-gram, and image transformations. Further, an Auxiliary Classifier Generative Adversarial Network (AC-GAN) is used to generate unseen data for training. The model is extended to a federated setup to build an effective malware detection system. We used the Microsoft malware dataset for training and evaluation. The results show that the federated approach achieves accuracy close to that of centralized training while preserving data privacy at each individual organization.
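The abstract names three building blocks without giving details: an image transformation of the binary, n-gram features, and a federated setup. The following is a minimal sketch of what each could look like, not the authors' implementation. The function names (`bytes_to_image`, `byte_ngrams`, `federated_average`), the image width, the use of raw byte n-grams, and the sample-count weighting are all assumptions made for illustration; the CNN, the call-graph features, and the AC-GAN are omitted.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def bytes_to_image(data: bytes, width: int = 16) -> List[List[int]]:
    """Reshape a byte string into rows of `width` grayscale pixels (0-255).

    The final row is zero-padded so that every row has the same length.
    The width is an assumed value, not one taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes((-len(data)) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def byte_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping byte n-grams in `data` (assumed feature definition)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def federated_average(updates: Sequence[Tuple[List[float], int]]) -> List[float]:
    """Average client weight vectors, weighted by each client's sample count.

    Each update is (weights, num_local_samples). Only weights leave the
    client; the raw malware samples stay local.
    """
    total = sum(count for _, count in updates)
    if total == 0:
        raise ValueError("no training samples reported")
    averaged = [0.0] * len(updates[0][0])
    for weights, count in updates:
        for j, w in enumerate(weights):
            averaged[j] += w * count / total
    return averaged


# Hypothetical usage with toy inputs.
image = bytes_to_image(b"\x01\x02\x03\x04\x05", width=4)
grams = byte_ngrams(b"abab", n=2)
global_weights = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])
```

In a federated round under these assumptions, each organization would train locally on its own image and n-gram features and send only its weight vector and sample count to the aggregator, which is what lets the approach keep data private while approaching centralized accuracy.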

Notes

1. https://www.kaggle.com/c/malware-classification/data
2. https://github.com/tensorflow/federated/blob/master/docs/install.md
3. TensorFlow Federated. https://www.tensorflow.org/federated

Acknowledgement

We acknowledge the Ministry of Human Resource Development, Government of India, for providing fellowship to complete this work.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Patna, Patna, India
Narendra Singh, Harsh Kasyap & Somanath Tripathy

Corresponding author

Correspondence to Harsh Kasyap.

Editor information

Editors and Affiliations

University of Sydney, Sydney, NSW, Australia
Irena Koprinska
Monash University, Clayton, VIC, Australia
Michael Kamp
University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
University of Bari Aldo Moro, Bari, Italy
Corrado Loglisci
University of Guelph, Guelph, ON, Canada
Luiza Antonie
University of Caen Normandy, Caen, France
Albrecht Zimmermann
University of Pisa, Pisa, Italy
Riccardo Guidotti
Norwegian University of Science and Technology, Trondheim, Norway
Özlem Özgöbek
University of Porto, Porto, Portugal
Rita P. Ribeiro
UPC BarcelonaTech, Barcelona, Spain
Ricard Gavaldà
University of Porto, Porto, Portugal
João Gama
Fraunhofer IAIS, St. Augustin, Germany
Linara Adilova
Royal Holloway University of London, Egham, UK
Yamuna Krishnamurthy
University of Lisbon, Lisbon, Portugal
Pedro M. Ferreira
University of Bari Aldo Moro, Bari, Italy
Donato Malerba
University of Lisbon, Lisbon, Portugal
Ibéria Medeiros
University of Bari Aldo Moro, Bari, Italy
Michelangelo Ceci
ICAR-CNR, Rende, Italy
Giuseppe Manco
University of Naples Federico II, Naples, Italy
Elio Masciari
University of North Carolina, Charlotte, NC, USA
Zbigniew W. Ras
Australian National University, Canberra, ACT, Australia
Peter Christen
Leibniz University Hannover, Hannover, Germany
Eirini Ntoutsi
Technical University of Dortmund, Dortmund, Germany
Erich Schubert
University of Southern Denmark, Odense, Denmark
Arthur Zimek
University of Pisa, Pisa, Italy
Anna Monreale
Warsaw University of Technology, Warsaw, Poland
Przemyslaw Biecek
ISTI-CNR, PISA, Italy
Salvatore Rinzivillo
Berlin Institute of Technology, Berlin, Germany
Benjamin Kille
Berlin Institute of Technology, Berlin, Germany
Andreas Lommatzsch
Norwegian University of Science and Technology, Trondheim, Norway
Jon Atle Gulla

Cite this paper

Singh, N., Kasyap, H., Tripathy, S. (2020). Collaborative Learning Based Effective Malware Detection System. In: Koprinska, I., et al. ECML PKDD 2020 Workshops. ECML PKDD 2020. Communications in Computer and Information Science, vol 1323. Springer, Cham. https://doi.org/10.1007/978-3-030-65965-3_13

DOI: https://doi.org/10.1007/978-3-030-65965-3_13
Published: 02 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65964-6
Online ISBN: 978-3-030-65965-3
eBook Packages: Computer Science, Computer Science (R0)
