Packer Identification Using Hidden Markov Model

Hai, Nguyen Minh; Tho, Quan Thanh

doi:10.1007/978-3-319-69456-6_8

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10607)

Included in the following conference series:

International Workshop on Multi-disciplinary Trends in Artificial Intelligence

Abstract

Most modern malware is packed to evade anti-virus software. Packers apply various obfuscation techniques to hide a program's true behavior from static analysis, so handling packed malware remains a difficult problem. This paper proposes a novel approach to packer detection that combines the BE-PUM tool with a Hidden Markov Model (HMM). First, BE-PUM is applied to detect the sequence of possible obfuscation techniques embedded in the analyzed binary. Then, the HMM is used to estimate, from the generated sequence, how likely it is that a packer is present. Because HMMs are effective for pattern recognition, the proposed technique can accurately identify the packers deployed in binary files. We performed experiments on more than 2000 real-world malware samples taken from VirusShare, and the results are promising.
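The second step can be illustrated with a minimal sketch. It assumes one discrete HMM per packer and scores an observed sequence of obfuscation techniques with the forward algorithm, reporting the packer whose model gives the highest likelihood. The packer names, technique labels, and all probabilities below are hypothetical placeholders, not values from the paper, and the paper's actual model structure and training procedure may differ.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm.

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i]     : dict mapping an observed symbol to its probability in state i
    """
    if not obs:
        return 0.0  # the empty sequence has probability 1
    n_states = len(start)
    alpha = [start[i] * emit[i].get(obs[0], 0.0) for i in range(n_states)]
    log_likelihood = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                * emit[j].get(obs[t], 0.0)
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_likelihood += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_likelihood

def identify_packer(obs, models):
    """Score obs under every packer model and return (best_name, scores)."""
    scores = {
        name: forward_log_likelihood(obs, *params)
        for name, params in models.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy models; every number here is an illustrative assumption.
MODELS = {
    "packer_A": (
        [0.9, 0.1],
        [[0.6, 0.4], [0.2, 0.8]],
        [
            {"anti_debugging": 0.7, "indirect_jump": 0.2, "self_modifying_code": 0.1},
            {"anti_debugging": 0.1, "indirect_jump": 0.2, "self_modifying_code": 0.7},
        ],
    ),
    "packer_B": (
        [0.5, 0.5],
        [[0.5, 0.5], [0.5, 0.5]],
        [
            {"anti_debugging": 0.1, "indirect_jump": 0.8, "self_modifying_code": 0.1},
            {"anti_debugging": 0.1, "indirect_jump": 0.8, "self_modifying_code": 0.1},
        ],
    ),
}

if __name__ == "__main__":
    # A technique sequence such as a tool like BE-PUM might report (made up here).
    sequence = ["anti_debugging", "anti_debugging",
                "self_modifying_code", "self_modifying_code"]
    best, scores = identify_packer(sequence, MODELS)
    print(best, scores)
```

In practice the model parameters would be learned from technique sequences of binaries with known packers rather than written by hand; the sketch only shows the scoring and comparison step.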

Notes

1. http://www.aspack.com
2. http://fsg.soft112.com
3. https://bitsum.com/pecompact
4. http://www.telock.com-about.com
5. http://upx.sourceforge.net
6. http://www.yodas-crypter.com-about.com
7. https://virusshare.com/
8. https://www.aldeid.com/wiki/PEiD
9. http://www.ntcore.com/exsuite.php
10. http://www.joestewart.org/ollybone
11. http://www.ollydbg.de
12. http://bitblaze.cs.berkeley.edu/temu.html
13. https://virustotal.com/

Acknowledgments

This research is funded by the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 102.01-2015.16.

Author information

Authors and Affiliations

Ho Chi Minh City University of Technology, Ho Chi Minh City, Vietnam
Nguyen Minh Hai & Quan Thanh Tho

Authors

Nguyen Minh Hai
Quan Thanh Tho

Corresponding author

Correspondence to Nguyen Minh Hai.

Editor information

Editors and Affiliations

Universiti Teknologi Brunei, Gadong, Brunei Darussalam
Somnuk Phon-Amnuaisuk
Universiti Teknologi Brunei, Gadong, Brunei Darussalam
Swee-Peng Ang
Korea Advanced Institute of Science and Technology, Daejeon, Korea (Republic of)
Soo-Young Lee

About this paper

Cite this paper

Hai, N.M., Tho, Q.T. (2017). Packer Identification Using Hidden Markov Model. In: Phon-Amnuaisuk, S., Ang, SP., Lee, SY. (eds) Multi-disciplinary Trends in Artificial Intelligence. MIWAI 2017. Lecture Notes in Computer Science, vol 10607. Springer, Cham. https://doi.org/10.1007/978-3-319-69456-6_8

DOI: https://doi.org/10.1007/978-3-319-69456-6_8
Published: 19 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69455-9
Online ISBN: 978-3-319-69456-6
eBook Packages: Computer Science, Computer Science (R0)
