Probabilistic Inference on Integrity for Access Behavior Based Malware Detection

  • Conference paper
Research in Attacks, Intrusions, and Defenses (RAID 2015)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9404)

Abstract

Integrity protection has proven to be an effective approach to malware detection and defense. Determining the integrity of subjects (programs) and objects (files and registries) plays a fundamental role in integrity protection. However, the large numbers of subjects and objects, together with their intricate behaviors, make it burdensome to determine their integrity levels either manually or by a set of rules. In this paper, we propose a probabilistic model of integrity in modern operating systems. Our model builds on two primary security policies, “no read down” and “no write up”, which connect observed access behaviors to the inherent integrity ordering between pairs of subjects and objects. We employ message-passing-based inference to determine the integrity of subjects and objects under a probabilistic graphical model. Furthermore, by leveraging a statistical classifier, we build an integrity-based access behavior model for malware detection. Extensive experimental results on a real-world dataset demonstrate that our model is capable of detecting 7,257 malware samples among 27,840 benign processes at a 99.88% true positive rate under a 0.1% false positive rate. These results indicate the feasibility of our probabilistic integrity model.

Notes

  1. In fact, we find s is about 7 in our experiments.

Acknowledgments

We would like to thank our shepherd, Manos Antonakakis, and the anonymous reviewers for their insightful comments, which greatly helped improve the presentation of this paper. This work is supported by NSFC (61175039, 61221063, 61403301), the 863 High Tech Development Plan (2012AA011003), the Research Fund for the Doctoral Program of Higher Education of China (20090201120032), the International Research Collaboration Project of Shaanxi Province (2013KW11), and the Fundamental Research Funds for Central Universities (2012jdhz08). Any opinions, findings, and conclusions or recommendations expressed in this material are the authors’ and do not necessarily reflect those of the sponsors.

Author information

Corresponding author

Correspondence to Zhongmin Cai.

Appendix: Derivation of Eq. (8)

\(P(E_I|Acc)\propto \sum _{T}{P(Acc|T)P(T|E_I)\sum _{D}{P(E_I|D)P(D)}}\), where

$$\begin{aligned} \sum _{D}{P(E_I|D)P(D)}= {\left\{ \begin{array}{ll} \sum _{D}{d_1P(D)}=\mathbb {E}_D(d_1)=\frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}, &{} \text {if } I(s)<I(o), \\ \sum _{D}{d_2P(D)}=\mathbb {E}_D(d_2)=\frac{\alpha _2}{\alpha _1+\alpha _2+\alpha _3}, &{} \text {if } I(s)=I(o), \\ \sum _{D}{d_3P(D)}=\mathbb {E}_D(d_3)=\frac{\alpha _3}{\alpha _1+\alpha _2+\alpha _3}, &{} \text {if } I(s)>I(o). \end{array}\right. } \end{aligned}$$
(18)

And then,

  1. If \(I(s)<I(o)\):

    $$ \begin{aligned} P(<|Acc) &\propto \frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}\Delta \int _{T}{t_1^{N_r}t_2^{N_w}t_3^{N_{r \& w}} \frac{t_1^{\beta _1}t_2^{\beta _2-1}t_3^{\beta _3-1}}{B(1+\beta _1, \beta _2, \beta _3)}} \mathop {}\!\mathrm {d}T \\ &= \frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}\Delta \frac{B(N_r+\beta _1+1, N_w+\beta _2, N_{r \& w}+\beta _3)}{B(1+\beta _1, \beta _2, \beta _3)} \\ &= \frac{\alpha _1}{\alpha _1+\alpha _2+\alpha _3}\Delta \frac{\beta _1+\beta _2+\beta _3}{\beta _1}\frac{N_r+\beta _1}{N+\beta _1+\beta _2+\beta _3}\Omega , \end{aligned} $$
    (19)

    where \(N=N_r+N_w+N_{r \& w}\), \( \Delta =\frac{\Gamma (N+1)}{\Gamma (N_r+1)\Gamma (N_w+1)\Gamma (N_{r \& w}+1)}\), \( \Omega =\frac{B(N_r+\beta _1, N_w+\beta _2, N_{r \& w}+\beta _3)}{B(\beta _1, \beta _2, \beta _3)}\), and \(B(\beta _1, \beta _2, \beta _3)=\frac{\Gamma (\beta _1)\Gamma (\beta _2)\Gamma (\beta _3)}{\Gamma (\beta _1+\beta _2+\beta _3)}\).

  2. If \(I(s)=I(o)\):

    $$ \begin{aligned} P(=|Acc) \propto \frac{\alpha _2}{\alpha _1+\alpha _2+\alpha _3}\Delta \int _{T}{t_1^{N_r}t_2^{N_w}t_3^{N_{r \& w}} \frac{t_1^{\beta _1-1}t_2^{\beta _2-1}t_3^{\beta _3-1}}{B(\beta _1, \beta _2, \beta _3)}} \mathop {}\!\mathrm {d}T = \frac{\alpha _2}{\alpha _1+\alpha _2+\alpha _3}\Delta \Omega . \end{aligned} $$
    (20)

  3. If \(I(s)>I(o)\):

    $$ \begin{aligned} P(>|Acc) &\propto \frac{\alpha _3}{\alpha _1+\alpha _2+\alpha _3}\Delta \int _{T}{t_1^{N_r}t_2^{N_w}t_3^{N_{r \& w}} \frac{t_1^{\beta _1-1}t_2^{\beta _2}t_3^{\beta _3-1}}{B(\beta _1, \beta _2+1, \beta _3)}} \mathop {}\!\mathrm {d}T \\ &= \frac{\alpha _3}{\alpha _1+\alpha _2+\alpha _3}\Delta \frac{\beta _1+\beta _2+\beta _3}{\beta _2}\frac{N_w+\beta _2}{N+\beta _1+\beta _2+\beta _3}\Omega . \end{aligned} $$
    (21)
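
The last line of Eq. (19) and of Eq. (21) follows from the recurrence \(\Gamma (x+1)=x\Gamma (x)\) applied to the definition of \(B\). As a short worked step, with \(a\), \(b\), \(c\) as generic positive arguments (placeholder symbols introduced here for illustration):

$$ \begin{aligned} B(a+1, b, c) = \frac{\Gamma (a+1)\Gamma (b)\Gamma (c)}{\Gamma (a+b+c+1)} = \frac{a\,\Gamma (a)\Gamma (b)\Gamma (c)}{(a+b+c)\,\Gamma (a+b+c)} = \frac{a}{a+b+c}B(a, b, c). \end{aligned} $$

Taking \(a=N_r+\beta _1\), \(b=N_w+\beta _2\), \(c=N_{r \& w}+\beta _3\) turns the numerator of Eq. (19) into \(\frac{N_r+\beta _1}{N+\beta _1+\beta _2+\beta _3}B(N_r+\beta _1, N_w+\beta _2, N_{r \& w}+\beta _3)\), where \(N=N_r+N_w+N_{r \& w}\), and taking \(a=\beta _1\), \(b=\beta _2\), \(c=\beta _3\) turns the denominator into \(\frac{\beta _1}{\beta _1+\beta _2+\beta _3}B(\beta _1, \beta _2, \beta _3)\). Dividing the two gives the factor multiplying \(\Omega \) in Eq. (19). Eq. (21) follows in the same way, with the increment applied to the second argument.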

Combining Eqs. (19)–(21), we obtain the posterior distribution of \(E_I\) given Acc, i.e., \(P(E_I|Acc)\), as shown in Eqs. (9)–(11).
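
The three cases above can be checked numerically. The following is a minimal sketch, assuming the posterior is obtained by normalizing the three proportional expressions in Eqs. (19)–(21) so that they sum to one. The function names (`log_beta`, `integrity_posterior`, `closed_form_ratios`) and the parameter values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from math import exp, isclose, lgamma, log


def log_beta(a, b, c):
    """Log of the three-argument Beta function B(a, b, c)."""
    return lgamma(a) + lgamma(b) + lgamma(c) - lgamma(a + b + c)


def integrity_posterior(n_r, n_w, n_rw, alpha, beta):
    """Normalized P(E_I | Acc) over the orderings '<', '=', '>'.

    Hypothetical helper: n_r, n_w, n_rw are the read, write, and
    read-and-write counts; alpha and beta are Dirichlet parameters.
    """
    a_sum = sum(alpha)
    b1, b2, b3 = beta
    # Prior on T for each ordering, as in Eqs. (19)-(21):
    # '<' adds one to beta_1 (read), '>' adds one to beta_2 (write).
    priors = {"<": (b1 + 1, b2, b3), "=": (b1, b2, b3), ">": (b1, b2 + 1, b3)}
    weights = dict(zip("<=>", alpha))
    log_w = {}
    for order, (p1, p2, p3) in priors.items():
        # Delta is identical in all three cases, so it cancels on normalization.
        log_w[order] = (log(weights[order] / a_sum)
                        + log_beta(n_r + p1, n_w + p2, n_rw + p3)
                        - log_beta(p1, p2, p3))
    top = max(log_w.values())  # subtract the maximum for numerical stability
    unnorm = {k: exp(v - top) for k, v in log_w.items()}
    total = sum(unnorm.values())
    return {k: v / total for k, v in unnorm.items()}


def closed_form_ratios(n_r, n_w, n_rw, alpha, beta):
    """P(<|Acc)/P(=|Acc) and P(>|Acc)/P(=|Acc) from the last lines of Eqs. (19), (21)."""
    a1, a2, a3 = alpha
    b1, b2, b3 = beta
    b_sum = b1 + b2 + b3
    n = n_r + n_w + n_rw
    lt = (a1 / a2) * (b_sum / b1) * (n_r + b1) / (n + b_sum)
    gt = (a3 / a2) * (b_sum / b2) * (n_w + b2) / (n + b_sum)
    return lt, gt


# Illustrative values only (not from the paper): five reads, no writes.
post = integrity_posterior(5, 0, 0, alpha=(1.0, 1.0, 1.0), beta=(1.0, 1.0, 1.0))
lt, gt = closed_form_ratios(5, 0, 0, alpha=(1.0, 1.0, 1.0), beta=(1.0, 1.0, 1.0))
assert isclose(post["<"] / post["="], lt)
assert isclose(post[">"] / post["="], gt)
print(post)
```

With these illustrative inputs, observing only reads shifts most of the posterior mass (roughly 0.62) to \(I(s)<I(o)\), which matches the intuition behind "no read down". Working in log space with `lgamma` avoids overflow in the Gamma functions when the access counts are large.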

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Mao, W., Cai, Z., Towsley, D., Guan, X. (2015). Probabilistic Inference on Integrity for Access Behavior Based Malware Detection. In: Bos, H., Monrose, F., Blanc, G. (eds) Research in Attacks, Intrusions, and Defenses. RAID 2015. Lecture Notes in Computer Science, vol 9404. Springer, Cham. https://doi.org/10.1007/978-3-319-26362-5_8

  • DOI: https://doi.org/10.1007/978-3-319-26362-5_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26361-8

  • Online ISBN: 978-3-319-26362-5

  • eBook Packages: Computer Science, Computer Science (R0)
