Data Modelling for Predicting Exploits

Reinthal, Alexander; Filippakis, Eleftherios Lef; Almgren, Magnus

doi:10.1007/978-3-030-03638-6_21

Part of the book series: Lecture Notes in Computer Science (LNSC, volume 11252)

Included in the following conference series:

Nordic Conference on Secure IT Systems

Abstract

Modern society is becoming increasingly reliant on secure computer systems. Predicting which vulnerabilities are more likely to be exploited by malicious actors is therefore an important task to help prevent cyber attacks. Researchers have tried making such predictions using machine learning. However, recent research has shown that the evaluation of such models requires special sampling of training and test sets, and that previous models would have had limited utility in real-world settings. This study further develops the results of that research by using its sampling technique for evaluation in combination with a novel data model. Moreover, contrary to recent research, we find that using open web data can help in making better predictions about exploits, and that zero-day exploits are detrimental to the predictive power of the model. Finally, we discovered that the initial days of vulnerability information are sufficient to make the best possible model. Given our findings, we suggest that more research should be devoted to developing refined techniques for building predictive models for exploits. Gaining more knowledge in this domain would not only help prevent cyber attacks but could also yield fruitful insights into the nature of exploit development.
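
The abstract does not spell out the sampling technique. As a minimal sketch, assuming it resembles the temporal split mentioned in Note 2 (train on vulnerabilities published before a cutoff date, test on those published afterwards), the idea could look as follows. The records, identifiers, cutoff date and function names are all hypothetical and are not taken from the paper.

```python
from datetime import date

# Hypothetical records: (identifier, publication date, exploited label).
# These values are invented for illustration only.
records = [
    ("V-001", date(2016, 1, 10), 1),
    ("V-002", date(2016, 3, 5), 0),
    ("V-003", date(2016, 7, 21), 1),
    ("V-004", date(2016, 11, 2), 0),
    ("V-005", date(2017, 2, 14), 0),
    ("V-006", date(2017, 5, 30), 1),
]


def temporal_split(rows, cutoff):
    """Put rows published before the cutoff in train, the rest in test."""
    train = [r for r in rows if r[1] < cutoff]
    test = [r for r in rows if r[1] >= cutoff]
    return train, test


def positive_rate(rows):
    """Fraction of rows labelled as exploited (0.0 for an empty list)."""
    return sum(r[2] for r in rows) / len(rows) if rows else 0.0


train, test = temporal_split(records, date(2017, 1, 1))
print(len(train), len(test), positive_rate(train), positive_rate(test))
```

Unlike a random split, this keeps every test example later in time than every training example, so the model is never evaluated on vulnerabilities that predate its training data.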

Notes

1. This percentage is estimated from Fig. 5 in their report [3].
2. The \(\varDelta\) was computed from the class percentages they report for their test sets: \(16.7\%\) in the random-split experiment and \(9.3\%\) in the temporally split one.

References

1. Allodi, L., Massacci, F.: Comparing vulnerability severity and exploits using case-control studies. ACM Trans. Inf. Syst. Secur. 17(1), 1:1–1:20 (2014). https://doi.org/10.1145/2630069
2. Bozorgi, M., Saul, L.K., Savage, S., Voelker, G.M.: Beyond heuristics: learning to classify vulnerabilities and predict exploits. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2010, pp. 105–114. ACM, New York (2010). http://doi.acm.org/10.1145/1835804.1835821
3. Bullough, B.L., Yanchenko, A.K., Smith, C.L., Zipkin, J.R.: Predicting exploitation of disclosed software vulnerabilities using open-source data. In: Proceedings of the 3rd ACM on International Workshop on Security and Privacy Analytics, IWSPA 2017, pp. 45–53. ACM, New York (2017). http://doi.acm.org/10.1145/3041008.3041009
4. Chen, T., He, T., Benesty, M., et al.: Xgboost: extreme gradient boosting. R package version 0.4-2, pp. 1–4 (2015)
5. Edkrantz, M., Said, A.: Predicting cyber vulnerability exploits with machine learning. In: SCAI (2015)
6. Exploit-DB: Offensive Security's Exploit Database Archive. https://www.exploit-db.com/. Accessed 24 Aug 2017
7. Friedman, J.H.: Greedy function approximation: a gradient boosting machine. Ann. Stat. 29(5), 1189–1232 (2001). http://www.jstor.org/stable/2699986
8. Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. (CSUR) 46(4), 44 (2014)
9. National Vulnerability Database, Computer Security Resource Center. https://nvd.nist.gov/. Accessed 24 Aug 2017
10. Recorded Future’s threat intelligence platform
11. Roytman, M.: Quick Look: Predicting Exploitability, Forecasts for Vulnerability Management (2018). https://www.rsaconference.com/videos/quick-look-predicting-exploitabilityforecasts-for-vulnerability-management
12. Sabottke, C., Suciu, O., Dumitras, T.: Vulnerability disclosure in the age of social media: exploiting twitter for predicting real-world exploits. In: 24th USENIX Security Symposium. USENIX Association, Washington, D.C. (2015)
13. Widmer, G., Kubat, M.: Learning in the presence of concept drift and hidden contexts. Mach. Learn. 23(1), 69–101 (1996)

Acknowledgements

The research leading to these results has been partially supported by the Swedish Civil Contingencies Agency (MSB) through the project “RICS” and by the European Community’s Horizon 2020 Framework Programme through the UNITED-GRID project under grant agreement 773717.

We would also like to thank Staffan Truvé and Michel Edkrantz at Recorded Future for inspiration, access to data and the environment to perform the current study.

Author information

Authors and Affiliations

Chalmers University of Technology, Gothenburg, Sweden
Alexander Reinthal, Eleftherios Lef Filippakis & Magnus Almgren

Corresponding author

Correspondence to Magnus Almgren.

Editor information

Editors and Affiliations

University of Oslo, Oslo, Norway
Nils Gruschka

About this paper

Cite this paper

Reinthal, A., Filippakis, E.L., Almgren, M. (2018). Data Modelling for Predicting Exploits. In: Gruschka, N. (eds) Secure IT Systems. NordSec 2018. Lecture Notes in Computer Science, vol 11252. Springer, Cham. https://doi.org/10.1007/978-3-030-03638-6_21

DOI: https://doi.org/10.1007/978-3-030-03638-6_21
Published: 02 November 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03637-9
Online ISBN: 978-3-030-03638-6
eBook Packages: Computer Science, Computer Science (R0)
