LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision

Wittkopp, Thorsten; Wiesner, Philipp; Scheinert, Dominik; Acker, Alexander

doi:10.1007/978-3-030-91431-8_46

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 13121)

Included in the following conference series:

International Conference on Service-Oriented Computing

Abstract

With the increasing scale and complexity of cloud operations, automated detection of anomalies in monitoring data such as logs will be an essential part of managing future IT infrastructures. However, many methods based on artificial intelligence, such as supervised deep learning models, require large amounts of labeled training data to perform well. In practice, this data is rarely available because labeling log data is expensive, time-consuming, and requires a deep understanding of the underlying system. We present LogLAB, a novel modeling approach for automated labeling of log messages without requiring manual work by experts. Our method relies on estimated failure time windows provided by monitoring systems to produce precise labeled datasets in retrospect. It is based on the attention mechanism and uses a custom objective function for weakly supervised deep learning that accounts for imbalanced data. Our evaluation shows that LogLAB consistently outperforms nine benchmark approaches across three different datasets and maintains an F1-score of more than 0.98 even at large failure time windows.
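The starting point described in the abstract, deriving weak labels from estimated failure time windows, can be illustrated with a minimal sketch. This is not the authors' implementation (the repository listed under Notes holds that). The names `LogMessage`, `weak_labels`, and `class_weights` are hypothetical, and the inverse-frequency weighting shown here is a generic stand-in for handling imbalanced data, not LogLAB's custom objective function.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class LogMessage:
    timestamp: float  # assumed format: seconds since some reference point
    text: str


def weak_labels(messages: List[LogMessage],
                windows: List[Tuple[float, float]]) -> List[int]:
    """Assign a weak label to every message.

    1 = message falls inside an estimated failure window (possibly abnormal),
    0 = message falls outside every window (assumed normal).
    The labels are weak because a failure window usually also contains
    normal messages.
    """
    labels = []
    for msg in messages:
        inside = any(start <= msg.timestamp <= end for start, end in windows)
        labels.append(1 if inside else 0)
    return labels


def class_weights(labels: List[int]) -> Dict[int, float]:
    """Generic inverse-frequency class weights (an assumption, not LogLAB's loss).

    Rare classes get larger weights so that a training objective does not
    simply favor the majority class.
    """
    total = len(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    return {c: total / (len(counts) * n) for c, n in counts.items()}


if __name__ == "__main__":
    msgs = [
        LogMessage(0.0, "service started"),
        LogMessage(5.0, "connection refused"),
        LogMessage(10.0, "heartbeat ok"),
        LogMessage(20.0, "heartbeat ok"),
    ]
    # One estimated failure window, as a monitoring system might report it.
    failure_windows = [(4.0, 6.0)]
    labels = weak_labels(msgs, failure_windows)
    print(labels)
    print(class_weights(labels))
```

A larger window would mark more normal messages as possibly abnormal, which is the situation the abstract refers to when it reports results at large failure time windows.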

Notes

1. https://github.com/dos-group/LogLAB

Author information

Authors and Affiliations

Technische Universität Berlin, DOS, TU-Berlin, Berlin, Germany
Thorsten Wittkopp, Philipp Wiesner, Dominik Scheinert & Alexander Acker

Corresponding authors

Correspondence to Thorsten Wittkopp, Philipp Wiesner, Dominik Scheinert or Alexander Acker.

Editor information

Editors and Affiliations

Zayed University, Dubai, United Arab Emirates
Hakim Hacid
Technical University of Berlin, Berlin, Germany
Odej Kao
Informatica Automatica Gestio, Sapienza University of Rome, Rome, Italy
Massimo Mecella
Departement d'Informatique, University of Quebec, Montreal, QC, Canada
Naouel Moha
UNSW Sydney, Sydney, NSW, Australia
Hye-young Paik

About this paper

Cite this paper

Wittkopp, T., Wiesner, P., Scheinert, D., Acker, A. (2021). LogLAB: Attention-Based Labeling of Log Data Anomalies via Weak Supervision. In: Hacid, H., Kao, O., Mecella, M., Moha, N., Paik, Hy. (eds) Service-Oriented Computing. ICSOC 2021. Lecture Notes in Computer Science, vol 13121. Springer, Cham. https://doi.org/10.1007/978-3-030-91431-8_46

DOI: https://doi.org/10.1007/978-3-030-91431-8_46
Published: 18 November 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-91430-1
Online ISBN: 978-3-030-91431-8
eBook Packages: Computer Science, Computer Science (R0)
