An Interactive Web-Based Toolset for Knowledge Discovery from Short Text Log Data

Stewart, Michael; Liu, Wei; Cardell-Oliver, Rachell; Griffin, Mark

doi:10.1007/978-3-319-69179-4_61

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10604)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

Many companies maintain human-written logs to capture data on events such as workplace incidents and equipment failures. However, the sheer volume and unstructured nature of this data prevent it from being utilised for knowledge acquisition. Our web-based prototype software system provides a cohesive computational methodology for analysing and visualising log data that requires minimal human involvement. It features an interface to support customisable, modularised log data processing and knowledge discovery. This enables owners of event-based datasets containing short textual descriptions, such as occupational health & safety officers and machine operators, to identify latent knowledge not previously acquirable without significant time and effort. The software system comprises five distinct stages, corresponding to standard data mining milestones: exploratory analysis, data warehousing, association rule mining, entity clustering, and predictive analysis. To the best of our knowledge, it is the first dedicated system to computationally analyse short text log data and provides a powerful interface that visualises the analytical results and supports human interaction.
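To make the association rule mining stage concrete, the following is a minimal sketch of how rules between co-occurring terms might be extracted from short log entries using the standard support and confidence measures. It is not the authors' implementation: the log entries, the stopword list, the function names and the thresholds are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical log entries, invented for illustration only.
LOGS = [
    "operator slipped on wet floor",
    "wet floor near pump caused slip",
    "pump seal failure",
    "pump seal leak on wet floor",
]

# Hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"on", "near", "caused"}


def tokenise(entry):
    """Lowercase, split on whitespace, drop stopwords; return a set of terms."""
    return {t for t in entry.lower().split() if t not in STOPWORDS}


def mine_pair_rules(entries, min_support=0.5, min_confidence=0.7):
    """Return (lhs, rhs, support, confidence) for single-term rules lhs -> rhs."""
    transactions = [tokenise(e) for e in entries]
    n = len(transactions)
    # Number of entries containing each term, and each unordered term pair.
    item_counts = Counter(t for tx in transactions for t in tx)
    pair_counts = Counter(
        pair for tx in transactions for pair in combinations(sorted(tx), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # fraction of entries containing both terms
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_counts[lhs]  # estimate of P(rhs | lhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence))
    return sorted(rules)


if __name__ == "__main__":
    for lhs, rhs, support, confidence in mine_pair_rules(LOGS):
        print(f"{lhs} -> {rhs}  support={support:.2f}  confidence={confidence:.2f}")
```

Restricting rules to term pairs keeps the sketch short. A full frequent-itemset algorithm such as Apriori would extend the same counts to larger term sets.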

Notes

1. NLTK. http://www.nltk.org/.
2. Apache UIMA Project. https://uima.apache.org/.
3. GATE. https://gate.ac.uk/.
4. Nectar. https://nectar.org.au/research-cloud/.
5. NLTK. http://www.nltk.org/.
6. D3.js. https://d3js.org/.
7. jsTree. https://www.jstree.com/.
8. D3 Cloud. https://github.com/jasondavies/d3-cloud.
9. Treant-js. https://github.com/fperucic/treant-js.

Acknowledgements

This research was funded by an Australian Postgraduate Award Scholarship and a UWA Safety Net Top-up Scholarship.

Author information

Authors and Affiliations

School of Computer Science and Software Engineering, The University of Western Australia, Perth, Australia
Michael Stewart, Wei Liu & Rachell Cardell-Oliver
School of Management and Organisations, The University of Western Australia, Perth, Australia
Mark Griffin

Corresponding authors

Correspondence to Michael Stewart or Wei Liu.

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore, Singapore
Gao Cong
National Chiao Tung University, Hsinchu, Taiwan
Wen-Chih Peng
Macquarie University, Sydney, New South Wales, Australia
Wei Emma Zhang
Wuhan University, Wuhan, China
Chengliang Li
Nanyang Technological University, Singapore, Singapore
Aixin Sun

About this paper

Cite this paper

Stewart, M., Liu, W., Cardell-Oliver, R., Griffin, M. (2017). An Interactive Web-Based Toolset for Knowledge Discovery from Short Text Log Data. In: Cong, G., Peng, WC., Zhang, W., Li, C., Sun, A. (eds) Advanced Data Mining and Applications. ADMA 2017. Lecture Notes in Computer Science, vol 10604. Springer, Cham. https://doi.org/10.1007/978-3-319-69179-4_61

DOI: https://doi.org/10.1007/978-3-319-69179-4_61
Published: 14 October 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-69178-7
Online ISBN: 978-3-319-69179-4
eBook Packages: Computer Science, Computer Science (R0)
