Extracting Event Data from Databases to Unleash Process Mining

BPM - Driving Innovation in a Digital World

Part of the book series: Management for Professionals (MANAGPROF)

Abstract

Increasingly, organizations are using process mining to understand how operational processes are actually executed. Process mining can be used to systematically drive innovation in a digitalized world. Next to the automated discovery of the real underlying process, there are process-mining techniques to analyze bottlenecks, to uncover hidden inefficiencies, to check compliance, to explain deviations, to predict performance, and to guide users towards “better” processes. Dozens (if not hundreds) of process-mining techniques are available, and their value has been proven in many case studies. However, process mining stands or falls with the availability of event logs. Existing techniques assume that events are clearly defined and refer to precisely one case (i.e., process instance) and one activity (i.e., step in the process). Although there are systems that directly generate such event logs (e.g., BPM/WFM systems), most information systems do not record events explicitly; cases and activities exist only implicitly. Yet when creating or using process models, “raw data” need to be linked to cases and activities. This paper uses a novel perspective to conceptualize a database view on event data. Starting from a class model and corresponding object models, it is shown that events correspond to the creation, deletion, or modification of objects and relations. The key idea is that events leave footprints by changing the underlying database. Based on this, an approach is described that scopes, binds, and classifies data to create “flat” event logs that can be analyzed using traditional process-mining techniques.
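To make the scope, bind, and classify steps more concrete, the following Python sketch turns a list of recorded database changes into a flat event log (one trace of activity names per case). It is only a loose illustration under stated assumptions, not the chapter's formalization: the `Change` record layout, the table names, and the `flatten` helper are all hypothetical, and the three steps are reduced to simple functions supplied by the analyst.

```python
from dataclasses import dataclass


@dataclass
class Change:
    """One footprint left in the database (hypothetical record layout)."""
    timestamp: int  # logical time of the change
    op: str         # "create", "update", or "delete"
    table: str      # class/table the changed object belongs to
    row: dict       # attribute values of the changed object


def flatten(changes, scope, bind, classify):
    """Scope, bind, and classify changes into a flat log: case id -> trace."""
    log = {}
    for change in sorted(changes, key=lambda c: c.timestamp):
        if not scope(change):        # scope: keep only relevant changes
            continue
        case_id = bind(change)       # bind: relate the change to one case
        if case_id is None:          # changes without a case are skipped
            continue
        activity = classify(change)  # classify: name the activity
        log.setdefault(case_id, []).append(activity)
    return log


# Illustrative data; tables and attributes are assumptions for this sketch.
changes = [
    Change(1, "create", "orders", {"order_id": "o1", "status": "new"}),
    Change(2, "create", "payments", {"payment_id": "p1", "order_id": "o1"}),
    Change(3, "create", "orders", {"order_id": "o2", "status": "new"}),
    Change(4, "update", "orders", {"order_id": "o1", "status": "shipped"}),
    Change(5, "update", "audit", {"note": "nightly job"}),
    Change(6, "delete", "orders", {"order_id": "o2", "status": "new"}),
]

event_log = flatten(
    changes,
    scope=lambda c: c.table in {"orders", "payments"},
    bind=lambda c: c.row.get("order_id"),
    classify=lambda c: f"{c.op} {c.table}",
)
```

Here the order identifier is chosen as the case notion, so the payment is attributed to order `o1`. Choosing a different `bind` function (for example, binding on a customer attribute) would yield a different flat log from the same changes, which is why the choice of case notion matters.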

Notes

  1. We use the term “digitalize” to emphasize the transformational character of digitized data.

  2. For example, http://www.win.tue.nl/ieeetfpm/doku.php?id=shared:process_mining_case_studies lists over 20 successful case studies in industry.

  3. Increasingly, systems mark deleted objects as not relevant (a so-called soft delete) rather than deleting them. In this way all intermediate states of the database can be reconstructed. Moreover, marking objects as deleted instead of completely removing them from the database is often more natural, e.g., concerts are not deleted, they are canceled; employees are not deleted, they are fired; etc.

  4. \( \mathcal{P}(X) \) is the powerset of \( X \), i.e., \( Y\in \mathcal{P}(X) \) if \( Y\subseteq X \).

  5. \( f\in X\nrightarrow Y \) is a partial function, i.e., the domain of \( f \) may be any subset of \( X \): \( dom(f)\subseteq X \).
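The soft-delete idea in note 3 can be illustrated with a small Python sketch. It assumes, hypothetically, that every row carries a creation time and an optional deletion time (the names `Row`, `created_at`, `deleted_at`, and `state_at` are not from the chapter). Under that assumption, the set of objects that existed at any earlier moment can be recomputed from the current table contents.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    key: str
    created_at: int
    deleted_at: Optional[int]  # None means the object was never deleted


def state_at(rows, t):
    """Return the keys of the objects that existed at time t."""
    return {
        r.key
        for r in rows
        if r.created_at <= t and (r.deleted_at is None or t < r.deleted_at)
    }


# Illustrative data: concert c2 is canceled at time 5, but its row is kept.
concerts = [
    Row("c1", created_at=1, deleted_at=None),
    Row("c2", created_at=2, deleted_at=5),
    Row("c3", created_at=4, deleted_at=None),
]

snapshots = {t: state_at(concerts, t) for t in (1, 3, 5)}
```

This sketch only recovers which objects existed at a given time. Recovering earlier attribute values as well would require additionally keeping a history of modifications, which is outside what the sketch models.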

Acknowledgements

This work was supported by the Basic Research Program of the National Research University Higher School of Economics (HSE) in Moscow.

Author information

Corresponding author

Correspondence to Wil M. P. van der Aalst.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

van der Aalst, W.M.P. (2015). Extracting Event Data from Databases to Unleash Process Mining. In: vom Brocke, J., Schmiedel, T. (eds) BPM - Driving Innovation in a Digital World. Management for Professionals. Springer, Cham. https://doi.org/10.1007/978-3-319-14430-6_8
