BP-SPARQL: A Query Language for Summarizing and Analyzing Big Process Data

Process Querying Methods

Abstract

In modern enterprises, business processes (BPs) are realized over a mix of workflows, IT systems, Web services, and direct collaborations of people. Accordingly, process data (i.e., BP execution data such as logs containing events, interaction messages, and other process artifacts) are scattered across several systems and data sources and increasingly show all the typical properties of Big Data. Understanding process execution data is challenging because key business insights remain hidden in the interactions among process entities: most objects are interconnected, forming complex, heterogeneous, and often semi-structured networks. In the context of business processes, we consider the Big Data problem as a massive number of interconnected data islands drawn from personal, shared, and business data. We present a framework to model process data as graphs, i.e., process graphs, and present abstractions to summarize a process graph and to discover concept hierarchies for entities based on both data objects and their interactions in process graphs. We present a language, namely BP-SPARQL, for the explorative querying and understanding of process graphs from various user perspectives. We have implemented a scalable architecture for querying, exploration, and analysis of process graphs. We report on experiments performed on both synthetic and real-world datasets that show the viability and efficiency of the approach.
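The idea of modeling process data as a graph and then summarizing it can be illustrated with a minimal Python sketch. It is not BP-SPARQL and not the authors' implementation: the event log, the entity types, and the function names build_process_graph and summarize are all hypothetical, chosen only to show how entities and their interactions might be grouped into a coarser summary graph.

```python
from collections import defaultdict

# Hypothetical event log: (source entity, relationship, target entity).
events = [
    ("alice", "submitted", "order-1"),
    ("bob", "approved", "order-1"),
    ("alice", "submitted", "order-2"),
    ("carol", "approved", "order-2"),
]

# Hypothetical entity types used to group nodes into summary nodes.
entity_type = {
    "alice": "Person", "bob": "Person", "carol": "Person",
    "order-1": "Order", "order-2": "Order",
}


def build_process_graph(log):
    """Return an adjacency map: node -> list of (edge label, neighbour)."""
    graph = defaultdict(list)
    for source, label, target in log:
        graph[source].append((label, target))
    return dict(graph)


def summarize(log, types):
    """Collapse nodes by type and count the edges between each pair of groups."""
    summary = defaultdict(int)
    for source, label, target in log:
        summary[(types[source], label, types[target])] += 1
    return dict(summary)


process_graph = build_process_graph(events)
summary_graph = summarize(events, entity_type)
```

Here process_graph keeps every individual interaction, while summary_graph keeps one weighted edge per pair of entity types and relationship, which is the kind of coarser view a summarization abstraction aims for.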



Author information

Corresponding author

Correspondence to Amin Beheshti.

Copyright information

© 2022 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Beheshti, A., Benatallah, B., Motahari-Nezhad, H.R., Ghodratnama, S., Amouzgar, F. (2022). BP-SPARQL: A Query Language for Summarizing and Analyzing Big Process Data. In: Polyvyanyy, A. (eds) Process Querying Methods. Springer, Cham. https://doi.org/10.1007/978-3-030-92875-9_2

  • DOI: https://doi.org/10.1007/978-3-030-92875-9_2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-92874-2

  • Online ISBN: 978-3-030-92875-9

  • eBook Packages: Computer Science, Computer Science (R0)
