An Incremental Learning Method to Support the Annotation of Workflows with Data-to-Data Relations

  • Conference paper
Knowledge Engineering and Knowledge Management (EKAW 2016)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10024)

Abstract

Workflow formalisations often focus on representing a process, with the primary objective of supporting its execution. However, in some scenarios what needs to be represented is the effect of the process on the data artefacts involved, for example when reasoning over the corresponding data policies. This can be achieved by annotating the workflow with the semantic relations that occur between these data artefacts. However, producing such annotations manually is difficult and time-consuming. In this paper we introduce a recommendation-based method to support users in this task. Our approach is centred on an incremental association rule mining technique that compensates for the cold-start problem caused by the lack of a training set of annotated workflows. We discuss the implementation of a tool relying on this approach and how its application to an existing repository of workflows effectively enables the generation of such annotations.

Notes

  1. myExperiment: http://www.myexperiment.org/.

  2. W3C PROV: https://www.w3.org/TR/prov-overview/.

  3. OPMW: http://www.opmw.org/.

  4. PWO: http://purl.org/spar/pwo.

  5. Wings: http://www.wings-workflows.org/.

  6. myExperiment: http://www.myexperiment.org/.

  7. SHIWA: http://www.shiwa-workflow.eu/wiki/-/wiki/Main/SHIWA+Repository.

  8. SCUFL2: https://taverna.incubator.apache.org/documentation/scufl2/.

  9. Datanode: http://purl.org/datanode/ns/.

  10. In this paper we use the terminology of the SCUFL2 specification. However, the basic structure is a common one: in the W3C PROV-O model this concept maps to the class Activity, in PWO to Step, and in OPMW to WorkflowExecutionProcess, to mention just a few examples.

  11. “LipidMaps Query” workflow from myExperiment: http://www.myexperiment.org/workflows/1052.html.

  12. Dinowolf: http://github.com/enridaga/dinowolf.

  13. SCUFL2 Specification: https://taverna.incubator.apache.org/documentation/scufl2/.

  14. Apache Taverna: https://taverna.incubator.apache.org/.

  15. Apache Lucene: https://lucene.apache.org/core/.

  16. DBPedia Spotlight: http://spotlight.dbpedia.org/.

  17. DBPedia: http://dbpedia.org/.

  18. myExperiment: http://www.myexperiments.org.

Author information

Corresponding author

Correspondence to Enrico Daga.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Daga, E., d’Aquin, M., Gangemi, A., Motta, E. (2016). An Incremental Learning Method to Support the Annotation of Workflows with Data-to-Data Relations. In: Blomqvist, E., Ciancarini, P., Poggi, F., Vitali, F. (eds) Knowledge Engineering and Knowledge Management. EKAW 2016. Lecture Notes in Computer Science, vol 10024. Springer, Cham. https://doi.org/10.1007/978-3-319-49004-5_9

  • DOI: https://doi.org/10.1007/978-3-319-49004-5_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-49003-8

  • Online ISBN: 978-3-319-49004-5

  • eBook Packages: Computer Science, Computer Science (R0)
