Modeling and Reasoning over Data Licenses

Panasiuk, Oleksandra; Steyskal, Simon; Havur, Giray; Fensel, Anna; Kirrane, Sabrina

doi:10.1007/978-3-319-98192-5_41

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11155)

Included in the following conference series:

European Semantic Web Conference

Abstract

In this paper, we propose an extension of the Open Digital Rights Language (ODRL) for modeling well-known licenses, together with an approach for automatically checking license compatibility.

1 Introduction

Copyright is a legal right, under Intellectual Property law, that enables creators of artistic works to specify how their work is used and distributed. When it comes to information available on the Internet, there is often a misconception that public information can be freely copied and downloaded. However, creative works available online are also protected by copyright, irrespective of whether a license is present or not. In order to support the automatic checking of licenses, it is necessary to model them in such a way that it is possible to automatically verify whether it is permissible to combine and reuse different datasets or software libraries. When it comes to machine-readable licenses, there have been a number of Rights Expression Language standardisation initiatives (e.g. the Open Digital Rights Language (ODRL) and the Creative Commons Rights Expression Language (ccREL)). In addition, there have been a number of works that demonstrate how RDF can be used to represent and reason over licenses [2, 4, 7]. In this paper, we describe work conducted by the Data Licenses Clearance Center (DALICC) project, which focuses on extending existing vocabularies to enable modeling and reasoning over well-known license texts. Herein we make the following contributions: (i) we extend ODRL so that it can be used to model several standard license families (CC, BSD, MIT, GPL); and (ii) we propose a system to automatically check license compatibility.

2 Related Work

Rights Expression Languages (RELs) are used to express machine-readable rights for the purposes of Digital Asset Management. Among the most prominent REL vocabularies are ccREL (a W3C member submission), ODRL (a W3C recommendation since February 2018), and RightsML, a derivative of ODRL. Besides standardization, other work includes an OWL ontology that can be used to describe the copyright domain [2], a framework for adding licensing terms to web data [7] and a license composition tool for derivative works [4].

3 License Modeling

Our modeling process comprises three parts: (i) analysis of the license text; (ii) definition of vocabularies to express licenses; and (iii) derivation of modeling and mapping mechanisms. An example output of this process can be seen in Listing 1.1.

Analysis of the License Text Representation. For our analysis we selected 14 commonly used licenses, namely: CC BY, CC BY-SA, CC BY-NC, CC BY-ND, CC BY-NC-ND, CC BY-NC-SA, APACHE, BSD-2, BSD-3, GNU GPL-2, GNU GPL-3, AGPL, LGPL and MIT, which can be applied to different kinds of assets, such as creative works, software and datasets. From the text representation we identified important concepts, requirements, and conflicts between licenses.

Defining Vocabularies. Based on research conducted on the genealogy of RELs [5], we chose ODRL, as it is particularly suitable for modeling licenses in the form of policies. A policy expresses permissions, prohibitions and duties related to the usage of assets (e.g. the actions odrl:reproduce and odrl:distribute can be applied to the target "Image"). To represent the main asset targets we used the Dublin Core vocabulary, which covers concepts such as software, dataset, sound, text and image. Furthermore, the ODRL vocabulary includes terms that are deprecated in favor of terms from ccREL (e.g. odrl:commercialize in favor of cc:CommercialUse) or are supplemented by terms from ccREL (e.g. cc:Notice to capture copyright information). However, given that the ODRL and ccREL vocabularies together are not able to represent all of the necessary license concepts, we constructed a DALICC vocabulary in order to fill this gap (e.g. dalicc:perpetual as a validity period of the license, dalicc:worldwide as a jurisdictional property, and dalicc:modificationNotice as an action to state changes; see Listing 1.1).

Modeling and Mapping Mechanisms. When it comes to modeling licenses, we use provenance to model information about assets (e.g. odrl:target dct:Software) and additional information about the license (e.g. cc:jurisdiction dalicc:worldwide), and ODRL rules to represent common licensing conditions, divided into three categories: permissions, duties and prohibitions. An RDF representation of the APACHE 2.0 license is shown in Listing 1.1. The license permits redistribution, reproduction, modification, public presentation of the asset, commercial use, charging a distribution fee, creation of a new derivative, distribution and changing the license for a derivative work, but prohibits charging a licensing fee. The license requires the user to post a notice of the type of license, to give attribution to the creator and to state changes.
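The three rule categories described above can be illustrated with a small data-structure sketch. It is not the RDF listing from the paper: the Python classes are purely illustrative, the `ex:` identifiers (`ex:chargeDistributionFee`, `ex:changeLicense`, `ex:chargeLicenseFee`, `ex:validity`) are hypothetical placeholders for terms the text describes only in prose, and the mapping of the remaining conditions to `odrl:`, `cc:` and `dalicc:` terms is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rule:
    category: str                 # "permission", "duty" or "prohibition"
    action: str                   # prefixed name; the ex: names are hypothetical
    target: str = "dct:Software"  # asset type the rule applies to


@dataclass
class LicensePolicy:
    name: str
    metadata: dict = field(default_factory=dict)
    rules: list = field(default_factory=list)

    def actions(self, category):
        """Return the sorted action names of all rules in one category."""
        return sorted(r.action for r in self.rules if r.category == category)


def apache2_sketch():
    """Build an illustrative policy following the prose description of APACHE 2.0."""
    policy = LicensePolicy(
        name="Apache-2.0",
        metadata={
            "cc:jurisdiction": "dalicc:worldwide",
            "ex:validity": "dalicc:perpetual",  # property name is an assumption
        },
    )
    permitted = [
        "odrl:distribute", "odrl:reproduce", "odrl:modify", "odrl:present",
        "cc:CommercialUse", "odrl:derive",
        "ex:chargeDistributionFee", "ex:changeLicense",
    ]
    required = ["cc:Notice", "cc:Attribution", "dalicc:modificationNotice"]
    prohibited = ["ex:chargeLicenseFee"]
    policy.rules += [Rule("permission", a) for a in permitted]
    policy.rules += [Rule("duty", a) for a in required]
    policy.rules += [Rule("prohibition", a) for a in prohibited]
    return policy
```

Calling `apache2_sketch().actions("duty")` lists the three obligations named in the text (notice, attribution, stating changes).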

4 Verifying License Compatibility

The license compatibility check is performed by a reasoning engine, which uses Answer Set Programming (ASP) [1], a declarative knowledge representation and reasoning formalism that is supported by a wide range of efficient solvers. An ASP program consists of rules of the form \(\mathit{Head} \leftarrow A_1, \ldots, A_m, \mathit{not}~A_{m+1}, \ldots, \mathit{not}~A_n\), where \(n \ge m \ge 0\) and \(\mathit{Head}\) and each \(A_i\) are atoms. A rule is called a fact if \(m = n = 0\). Sets of rules are evaluated in ASP under the stable-model semantics, which allows several models, i.e. "answer sets" [1]. We use the clingo [3] ASP solver for our experiments, as it is one of the most efficient implementations available.
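To make the stable-model semantics concrete, the following is a minimal brute-force sketch for tiny ground programs. It is not how clingo works internally and is not taken from the paper; it only illustrates the definition: a candidate set of atoms is an answer set if it equals the least model of the program's reduct with respect to that candidate. The rule encoding as `(head, positive_body, negative_body)` tuples is an assumption made for this example.

```python
from itertools import chain, combinations


def least_model(positive_rules):
    """Least model of a negation-free program given as (head, positive_body) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if head not in model and set(pos) <= model:
                model.add(head)
                changed = True
    return model


def is_stable(program, candidate):
    """Check a candidate against the least model of its reduct."""
    # Reduct: drop every rule whose negative body intersects the candidate,
    # then delete the negative literals from the remaining rules.
    reduct = [(h, pos) for h, pos, neg in program if not (set(neg) & candidate)]
    return least_model(reduct) == candidate


def answer_sets(program):
    """Enumerate all answer sets by testing every subset of atoms (exponential)."""
    atoms = sorted({a for h, pos, neg in program for a in (h, *pos, *neg)})
    subsets = chain.from_iterable(
        combinations(atoms, k) for k in range(len(atoms) + 1)
    )
    return [set(s) for s in subsets if is_stable(program, set(s))]


program = [
    ("p", (), ("q",)),  # p <- not q.
    ("q", (), ("p",)),  # q <- not p.
    ("r", ("p",), ()),  # r <- p.
]
```

For this program, `answer_sets(program)` yields two answer sets, `{q}` and `{p, r}`, which shows why the semantics "allows several models".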

Licenses should be understood as sets of rules derived from the RDF graphs of the licenses. Herein, a rule that permits or prohibits the execution of an action on certain assets does not only affect other rules that govern the execution of the same action on the same asset(s), but also those permitting or prohibiting related actions on the same asset(s). DALICC utilises a dependency graph for representing the semantic relationships between defined actions (cf. Listing 1.2). The function of this graph is to encode expert knowledge on the implicit and explicit dependencies between actions. Following the work of Steyskal and Polleres [6], the dependency graph represents hierarchical relationships (e.g., present includes display), implications derived from a specific action (e.g., share implies distribute), equalities (e.g., copy equals reproduce), and contradictions between specific actions (e.g., non-derivative contradicts derivative).
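A dependency graph of this kind can be sketched as a labelled adjacency structure. The edges below are exactly the four examples given in the text; the Python representation, the function names, and the decision to store "equals" and "contradicts" in both directions are assumptions made for illustration and do not reproduce Listing 1.2.

```python
from collections import defaultdict

# (subject, relation, object) edges taken from the examples in the text.
EDGES = [
    ("present", "includes", "display"),
    ("share", "implies", "distribute"),
    ("copy", "equals", "reproduce"),
    ("non-derivative", "contradicts", "derivative"),
]

# Assumption: these two relations are symmetric, the other two are directed.
SYMMETRIC = {"equals", "contradicts"}


def build_graph(edges):
    """Map each action to the set of (relation, other_action) pairs leaving it."""
    graph = defaultdict(set)
    for subj, rel, obj in edges:
        graph[subj].add((rel, obj))
        if rel in SYMMETRIC:
            graph[obj].add((rel, subj))
    return graph


def related(graph, action, relation):
    """Return the sorted actions reachable from `action` over one `relation` edge."""
    return sorted(obj for rel, obj in graph.get(action, ()) if rel == relation)
```

With `g = build_graph(EDGES)`, the call `related(g, "reproduce", "equals")` returns `["copy"]`, while `related(g, "display", "includes")` is empty because "includes" is stored as a directed edge.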

In order to verify license compatibility, the RDF representations of the licenses are first translated into an ASP program that uses the following predicates: (i) \(\mathtt{rule}(l, c, i, \alpha, t)\), a rule in a license \(l\) of category \(c\) (i.e. permission, prohibition or duty) is granted to an assignee \(i\) for executing an action \(\alpha\) on the asset \(t\); (ii) \(\mathtt{action}(\alpha)\), \(\alpha\) is an action; (iii) \(\mathtt{sameAs}(\alpha_1, \alpha_2)\), \(\alpha_1\) and \(\alpha_2\) are the same action; (iv) \(\mathtt{includedIn}(\alpha_1, \alpha_2)\), action \(\alpha_1\) is included in action \(\alpha_2\); (v) \(\mathtt{implies}(\alpha_1, \alpha_2)\), action \(\alpha_1\) implies action \(\alpha_2\).
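The translation step can be sketched as a function that emits ASP facts for the five predicates listed above. The paper does not show its concrete encoding, so this is only an assumed one: all arguments are written as quoted string terms (so that prefixed names such as `odrl:distribute` remain valid ASP syntax), names are assumed to contain no double quotes, and the helper names `fact` and `translate` are hypothetical.

```python
def fact(predicate, *args):
    """Render one ASP fact with every argument as a quoted string term."""
    quoted = ",".join('"{}"'.format(a) for a in args)
    return "{}({}).".format(predicate, quoted)


def translate(license_id, rules, same_as=(), included_in=(), implies=()):
    """Turn license rules and action relations into ASP fact lines.

    `rules` holds (category, assignee, action, target) tuples; the relation
    arguments hold (action1, action2) pairs.
    """
    lines = []
    actions = set()
    for category, assignee, action, target in rules:
        lines.append(fact("rule", license_id, category, assignee, action, target))
        actions.add(action)
    relations = (("sameAs", same_as), ("includedIn", included_in), ("implies", implies))
    for predicate, pairs in relations:
        for a1, a2 in pairs:
            lines.append(fact(predicate, a1, a2))
            actions.update((a1, a2))
    # Declare every action that was mentioned anywhere.
    lines.extend(fact("action", a) for a in sorted(actions))
    return "\n".join(lines)
```

The resulting text could be passed to an ASP solver together with the conflict-detection rules; how DALICC wires this up is not described in the source.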

Our ASP program returns an answer set that consists of atoms of the predicate \(\mathtt{conflict}(rule_1(l_1, c_1, i_1, \alpha_1, t_1), rule_2(l_2, c_2, i_2, \alpha_2, t_2))\), which means that \(rule_1\) is in conflict with \(rule_2\) (i.e., \(l_1\) does not comply with \(l_2\)). In ODRL, if an action \(\alpha_1\) is included in or equal to another action \(\alpha_2\) (\(\alpha_1\) odrl:includedIn|owl:sameAs \(\alpha_2\)), all the rules defined for \(\alpha_2\) must also hold for \(\alpha_1\) and vice versa. Moreover, if an action \(\alpha_1\) implies another action \(\alpha_2\) (\(\alpha_1\) odrl:implies \(\alpha_2\)), a prohibition of \(\alpha_2\) conflicts with a permission of \(\alpha_1\) (but not necessarily vice versa).
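The conflict conditions stated above can be sketched procedurally. This is a plain Python approximation, not the authors' ASP program: it only considers permission/prohibition pairs from different licenses on the same target, ignores assignees and duties, and uses direct relations without computing transitive closures. All of these simplifications are assumptions.

```python
from itertools import product


def conflicts(rules, same_as=(), included_in=(), implies=()):
    """Return (permission_rule, prohibition_rule) pairs that clash.

    Each rule is a (license, category, assignee, action, target) tuple.
    """
    # Action pairs (a1, a2) for which permitting a1 clashes with prohibiting a2.
    clash = set()
    for a1, a2 in list(same_as) + list(included_in):
        clash.add((a1, a2))
        clash.add((a2, a1))  # the text states these hold "and vice versa"
    clash.update(implies)    # implication is used in one direction only

    found = []
    for r1, r2 in product(rules, repeat=2):
        l1, c1, _, a1, t1 = r1
        l2, c2, _, a2, t2 = r2
        if l1 == l2 or t1 != t2:
            continue
        if c1 == "permission" and c2 == "prohibition":
            if a1 == a2 or (a1, a2) in clash:
                found.append((r1, r2))
    return found
```

For example, if one license permits `share`, another prohibits `distribute` on the same asset, and `share` implies `distribute`, the pair is reported; swapping the two actions yields no conflict, matching the "not necessarily vice versa" remark.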

Given multiple licenses as input, an extended version of this program is capable of finding all non-conflicting sets of permissions, prohibitions, and duties of those licenses. These reasoning functionalities are accessed via a UI in a web service.

5 Conclusion

In this paper, we discussed how well-known licenses can be modeled using ODRL. We analyzed 14 licenses in total and extended existing vocabularies so that we can both model and check the compatibility of licenses automatically.

References

1. Brewka, G., Eiter, T., Truszczyński, M.: Answer set programming at a glance. Commun. ACM 54(12), 92–103 (2011)
2. García, R., Gil, R.: Copyright licenses reasoning using an OWL-DL ontology. Law Ontol. Semant. Web Channelling Legal Inf. Flood 188, 145–162 (2009)
3. Gebser, M., Kaminski, R., Kaufmann, B., Schaub, T.: Clingo = ASP + control: extended report (2014)
4. Governatori, G., Lam, H.-P., Rotolo, A., Villata, S., Atemezing, G.A., Gandon, F.L.: LIVE: a tool for checking licenses compatibility between vocabularies and data. In: International Semantic Web Conference (2014)
5. Pellegrini, T., et al.: A genealogy and classification of rights expression languages - preliminary results. In: Data Protection/LegalTech - Proceedings of the 21st International Legal Informatics Symposium IRIS 2018, pp. 243–250 (2018)
6. Steyskal, S., Polleres, A.: Towards formal semantics for ODRL policies. In: Bassiliades, N., Gottlob, G., Sadri, F., Paschke, A., Roman, D. (eds.) RuleML 2015. LNCS, vol. 9202, pp. 360–375. Springer, Cham (2015). https://doi.org/10.1007/978-3-319-21542-6_23
7. Villata, S., Gandon, F.: Licenses compatibility and composition in the web of data. In: Third International Workshop on Consuming Linked Data (2012)

Acknowledgments

Funded by the Austrian Federal Ministry of Transport, Innovation and Technology (BMVIT) DALICC project https://www.dalicc.net.

Author information

Authors and Affiliations

STI Innsbruck, Department of Computer Science, University of Innsbruck, Innsbruck, Austria
Oleksandra Panasiuk & Anna Fensel
Vienna University of Economics and Business, Vienna, Austria
Simon Steyskal, Giray Havur & Sabrina Kirrane
Siemens AG Österreich, Vienna, Austria
Simon Steyskal & Giray Havur

Corresponding authors

Correspondence to Oleksandra Panasiuk, Simon Steyskal, Giray Havur, Anna Fensel or Sabrina Kirrane.

Editor information

Editors and Affiliations

University of Bologna, Bologna, Italy
Aldo Gangemi
IBM Research - Almaden, San Jose, CA, USA
Anna Lisa Gentile
CNR-ISTC, Rome, Italy
Andrea Giovanni Nuzzolese
Technische Universität Dresden, Dresden, Germany
Sebastian Rudolph
Karlsruhe Institute of Technology, Karlsruhe, Germany
Maria Maleshkova
University of Mannheim, Mannheim, Germany
Heiko Paulheim
University of Aberdeen, Aberdeen, UK
Jeff Z Pan
CNR-ISTC, Rome, Italy
Mehwish Alam

Cite this paper

Panasiuk, O., Steyskal, S., Havur, G., Fensel, A., Kirrane, S. (2018). Modeling and Reasoning over Data Licenses. In: Gangemi, A., et al. The Semantic Web: ESWC 2018 Satellite Events. ESWC 2018. Lecture Notes in Computer Science, vol 11155. Springer, Cham. https://doi.org/10.1007/978-3-319-98192-5_41

DOI: https://doi.org/10.1007/978-3-319-98192-5_41
Published: 02 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98191-8
Online ISBN: 978-3-319-98192-5
eBook Packages: Computer Science, Computer Science (R0)
