1 Introduction

Modern data management has re-discovered the power and flexibility of graph-based representation formats, and so-called knowledge graphs are now used in many practical applications, e.g., in companies such as Google or Facebook. The shift towards graphs is motivated by the need for integrating knowledge from a variety of heterogeneous sources into a common format.

Description logics (DLs) seem to be an excellent fit for this scenario, since they can express complex schema information on graph-like models, while supporting incomplete information via the open world assumption. Ontology-based query answering has become an important research topic, with many recent results and implementations, and the W3C OWL and SPARQL standards provide a basis for practical adoption. One would therefore expect to encounter DLs in many applications of knowledge graphs.

However, this is not the case. While OWL is often used in RDF-based knowledge graphs developed in academia, such as DBpedia [4] and Bio2RDF [3], it has almost no impact on other applications of graph-structured data. This might in part be due to a format mismatch. Like DLs, many knowledge graphs use directed, labelled graph models, but unlike DLs they often add (sets of) annotations to vertices and edges. For example, the fact that Liz Taylor married Richard Burton can be described by an assertion \(\mathsf {spouse}(\mathsf {taylor},\mathsf {burton})\), but in practice we may also wish to record that they married in 1964 in Montreal, and that the marriage ended in 1974. We may write this as follows:

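$$\begin{aligned} \mathsf {spouse}(\mathsf {taylor},\mathsf {burton})\varvec{@}[{\mathsf {start}\mathrel {:}1964,\mathsf {location}\mathrel {:}\mathsf {Montreal},\mathsf {end}\mathrel {:}1974}] \end{aligned}$$
(1)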

Such annotated graph edges today are widespread in practice. Prominent representatives include Property Graph, the data model used in many graph databases [19], and Wikidata, the knowledge graph used by Wikipedia [24]. Looking at Wikidata as one of the few freely accessible graphs outside academia, we obtain several requirements:

  • No single purpose. Annotations are used for many modelling tasks. Expected cases such as validity time and provenance are important, but they are far from the only uses, as (1) (taken from Wikidata) illustrates. Besides start, end, and location, over 150 other attributes are used at least 1000 times as annotations on Wikidata.

  • Multi-graphs. It can be necessary to include the same assertion multiple times with different annotations. For example, Wikidata in addition to (1) also includes the assertion \(\mathsf {spouse}(\mathsf {taylor},\mathsf {burton})\varvec{@}[{\mathsf {start}\mathrel {:}1975,\mathsf {end}\mathrel {:}1976}]\). Such multi-graphs are also supported by Property Graph, but not by logics with functional annotations, such as semi-ring approaches [9, 22] and aRDF [23].

  • Multi-attribute annotations. Wikidata (but not Property Graph) further supports annotations where the same attribute has more than one value. For example, Wikidata includes the assertion \(\mathsf {castMember}(\mathsf {Sesame\_Street},\mathsf {Frank\_Oz})\varvec{@}[{\mathsf {role}\mathrel {:}\mathsf {Bert}, \mathsf {role}\mathrel {:}\mathsf {Cookie\_Monster},\mathsf {role}\mathrel {:}\mathsf {Grover}}]\).

One can encode annotated (multi-)graphs as directed graphs, e.g., using reification [8], but DLs cannot express much over such a model. For example, one cannot say that the \(\mathsf {spouse}\) relation is symmetric, where annotations are the same in both directions [16]. Other traditional KR formalisms are similarly challenged in this situation.

In recent work, we have therefore proposed to develop logics that support sets of attribute–value annotations natively [16]. The corresponding generalisation of first-order logic, called multi-attribute predicate logic (MAPL), is expressive enough to capture weak second-order logic, making reasoning non-semi-decidable. For that reason, we have developed the Datalog-like MAPL rule language (MARPL) as a decidable fragment.

In this paper, we explore the use of description logics as a basis for decidable, and even tractable, fragments of MAPL. The resulting family of attributed DLs allows statements such as \(\mathsf {spouse}\varvec{@}X \sqsubseteq \mathsf {spouse}^-\varvec{@}X\) to say that spouse is symmetric. We introduce set variables (X in the example) to refer to annotations. These variables are used to express constraints over annotations and to compare attribute values between them. A challenge is to add functionality of this type without giving up the nature of a DL.

Another challenge is that these extensions may greatly increase the complexity of DLs. We show that reasoning becomes 2ExpTime-complete for attributed \(\mathcal {ALCH} \), a prototypical DL; ExpTime-complete for attributed \(\mathcal {EL} \), a DL close to OWL 2 EL; and N2ExpTime-complete for attributed \(\mathcal {SROIQ}\), the DL underlying OWL 2 DL. Slight extensions of our DLs even lead to undecidability. We develop syntactic constraints to recover lower complexities, including PTime-completeness for attributed \(\mathcal {EL}\).

For readability, some proofs are only sketched out in this paper or have been omitted entirely. Full versions can be found in the technical report [14].

2 Attributed Description Logics

We introduce attributed description logics by defining the syntax and semantics of attributed \(\mathcal {ALCH}\), denoted \(\mathcal {ALCH_{@\varvec{+}}}\). This allows us to illustrate the central ideas without having to deal with the full generality of \(\mathcal {SROIQ}\), which we introduce in Sect. 6. We note that fact entailment can be polynomially reduced to satisfiability in the DLs we study.

2.1 Syntax and Intuition

We first give the syntax and intuitive semantics of \(\mathcal {ALCH_{@\varvec{+}}}\); the semantics will be formalised thereafter.

Example 1

We start with a guiding example, which will be formally explained when we define \(\mathcal {ALCH_{@\varvec{+}}}\). Wikidata contains assertions of the form \(\mathsf {educatedAt}(\mathsf {a\_person},\mathsf {a\_university})\varvec{@}[{\mathsf {start}\mathrel {:}\mathsf {2005},\mathsf {end}\mathrel {:}\mathsf {2009},\mathsf {degree}\mathrel {:}\mathsf {master}}]\). This motivates the following \(\mathcal {ALCH_{@\varvec{+}}}\) axiom:

$$\begin{aligned} X:\lfloor {\mathsf {degree}\mathrel {:}\mathsf {master}}\rfloor \quad&\big (\exists \mathsf {educatedAt}\varvec{@}X.\mathsf {University}&\sqsubseteq \mathsf {MSc}\varvec{@}[{\mathsf {start}\mathrel {:}X.\mathsf {end}}]\big ) \end{aligned}$$
(2)

The underlying DL axiom is \(\exists \mathsf {educatedAt}.\mathsf {University}\sqsubseteq \mathsf {MSc}\), stating that anybody educated at some university holds an M.Sc. Axiom (2) restricts this to \(\mathsf {educatedAt}\) assertions whose annotations X specify the degree to be a master, where X may contain further attribute–value pairs. Indeed, if X specifies an end date for the education, then this is used as a start for the entailed \(\mathsf {MSc}\) assertion. Similarly, we may express that a person that was \(\mathsf {educatedAt}\) some institution (where the \(\mathsf {degree}\) attribute has some value) obtained a degree from this institution:

$$\begin{aligned} \mathsf {educatedAt}\varvec{@}\lfloor {\mathsf {degree}\mathrel {:}\varvec{+}}\rfloor \sqsubseteq \mathsf {obtainedDegreeFrom} \end{aligned}$$
(3)

Attributed DLs are defined over the usual DL signature with sets of concept names \(\mathsf {N_{C}}\), role names \(\mathsf {N_{R}}\), and individual names \(\mathsf {N_{I}}\). In OWL terminology, concepts correspond to classes, roles correspond to properties, and individual names correspond to individuals. We consider an additional set \(\mathsf {N_{V}}\) of (set) variables. Following the definition of multi-attributed predicate logic (MAPL, [16]), we define annotation sets as finite binary relations, understood as sets of attribute–value pairs. In particular, attributes refer to domain elements and are syntactically denoted by individual names. To describe annotation sets, we introduce specifiers. The set \(\mathbf {S}\) of specifiers contains the following expressions:

  • set variables \(X\in \mathsf {N_{V}} \);

  • closed specifiers \([{a_{1} \mathrel {:}v_{1}, \ldots , a_{n} \mathrel {:}v_{n}}]\); and

  • open specifiers \(\lfloor {a_{1} \mathrel {:}v_{1}, \ldots , a_{n} \mathrel {:}v_{n}}\rfloor \),

where \(a_i \in \mathsf {N_{I}} \) and \(v_i\) is either \(\varvec{+}\), an individual name in \(\mathsf {N_{I}}\), or an expression of the form \(X.c\), with X a set variable in \(\mathsf {N_{V}} \) and c an individual name in \(\mathsf {N_{I}} \). Intuitively, closed specifiers define specific annotation sets whereas open specifiers merely provide lower bounds. We use \(\varvec{+}\) for “one or more” values, while \(X.c\) refers to the (finite, possibly empty) set of all values of attribute c in an annotation set X. A ground specifier is a specifier that does not contain expressions of the form \(X.c\).

Example 2

The open specifier \(\lfloor {\mathsf {degree}\mathrel {:}\mathsf {master}}\rfloor \) in Example 1 describes all annotation sets with at least the given attribute–value pair. The closed specifier \([{\mathsf {start}\mathrel {:}X.\mathsf {end}}]\) denotes the (unique) annotation set with \(\mathsf {start}\) as the only attribute, having exactly the values given for attribute \(\mathsf {end}\) in X.

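The set \(\mathbf {R}\) of \(\mathcal {ALCH_{@\varvec{+}}}\) role expressions contains all expressions \(r\varvec{@}S\) with \(r\in \mathsf {N_{R}} \) and \(S\in \mathbf {S}\). The set \(\mathbf {C}\) of \(\mathcal {ALCH_{@\varvec{+}}}\) concept expressions is defined as follows:

$$\begin{aligned} \mathbf {C} \mathrel {::=} \top \mid \bot \mid \mathsf {N_{C}} \varvec{@}\mathbf {S}\mid \lnot \mathbf {C}\mid \mathbf {C}\sqcap \mathbf {C}\mid \mathbf {C}\sqcup \mathbf {C}\mid \exists \mathbf {R}.\mathbf {C}\mid \forall \mathbf {R}.\mathbf {C} \end{aligned}$$
(4)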

An \(\mathcal {ALCH_{@\varvec{+}}}\) concept (or role) assertion is an expression \(A(a)\varvec{@}S\) (or \(r(a,b)\varvec{@}S\)), with \(A\in \mathsf {N_{C}} \) (or \(r\in \mathsf {N_{R}} \)), \(a,b\in \mathsf {N_{I}} \), and \(S\in \mathbf {S}\) a specifier that is not a set variable. An \(\mathcal {ALCH_{@\varvec{+}}}\) concept inclusion is an expression of the form

$$\begin{aligned} {X_1}\!\varvec{:}\!{S_1}, \ldots , {X_n}\!\varvec{:}\!{S_n}\quad (C\sqsubseteq D), \end{aligned}$$
(5)

where \(C,D\in \mathbf {C}\) are \(\mathcal {ALCH_{@\varvec{+}}}\) concept expressions, \(S_1,\ldots , S_n\in \mathbf {S}\) are specifiers, and \(X_1,\ldots , X_n\in \mathsf {N_{V}} \) are set variables occurring in C, D or in \(S_1,\ldots , S_n\). \(\mathcal {ALCH_{@\varvec{+}}}\) role inclusions are defined analogously, but with role expressions instead of the concept expressions. An \(\mathcal {ALCH_{@\varvec{+}}}\) ontology is a set of \(\mathcal {ALCH_{@\varvec{+}}}\) assertions, and role and concept inclusions.

To simplify notation, we omit the specifier \(\lfloor {}\rfloor \) (meaning “any annotation set”) in role or concept expressions, as done for \(\mathsf {University}\) in Example 1. In this sense, any \(\mathcal {ALCH}\) axiom is also an \(\mathcal {ALCH_{@\varvec{+}}}\) axiom. Moreover, we omit prefixes of the form \({X}\!\varvec{:}\!{\lfloor {}\rfloor }\), which merely state that X might be any annotation set.

We follow the usual DL notation for referring to other attributed DLs, where we add symbols to the DL name to indicate additional features, and remove symbols to indicate restrictions. Thus, \(\mathcal {ALC_{@\varvec{+}}} \) denotes \(\mathcal {ALCH_{@\varvec{+}}} \) without role hierarchies, and \(\mathcal {ALCH_{@}} \) corresponds to the fragment of \(\mathcal {ALCH_{@\varvec{+}}} \) that disallows \(\varvec{+}\) in specifiers.

2.2 Formal Semantics

As usual in DLs, an interpretation \(\mathcal {I}=\langle {\Delta ^\mathcal {I},\cdot ^\mathcal {I}}\rangle \) consists of a domain \(\Delta ^\mathcal {I}\) and an interpretation function \(\cdot ^\mathcal {I}\). Individual names \(c\in \mathsf {N_{I}} \) are interpreted as elements \(c^\mathcal {I}\in \Delta ^\mathcal {I}\). Concepts and roles are interpreted as relations that here include annotation sets:

  • \(A^\mathcal {I}\subseteq \Delta ^\mathcal {I}\times \mathcal {P}_{\textsf {fin}}\left( \Delta ^\mathcal {I}\times \Delta ^\mathcal {I}\right) \) for a concept \(A\in \mathsf {N_{C}} \), and

  • \(r^\mathcal {I}\subseteq (\Delta ^\mathcal {I}\times \Delta ^\mathcal {I})\times \mathcal {P}_{\textsf {fin}}\left( \Delta ^\mathcal {I}\times \Delta ^\mathcal {I}\right) \) for a role \(r\in \mathsf {N_{R}} \),

where \(\mathcal {P}_{\textsf {fin}}\left( \Delta ^\mathcal {I}\times \Delta ^\mathcal {I}\right) \) denotes the set of all finite binary relations over \(\Delta ^\mathcal {I}\). Expressions with free set variables are interpreted using variable assignments \(\mathcal {Z}:\mathsf {N_{V}} \rightarrow \mathcal {P}_{\textsf {fin}}\left( \Delta ^\mathcal {I}\times \Delta ^\mathcal {I}\right) \). For an interpretation \(\mathcal {I}\) and a variable assignment \(\mathcal {Z}\), we define the semantics of specifiers as follows:

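$$\begin{aligned} X^{\mathcal {I},\mathcal {Z}}&:= \{\mathcal {Z}(X)\},\\ [{a\mathrel {:}b}]^{\mathcal {I},\mathcal {Z}}&:= \{\{\langle {a^\mathcal {I},b^\mathcal {I}}\rangle \}\},\\ [{a\mathrel {:}X.b}]^{\mathcal {I},\mathcal {Z}}&:= \{\{\langle {a^\mathcal {I},\delta }\rangle \mid \langle {b^\mathcal {I},\delta }\rangle \in \mathcal {Z}(X)\}\},\\ [{a\mathrel {:}\varvec{+}}]^{\mathcal {I},\mathcal {Z}}&:= \{\{\langle {a^\mathcal {I},\delta _1}\rangle ,\ldots ,\langle {a^\mathcal {I},\delta _\ell }\rangle \}\mid \ell \ge 1 \text { and } \delta _i\in \Delta ^\mathcal {I}\},\\ [{a_1\mathrel {:}v_1,\ldots ,a_n\mathrel {:}v_n}]^{\mathcal {I},\mathcal {Z}}&:= \{\Psi _1\cup \ldots \cup \Psi _n\mid \Psi _i\in [{a_i\mathrel {:}v_i}]^{\mathcal {I},\mathcal {Z}}\},\\ \lfloor {a_1\mathrel {:}v_1,\ldots ,a_n\mathrel {:}v_n}\rfloor ^{\mathcal {I},\mathcal {Z}}&:= \{\Psi \in \mathcal {P}_{\textsf {fin}}\left( \Delta ^\mathcal {I}\times \Delta ^\mathcal {I}\right) \mid \Psi \supseteq \Phi \text { for some } \Phi \in [{a_1\mathrel {:}v_1,\ldots ,a_n\mathrel {:}v_n}]^{\mathcal {I},\mathcal {Z}}\}, \end{aligned}$$

where \(X\in \mathsf {N_{V}} \), \(a,a_i,b\in \mathsf {N_{I}} \), and \(v_i\) is \(\varvec{+}\), an element of \(\mathsf {N_{I}}\), or of the form X.a. We can now define the semantics of concept and role expressions: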

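$$\begin{aligned} (A\varvec{@}S)^{\mathcal {I},\mathcal {Z}}&:= \{\delta \in \Delta ^\mathcal {I}\mid \langle {\delta ,\Psi }\rangle \in A^\mathcal {I}\text { for some }\Psi \in S^{\mathcal {I},\mathcal {Z}}\}, \end{aligned}$$
(6)
$$\begin{aligned} (r\varvec{@}S)^{\mathcal {I},\mathcal {Z}}&:= \{\langle {\delta ,\epsilon }\rangle \in \Delta ^\mathcal {I}\times \Delta ^\mathcal {I}\mid \langle {\delta ,\epsilon ,\Psi }\rangle \in r^\mathcal {I}\text { for some }\Psi \in S^{\mathcal {I},\mathcal {Z}}\}. \end{aligned}$$
(7)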

Observe that we quantify existentially over admissible annotations here (“some \(\Psi \in S^{\mathcal {I},\mathcal {Z}}\)”). However, variables and closed specifiers without \(\varvec{+}\) are interpreted as singleton sets, so true existential quantification only occurs if S is an open specifier or if it contains \(\varvec{+}\). All other DL constructs can now be defined as usual, e.g., \((C\sqcap D)^{\mathcal {I},\mathcal {Z}}=C^{\mathcal {I},\mathcal {Z}}\cap D^{\mathcal {I},\mathcal {Z}}\), \((\exists r.C)^{\mathcal {I},\mathcal {Z}}=\{\delta \mid \text {there is }\langle {\delta ,\epsilon }\rangle \in r^{\mathcal {I},\mathcal {Z}}\text { with }\epsilon \in C^{\mathcal {I},\mathcal {Z}}\}\), and \((\lnot C)^{\mathcal {I},\mathcal {Z}} = \Delta ^{\mathcal {I}} \setminus C^{\mathcal {I}, \mathcal {Z}}\). Note that we do not include annotations on \(\top \), i.e. \(\top ^{\mathcal {I},\mathcal {Z}}=\Delta ^\mathcal {I}\), and similarly for \(\bot ^{\mathcal {I}, \mathcal {Z}} = \emptyset \).
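
The membership test behind "some \(\Psi \in S^{\mathcal {I},\mathcal {Z}}\)" can be illustrated with a minimal Python sketch. It follows the intuition given in Sect. 2.1 (closed specifiers fix the annotation set, open specifiers give a lower bound, \(\varvec{+}\) asks for one or more values, and \(X.c\) copies the values of c from X). The function names `admits` and `values_of`, the encoding of \(X.c\) as a pair `("X", "c")`, and the simplification that every individual name denotes itself are assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, List, Tuple, Union

Pair = Tuple[str, str]
Annotation = FrozenSet[Pair]
# A specifier value is "+", an individual name, or (variable, attribute) for X.c.
Value = Union[str, Tuple[str, str]]
PLUS = "+"


def values_of(annotation: Annotation, attribute: str) -> set:
    """Return all values that the annotation set gives to the attribute."""
    return {v for (a, v) in annotation if a == attribute}


def admits(pairs: List[Tuple[str, Value]], is_open: bool,
           annotation: Annotation, assignment: Dict[str, Annotation]) -> bool:
    """Check whether the specifier (pairs, open or closed) admits the annotation set."""
    required = set()    # pairs fixed by individual names and X.c expressions
    plus_attrs = set()  # attributes that need at least one value
    for attr, val in pairs:
        if val == PLUS:
            plus_attrs.add(attr)
        elif isinstance(val, tuple):
            var, c = val
            required |= {(attr, v) for v in values_of(assignment[var], c)}
        else:
            required.add((attr, val))
    if not required <= annotation:
        return False
    if any(not values_of(annotation, a) for a in plus_attrs):
        return False
    if is_open:
        return True  # open specifiers only give a lower bound
    # Closed: beyond the required pairs, only "+" attributes may carry values.
    return all(a in plus_attrs for (a, _) in annotation - required)
```

For instance, with X bound to the annotation set {degree: master, end: 2016}, the open specifier with the single pair degree: master admits X, whereas the closed specifier with the same pair does not, since X contains an additional pair.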

Now \(\mathcal {I}\) satisfies an \(\mathcal {ALCH_{@\varvec{+}}}\) concept inclusion \(\alpha \) of the form (5), written \(\mathcal {I}\models \alpha \), if for all variable assignments \(\mathcal {Z}\) such that \(\mathcal {Z}(X_i)\in S_i^{\mathcal {I},\mathcal {Z}}\) for all \(i\in \{1,\ldots ,n\}\), we have \(C^{\mathcal {I},\mathcal {Z}}\subseteq D^{\mathcal {I},\mathcal {Z}}\). Satisfaction of role inclusions is defined analogously. Moreover, \(\mathcal {I}\) satisfies an \(\mathcal {ALCH_{@\varvec{+}}}\) concept assertion \(A(a)\varvec{@}S\) if \(\langle {a^\mathcal {I},\Psi }\rangle \in A^\mathcal {I}\) for some \(\Psi \in S^\mathcal {I}\) (the latter is well-defined since S contains no variables). Satisfaction of role assertions \(r(a,b)\varvec{@}S\) is defined analogously, using \(\langle {a^\mathcal {I},b^\mathcal {I},\Psi }\rangle \in r^\mathcal {I}\). \(\mathcal {I}\) satisfies an ontology if it satisfies all of its axioms. Based on this model theory, logical entailment is defined as usual.

Example 3

Consider the concept inclusion \(\alpha \) of Example 1 and the interpretation \(\mathcal {I}\) over domain \(\Delta ^{\mathcal {I}} = {\lbrace }{ \mathsf {Mary}, \mathsf {John}, \mathsf {TUD}, \mathsf {start}, \mathsf {end}, \mathsf {2016}, \mathsf {2017}, \mathsf {master}, \mathsf {degree}}{\rbrace }\), given by

$$\begin{aligned} \mathsf {MSc}^{\mathcal {I}} = \{&\langle {\mathsf {Mary}, {\lbrace }{\langle {\mathsf {start}, \mathsf {2016}}\rangle }{\rbrace }}\rangle , \langle {\mathsf {John}, {\lbrace }{\langle {\mathsf {start}, \mathsf {2017}}\rangle }{\rbrace }}\rangle \}, \\ \mathsf {educatedAt}^{\mathcal {I}} = \{&\langle {\mathsf {Mary}, \mathsf {TUD}, {\lbrace }{ \langle {\mathsf {degree}, \mathsf {master}}\rangle , \langle {\mathsf {end}, \mathsf {2016}}\rangle }{\rbrace } }\rangle ,\\&\langle {\mathsf {John}, \mathsf {TUD}, {\lbrace }{ \langle {\mathsf {degree}, \mathsf {master}}\rangle , \langle {\mathsf {end}, \mathsf {2017}}\rangle }{\rbrace } }\rangle \},\ \text {and}\\ \mathsf {University}^{\mathcal {I}} = \{&\langle {\mathsf {TUD}, {\lbrace }{}{\rbrace }}\rangle \}. \end{aligned}$$

Then \(\mathcal {I}\,\models \,\alpha \), i.e., \(\mathcal {I}\) satisfies \(\alpha \).
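
The claim of Example 3 can be checked mechanically. The following Python sketch encodes the three relations as sets of tuples whose last component is an annotation set, and tests axiom (2) directly. It assumes that individual names denote themselves, and it tries only those values for X that actually occur as annotations of \(\mathsf {educatedAt}\), since for any other X the left-hand side of (2) is empty. The function name `satisfies_axiom_2` is chosen here for illustration.

```python
def satisfies_axiom_2(msc, educated_at, university) -> bool:
    """Check axiom (2): educatedAt@X with degree: master entails MSc@[start: X.end]."""
    # University carries the omitted open specifier, so any annotation set will do.
    universities = {d for (d, _) in university}
    # Only annotation sets that occur on educatedAt can make the left-hand side
    # non-empty, so it suffices to try those as values for X.
    for x in {ann for (_, _, ann) in educated_at}:
        if ("degree", "master") not in x:
            continue  # X is not admitted by the open specifier in the prefix
        lhs = {s for (s, o, ann) in educated_at if ann == x and o in universities}
        # The closed specifier [start: X.end] denotes exactly this annotation set.
        target = frozenset(("start", v) for (a, v) in x if a == "end")
        rhs = {d for (d, ann) in msc if ann == target}
        if not lhs <= rhs:
            return False
    return True


# The interpretation of Example 3.
MSC = {("Mary", frozenset({("start", "2016")})),
       ("John", frozenset({("start", "2017")}))}
EDUCATED_AT = {
    ("Mary", "TUD", frozenset({("degree", "master"), ("end", "2016")})),
    ("John", "TUD", frozenset({("degree", "master"), ("end", "2017")})),
}
UNIVERSITY = {("TUD", frozenset())}
```

Calling `satisfies_axiom_2(MSC, EDUCATED_AT, UNIVERSITY)` returns `True`. Changing the start value of Mary's \(\mathsf {MSc}\) annotation to a year other than 2016 makes the check fail, because the closed specifier on the right-hand side requires an exact match.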

3 Expressivity of Attributed Description Logics

In this section, we clarify some basic semantic properties of attributed DLs and the general relation of attributed DLs to other logical formalisms. As a first observation, we note that already \(\mathcal {ALC_{@\varvec{+}}} \) is too expressive to be decidable:

Theorem 1

Satisfiability of attributed DLs with \(\varvec{+}\) is undecidable, even if the DL only supports \(\sqcap \), and supports either only open specifiers or only closed specifiers.

Proof

We reduce from the query answering problem for existential rules, i.e., first-order formulae of the form

$$\begin{aligned} \forall \varvec{x}.p_1(x^1_1,\ldots , x^1_{\textsf {ar}(p_1)})\wedge \ldots \wedge p_n(x^n_1,\ldots , x^n_{\textsf {ar}(p_n)})\rightarrow \exists \varvec{y}.p(z_1,\ldots ,z_{\textsf {ar}(p)}), \end{aligned}$$
(8)

where the variables \(x^i_j\) occur among the universally quantified variables, i.e., \(x^i_j\in \varvec{x}\), and variables \(z_i\) might be universally or existentially quantified, i.e., \(z_i\in \varvec{x}\cup \varvec{y}\). We require that each universally quantified variable occurs in some atom in the premise of the rule (safety), and that each existentially quantified variable occurs only once per rule. The latter is without loss of generality since rules that violate this restriction can be split into two rules using an auxiliary predicate. A fact is a formula of the form \(q(c_1,\ldots ,c_{\textsf {ar}(q)})\) with constants \(c_i\). Entailment of facts from given sets of facts and existential rules is known to be undecidable [2, 7].

To translate an existential rule of the form (8), we consider DL concept names \(P_{(i)}\) for each predicate symbol \(p_{(i)}\), and individual names \(a_1, \ldots , a_\ell \), where \(\ell \) is the maximal arity of any such predicate. For each universally quantified variable x, let \(\pi _x=\langle {p_i,k}\rangle \) be an (arbitrary but fixed) position at which x occurs, i.e., for which \(x=x^i_k\). The rule can now be rewritten to the attributed DL axiom

$$ {X_1}\!\varvec{:}\!{S_1}, \ldots , {X_n}\!\varvec{:}\!{S_n} \quad \left( P_1\varvec{@}X_1\sqcap \ldots \sqcap P_n\varvec{@}X_n \sqsubseteq P\varvec{@}T\right) , $$

where the specifiers are defined as \(S_i=[a_j \mathrel {:}X_m.a_k \mid 1 \le j \le \textsf {ar}(p_{i}) \text { and } \pi _{x^i_j}=\langle {p_m,k}\rangle ]\) and \(T=[{ a_j\mathrel {:}\varvec{+}\mid z_j\in \varvec{y}}]\cup [{ a_j\mathrel {:}X_m.a_k\mid z_j\in \varvec{x}\text { and }\pi _{z_j}=\langle {p_m,k}\rangle }]\) (note that we slightly abuse \(\mid \) and \(\cup \) here for a simpler presentation). For example, the rule \(\forall xy . p_1(x,y)\wedge p_2(y,x)\rightarrow \exists z . p(x,z)\) is translated into the concept inclusion \({X_1}\!\varvec{:}\!{S_1},{X_2}\!\varvec{:}\!{S_2} \ \left( P_1\varvec{@}X_1\sqcap P_2\varvec{@}X_2 \sqsubseteq P\varvec{@}[{a_1 \mathrel {:}X_1.a_1, a_2 \mathrel {:}\varvec{+}}]\right) ,\) where \(S_1=[{a_1 \mathrel {:}X_1.a_1, a_2 \mathrel {:}X_2.a_1}]\) and \(S_2 = [{a_1 \mathrel {:}X_2.a_1, a_2 \mathrel {:}X_1.a_1}]\). Observe that the specifier \(S_i\) for \(X_i\) may contain assignments of the form \(a_j\mathrel {:}X_i.a_j\): by our semantics, this merely states that \(a_j\) may have zero or more values. Facts of the form \(q(c_1,\ldots ,c_m)\) can be translated into assertions \(Q(b)\varvec{@}[{a_1\mathrel {:}c_1,\ldots ,a_m\mathrel {:}c_m}]\) for an individual name b that is used in all such assertions.

Entailment of facts is preserved in this translation. Correctness is retained if we replace all closed by open specifiers, since the translated ontology admits a least model where all annotation sets are interpreted as the smallest possible sets.    \(\square \)
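
The translation used in this proof is mechanical, as the following Python sketch suggests. It computes the prefix specifiers \(S_i\), the conjuncts \(P_i\varvec{@}X_i\), and the head specifier T for one rule. The function name `translate_rule`, the list-based rule encoding, and the string rendering of \(X_m.a_k\) are assumptions made for illustration; the positions \(\pi _x\) can be supplied explicitly, and default to the first occurrence of each variable, which is one admissible "arbitrary but fixed" choice.

```python
def translate_rule(body, head, existential, pi=None):
    """Translate an existential rule following the proof of Theorem 1.

    body: list of (predicate, variables); head: (predicate, variables);
    existential: set of existentially quantified variables;
    pi: optional map from universal variables to (atom index, position).
    """
    if pi is None:
        pi = {}
        for m, (_, args) in enumerate(body, start=1):
            for k, var in enumerate(args, start=1):
                pi.setdefault(var, (m, k))  # first occurrence wins

    def ref(var):
        m, k = pi[var]
        return f"X{m}.a{k}"

    # One closed specifier S_i per body atom, constraining the variable X_i.
    prefix = {
        f"X{i}": [(f"a{j}", ref(v)) for j, v in enumerate(args, start=1)]
        for i, (_, args) in enumerate(body, start=1)
    }
    # The conjuncts P_i@X_i of the left-hand side.
    concepts = [f"{p.upper()}@X{i}" for i, (p, _) in enumerate(body, start=1)]
    # The head specifier T: "+" for existential variables, X_m.a_k otherwise.
    head_pred, head_args = head
    target = [(f"a{j}", "+" if z in existential else ref(z))
              for j, z in enumerate(head_args, start=1)]
    return prefix, concepts, (head_pred.upper(), target)
```

With the positions \(\pi _x=\langle {p_1,1}\rangle \) and \(\pi _y=\langle {p_2,1}\rangle \), the rule \(\forall xy . p_1(x,y)\wedge p_2(y,x)\rightarrow \exists z . p(x,z)\) yields the specifiers \(S_1\), \(S_2\) and the head specifier given in the proof.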

In Sects. 4 and 5, we present two approaches for overcoming the undecidability of Theorem 1, namely to exclude \(\varvec{+}\) from attributed DLs, and to restrict the use of expressions of the form \(X.a\).

Example 4

It follows from Theorem 1 that \(\mathcal {ALC_{@\varvec{+}}}\) ontologies may require models with annotation sets of unbounded size. To see this, consider the following ontology:

$$\begin{aligned}&A(b)\varvec{@}\lfloor {c\mathrel {:}c}\rfloor \end{aligned}$$
(9)
$$\begin{aligned} A\varvec{@}X \sqsubseteq {}&\exists r.A\varvec{@}\lfloor {c\mathrel {:}\varvec{+}, p\mathrel {:}X.c, p\mathrel {:}X.p}\rfloor \end{aligned}$$
(10)
$$\begin{aligned} A\varvec{@}X \sqcap A\varvec{@}\lfloor {p\mathrel {:}X.c}\rfloor \sqsubseteq {}&\bot \end{aligned}$$
(11)

Axiom (9) defines an initial A member. Axiom (10) states that all A members have an r successor that is in A, annotated with some value for c (“current”), and values for p (“previous”) that include all of its predecessor’s c and p values. Axiom (11) requires that no individual in A may have a set of p values that include all of its c values. It is not hard to see that all models of this ontology include an infinite r-chain with arbitrarily large (but finite) A-related annotation sets.

It is interesting to discuss Theorem 1 in the context of our previous work on multi-attributed predicate logic (MAPL), which generalises first-order logic with annotation sets for arbitrary predicates. Indeed, our interpretations for attributed DLs are a special case of multi-attributed relational structures (MARS), though we do not make the unique name assumption here, since it is not common for the DLs we consider. Otherwise, attributed DLs are fragments of MAPL. Our notation \(X.a\) is new, but it can be simulated in MAPL, e.g., by using function definitions [16].

MAPL is not semi-decidable, and we have proposed MAPL rules (MARPL) as a decidable fragment. MARPL supports \(\varvec{+}\) without restrictions, and it includes arbitrary predicate arities and more expressive specifiers (with some form of negation). In contrast, attributed DLs add the ability to quantify existentially over annotations, and therefore to derive partially specified annotation sets, which is the main reason for Theorem 1. In general, attributed DLs are based on the open world assumption, whereas MARPL could equivalently be interpreted under a closed world, least model semantics. Nevertheless, even without \(\varvec{+}\) the translation from the proof of Theorem 1 allows attributed DLs to capture rule languages, as the following result shows. Here, by Datalog we mean first-order Horn logic without existential quantifiers.

Theorem 2

Attributed DLs can capture Datalog in the sense that every set \(\mathbb {P}\) of Datalog rules and fact \(q(c_1,\ldots ,c_m)\) can be translated in linear time into an attributed DL ontology \(\textit{KB}_\mathbb {P}\) and assertion \(Q(b)\varvec{@}S\), such that \(\mathbb {P}\models q(c_1,\ldots ,c_m)\) iff \(\textit{KB}_\mathbb {P}\models Q(b)\varvec{@}S\). This translation requires just \(\sqcap \), no \(\varvec{+}\), and either only open or only closed specifiers.

The ability to capture Datalog reminds us of nominal schemas, the extension of DLs with “variable nominals” [13, 15]. Indeed, this extension can also be captured in attributed DLs (we omit the details here). The converse is not true, e.g., since nominal schemas cannot encode annotation sets on role assertions. Role inclusion axioms such as \(\mathsf {spouse}\varvec{@}X \sqsubseteq \mathsf {spouse}^-\varvec{@}X\) are therefore impossible. Another related formalism is DL-Lite\(_A\), which supports (data) annotations on domain elements and pairs of domain elements [5]. This extension of DLs supports some forms of ternary relations. Nevertheless, the use case and complexity properties of DL-Lite\(_A\) are different from the logics we study here, and it remains for future work to explore attributed DL-Lite in more detail.

4 Reasoning in \(\mathcal {ALCH_{@}}\)

We first focus on \(\mathcal {ALCH_{@}}\), for which we show reasoning to be decidable, albeit at a higher complexity. For a first positive result, we consider ground \(\mathcal {ALCH_{@}}\), where ontologies do not contain any set variables. We show that we can translate any ground \(\mathcal {ALCH_{@}}\) ontology into an equisatisfiable \(\mathcal {ALCH}\) ontology by introducing fresh names for annotated concept and role names. This renaming is one of the key ingredients in obtaining decision procedures for attributed DLs.

Theorem 3

Satisfiability of ground \(\mathcal {ALCH_{@}}\) ontologies is ExpTime-complete.

Proof

Hardness is immediate since \(\mathcal {ALCH_{@}}\) generalises \(\mathcal {ALCH}\). For membership, we reduce \(\mathcal {ALCH_{@}}\) satisfiability to \(\mathcal {ALCH}\) satisfiability. Given an \(\mathcal {ALCH_{@}}\) ontology \(\textit{KB}\), let \(\textit{KB}^\dagger \) denote the \(\mathcal {ALCH}\) ontology that is obtained by replacing each annotated concept name \(A\varvec{@}S\) with a fresh concept name \(A_{S}\), and each annotated role name \(r\varvec{@}S\) with a fresh role name \(r_{S}\), respectively. We then extend \(\textit{KB}^\dagger \) by all axioms

$$\begin{aligned} A_{S}&\sqsubseteq A_{T},&\text {where } A_{S} \text { and } A_{T} \text { occur in translated axioms of } \textit{KB}^\dagger , \text { and } \end{aligned}$$
(12)
$$\begin{aligned} r_{S}&\sqsubseteq r_{T},&\text {where } r_{S} \text { and } r_{T} \text { occur in translated axioms of } \textit{KB}^\dagger \end{aligned}$$
(13)

such that T is an open specifier, and the set of attribute–value pairs \(a\mathrel {:}b\) in S is a superset of the set of attribute–value pairs in T. We show that \(\textit{KB}\) is satisfiable iff \(\textit{KB}^\dagger \) is satisfiable. The claim then follows from the well-known ExpTime-completeness of satisfiability checking in \(\mathcal {ALCH}\). Given an \(\mathcal {ALCH_{@}}\) model \(\mathcal {I}\) of \(\textit{KB}\), we directly obtain an \(\mathcal {ALCH}\) interpretation \(\mathcal {J}\) over \(\Delta ^{\mathcal {I}}\) by undoing the renaming and applying \(\mathcal {I}\), i.e., by mapping \(A_{S} \in \mathsf {N_{C}} \) to \((A\varvec{@}S)^{\mathcal {I}}\), \(r_{S} \in \mathsf {N_{R}} \) to \((r\varvec{@}S)^{\mathcal {I}}\), and \(a \in \mathsf {N_{I}} \) to \(a^{\mathcal {I}}\). Clearly, \(\mathcal {J}\models \textit{KB}^\dagger \). Conversely, given an \(\mathcal {ALCH}\) model \(\mathcal {J}\) of \(\textit{KB}^{\dagger }\), we construct an \(\mathcal {ALCH_{@}}\)-interpretation \(\mathcal {I}\) over domain \(\Delta ^{\mathcal {I}} = \Delta ^{\mathcal {J}} \cup {\lbrace }{\star }{\rbrace }\), where \(\star \) is a fresh individual name, and define \(a^{\mathcal {I}} := a^{\mathcal {J}}\) for all \(a \in \mathsf {N_{I}} \). For a ground closed specifier \(S = [{a_{1}\mathrel {:}b_{1}, \ldots , a_{n}\mathrel {:}b_{n}}]\), we set \(\Psi _{S} := \{\langle {a_{1}^{\mathcal {I}},b_{1}^{\mathcal {I}}}\rangle ,\ldots ,\langle {a_{n}^{\mathcal {I}},b_{n}^{\mathcal {I}}}\rangle \}\). Similarly, for a ground open specifier \(S = \lfloor {a_{1}\mathrel {:}b_{1}, \ldots , a_{n}\mathrel {:}b_{n}}\rfloor \), we define \(\Psi _{S} := \{\langle {a_{1}^{\mathcal {I}},b_{1}^{\mathcal {I}}}\rangle ,\ldots ,\langle {a_{n}^{\mathcal {I}},b_{n}^{\mathcal {I}}}\rangle ,\langle {\star ,\star }\rangle \}\). Furthermore, let \(A^{\mathcal {I}} := \{\langle {\delta ,\Psi _{S}}\rangle \mid \delta \in A_{S}^{\mathcal {J}}\}\) and \(r^{\mathcal {I}} := \{\langle {\delta ,\epsilon ,\Psi _{S}}\rangle \mid \langle {\delta ,\epsilon }\rangle \in r_{S}^{\mathcal {J}}\}\). Then \(\mathcal {I}\models \textit{KB}\), where \(\star \) ensures that axioms such as \(\top \sqsubseteq A\varvec{@}\lfloor {a\mathrel {:}b}\rfloor \sqcap \lnot A\varvec{@}[{a\mathrel {:}b}]\) remain satisfiable.    \(\square \)
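
The additional axioms (12) and (13) can be generated by a simple pairwise comparison of the ground specifiers that occur with the same name. The Python sketch below illustrates this under stated assumptions: a ground specifier is encoded as a pair of an open/closed flag and a set of attribute–value pairs, and the helper names `fresh_name` and `subsumption_axioms` as well as the textual format of the fresh names are hypothetical.

```python
from itertools import permutations


def fresh_name(name, spec):
    """Build a fresh name for an annotated concept or role name (format is illustrative)."""
    is_open, pairs = spec
    kind = "open" if is_open else "closed"
    body = ",".join(f"{a}:{v}" for a, v in sorted(pairs))
    return f"{name}_{kind}[{body}]"


def subsumption_axioms(occurrences):
    """Return the inclusions between fresh names required by (12) and (13).

    occurrences: set of (name, (is_open, frozenset of pairs)) found in the ontology.
    The result is a set of (sub, super) pairs of fresh names.
    """
    axioms = set()
    for (n1, s), (n2, t) in permutations(occurrences, 2):
        t_open, t_pairs = t
        # T must be open, and the pairs of S must include all pairs of T.
        if n1 == n2 and t_open and s[1] >= t_pairs:
            axioms.add((fresh_name(n1, s), fresh_name(n2, t)))
    return axioms
```

The number of generated axioms is at most quadratic in the number of annotated names, so the translated ontology stays polynomial in the size of the input.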

The other important technique for dealing with attributed DLs is grounding, where we eliminate set variables from an ontology, thus transforming it into a ground ontology. As illustrated by the next result, this grounding may lead to an ontology of exponentially larger size, resulting in an increased complexity of reasoning.

Theorem 4

Satisfiability of \(\mathcal {ALCH_{@}}\) ontologies is in 2ExpTime.

Proof

Let \(\textit{KB}\) be an \(\mathcal {ALCH_{@}}\) ontology, and let \(\mathsf {N_I^{\textit{KB}}} \) be the set of individual names occurring in \(\textit{KB}\), extended by one fresh individual name \(x\). The grounding \(\textsf {ground}(\textit{KB})\) of \(\textit{KB}\) consists of all assertions in \(\textit{KB}\), together with grounded versions of inclusion axioms. Let \(\mathcal {I}\) be an interpretation over domain \(\Delta ^{\mathcal {I}} = \mathsf {N_I^{\textit{KB}}} \) satisfying \(a^{\mathcal {I}} = a\) for all \(a \in \mathsf {N_I^{\textit{KB}}} \), and \(\mathcal {Z}: \mathsf {N_{V}} \rightarrow \mathcal {P}_{\textsf {fin}}\left( \Delta ^{\mathcal {I}} \times \Delta ^{\mathcal {I}}\right) \) be a variable assignment. Consider a concept inclusion \(\alpha \) of the form \({X_1}\!\varvec{:}\!{S_1}, \ldots , {X_n}\!\varvec{:}\!{S_n}\ (C\sqsubseteq D)\). We say that \(\mathcal {Z}\) is compatible with \(\alpha \) if \(\mathcal {Z}(X_i)\in S_i^{\mathcal {I},\mathcal {Z}}\) for all \(1 \le i \le n\). In this case, the \(\mathcal {Z}\)-instance \(\alpha _{\mathcal {Z}}\) of \(\alpha \) is the concept inclusion \(C' \sqsubseteq D'\) obtained by

  • replacing each variable \(X_{i}\) with \([{a \mathrel {:}b \mid \langle {a, b}\rangle \in \mathcal {Z}(X_{i})}]\), and

  • replacing every assignment \(a \mathrel {:}X_{i}.b\) occurring in some specifier by all assignments \(a \mathrel {:}c\) such that \(\langle {b, c}\rangle \in \mathcal {Z}(X_{i})\).

Then \(\textsf {ground}(\textit{KB})\) contains all \(\mathcal {Z}\)-instances \(\alpha _{\mathcal {Z}}\) for all concept inclusions \(\alpha \) in \(\textit{KB}\) and all compatible variable assignments \(\mathcal {Z}\), and analogous axioms for role inclusions. In general, there may be exponentially many different instances for each terminological axiom in \(\textit{KB}\); thus \(\textsf {ground}(\textit{KB})\) is of exponential size. We conclude the proof by showing that \(\textit{KB}\) is satisfiable iff \(\textsf {ground}(\textit{KB})\) is satisfiable; the result then follows from Theorem 3. By construction, we have \(\textit{KB}\models \textsf {ground}(\textit{KB})\), i.e., any model of \(\textit{KB}\) is also a model of \(\textsf {ground}(\textit{KB})\). Conversely, let \(\mathcal {I}\) be a model of \(\textsf {ground}(\textit{KB})\). Without loss of generality, assume that \(x^{\mathcal {I}} \ne a^{\mathcal {I}}\) for all \(a \in \mathsf {N_I^{\textit{KB}}} \setminus \{x\}\) (it suffices to add a fresh individual since x does not occur in \(\textit{KB}\)). For an annotation set \(\Psi \in \mathcal {P}_{\textsf {fin}}\big (\Delta ^{\mathcal {I}} \times \Delta ^{\mathcal {I}}\big )\), we define \(\mathop {{\text {rep}}}_{x}\!{\left( {\Psi }\right) }\) to be the annotation obtained from \(\Psi \) by replacing any individual \(\delta \not \in \mathcal {I}(\mathsf {N_I^{\textit{KB}}})\) in \(\Psi \) by \(x^{\mathcal {I}}\). We let \(\sim \) be the equivalence relation induced by \(\mathop {{\text {rep}}}_{x}\!{\left( {\Psi }\right) } = \mathop {{\text {rep}}}_{x}\!{\left( {\Phi }\right) }\) and define an interpretation \(\mathcal {J}\) over domain \(\Delta ^{\mathcal {J}} := \Delta ^{\mathcal {I}}\), where \(A^{\mathcal {J}} := \{\langle {\delta ,\Phi }\rangle \mid \langle {\delta ,\Psi }\rangle \in A^{\mathcal {I}} \text { and } \Psi \sim \Phi \}\) for \(A \in \mathsf {N_{C}} \), \(r^{\mathcal {J}} := \{\langle {\delta ,\epsilon ,\Phi }\rangle \mid \langle {\delta ,\epsilon ,\Psi }\rangle \in r^{\mathcal {I}} \text { and } \Psi \sim \Phi \}\) for \(r \in \mathsf {N_{R}} \), and \(a^{\mathcal {J}} := a^{\mathcal {I}}\) for all individual names \(a \in \mathsf {N_{I}} \). It remains to show that \(\mathcal {J}\) is indeed a model of \(\textit{KB}\). 
Suppose for a contradiction that there is a concept inclusion \(\alpha \) that is not satisfied by \(\mathcal {J}\) (the case for role inclusions is analogous). Then we have some compatible variable assignment \(\mathcal {Z}\) that leaves \(\alpha \) unsatisfied. Let \({\mathcal {Z}}_{x}\) be the variable assignment \(X \mapsto \mathop {{\text {rep}}}_{x}\!{\left( {\mathcal {Z}(X)}\right) }\) for all \(X\in \mathsf {N_{V}} \). Clearly, \(\mathcal {Z}_{x}\) is also compatible with \(\alpha \). But now we have \(C^{\mathcal {J}, \mathcal {Z}} = C^{\mathcal {I}, \mathcal {Z}_{x}}\) for all \(\mathcal {ALCH_{@}}\) concepts C, yielding the contradiction \(\mathcal {I}\not \models \alpha _{\mathcal {Z}_{x}}\).    \(\square \)
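
The size of the grounding is easiest to appreciate on a toy input. The following Python sketch enumerates all annotation sets over a set of individual names extended by the fresh name x, keeps those compatible with an open specifier, and builds the corresponding instances. It covers only a hypothetical axiom shape with one set variable, namely X constrained by an open ground specifier and a closed head specifier that may use \(X.c\) expressions; the function names and the encoding of \(X.c\) as `("X", "c")` are assumptions made for illustration, and the name "x" is assumed not to occur among the given individuals.

```python
from itertools import chain, combinations, product


def annotation_sets(individuals):
    """All binary relations over the individuals: 2**(n*n) sets for n individuals."""
    pairs = sorted(product(individuals, repeat=2))
    subsets = chain.from_iterable(
        combinations(pairs, r) for r in range(len(pairs) + 1))
    return [frozenset(s) for s in subsets]


def ground_instances(individuals, lower_bound, head_template):
    """Instances of an axiom with prefix X: open(lower_bound) and a closed head specifier.

    head_template: list of (attribute, value), where value is an individual
    name or ("X", c), standing for the expression X.c.
    Returns pairs (Z(X), grounded head annotation set).
    """
    names = sorted(set(individuals) | {"x"})  # "x" plays the fresh individual
    instances = []
    for z in annotation_sets(names):
        if not set(lower_bound) <= z:
            continue  # this assignment is not compatible with the prefix
        head = set()
        for attr, val in head_template:
            if isinstance(val, tuple):
                # Replace attr: X.c by attr: d for every pair (c, d) in Z(X).
                head |= {(attr, d) for (c, d) in z if c == val[1]}
            else:
                head.add((attr, val))
        instances.append((z, frozenset(head)))
    return instances
```

Even with only two individual names in the ontology, the three names including x give rise to 2^9 = 512 annotation sets, of which 256 contain one fixed required pair; the number of instances is exponential in the square of the number of individual names.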

We regain decidability for \(\mathcal {ALC_{@\varvec{+}}}\) by disallowing expressions of the form \(X.a\).

Theorem 5

Satisfiability of \(\mathcal {ALCH_{@\varvec{+}}}\) ontologies without expressions of the form \(X.a\) is in 2ExpTime.

Proof

We reduce satisfiability in \(\mathcal {ALCH_{@\varvec{+}}}\) (without expressions of the form \(X.a\)) to satisfiability in \(\mathcal {ALCH}\), similar to the proof of Theorem 4. Consider an \(\mathcal {ALCH_{@\varvec{+}}}\) ontology \(\textit{KB}\) that contains the individual names \(\mathsf {N_I^{\textit{KB}}} \), along with two fresh individual names \(x\) and \(x_{\varvec{+}}\). The grounding proceeds as in the proof of Theorem 4, except that for \(\mathcal {Z}\)-instances \(\alpha _{\mathcal {Z}}\) of concept inclusions \(\alpha \), we additionally replace each assignment \(a\mathrel {:}\varvec{+}\) occurring in some specifier by the assignment \(a\mathrel {:}x_{\varvec{+}}\). The exponentially large grounding again yields containment in 2ExpTime. From a model \(\mathcal {J}\) of \(\textit{KB}\), we obtain a model \(\mathcal {I}\) of \(\textsf {ground}(\textit{KB})\) by setting , for \(a \in \mathsf {N_{I}} \setminus {\lbrace }{x,x_{\varvec{+}}}{\rbrace }\), , , for \(A \in \mathsf {N_{C}} \), and for \(r \in \mathsf {N_{R}} \). Clearly, if \(\mathcal {J}\) satisfies a concept inclusion in \(\textit{KB}\), then \(\mathcal {I}\) satisfies a corresponding concept inclusion in \(\textsf {ground}(\textit{KB})\). Similarly, any concept inclusion satisfied by \(\mathcal {I}\) must correspond to a concept inclusion satisfied by \(\mathcal {J}\) since \(x_{\varvec{+}}\) does not occur in \(\textit{KB}\). The converse direction follows immediately from the proof of Theorem 4. \(\square \)

Both of these upper bounds are tight, as the next theorem shows:

Theorem 6

Checking satisfiability of \(\mathcal {ALC_@}\) ontologies without expressions of the form \(X.a\) is 2ExpTime-hard.

Proof (sketch)

We reduce the word problem for exponentially space-bounded alternating Turing machines (ATMs) [6] to the entailment problem for \(\mathcal {ALC_@}\) ontologies. We construct the tree of all configurations reachable from the initial configuration, encoding the transitions in the edges of the tree, i.e., each configuration is represented by an individual. The tape cells are represented as concepts carrying an annotation encoding the cell content and position (as a binary number). We mark the current head position with an additional concept, allowing us to copy each non-head position of the tape to successors in the configuration tree, while changing the tape cell at the head position and moving the head depending on the transition from the preceding configuration. As acceptance of a given configuration depends solely on the state and the successor configurations, we can propagate acceptance backwards from the leaves of the configuration tree to the initial configuration.    \(\square \)

5 Tractable Reasoning in Attributed \(\mathcal {EL}\)

In this section, we investigate \(\mathcal {ALC_@}\) fragments based on the \(\mathcal {EL}\) family of description logics. This family includes \(\mathcal {EL}^{\mathord {+}\mathord {+}}\), which forms the logical foundation of the OWL 2 EL profile and is widely used in applications such as SNOMED CT [21], a clinical terminology with global scope. SNOMED CT also features a compositional syntax [20], which has recently been augmented with attribute sets allowing arbitrary concrete values. While concept expressions in either of the syntaxes can be translated into the other, \(\mathcal {EL}^{\mathord {+}\mathord {+}}\) provides no such attributes (i.e., concepts with attribute sets have to be represented by introducing new concept names). We can not only capture these attributes using our attribute–value sets, but also include them in the reasoning process. As a (simplified) example, the concept of a 500 mg Paracetamol tablet could be annotated with

$$\begin{aligned} \lfloor { \mathsf {strengthMagnitude}\mathrel {:}500, \mathsf {tradeName}\mathrel {:}\mathsf {PANADOL} }\rfloor . \end{aligned}$$

The basic logic is \(\mathcal {EL_{@}}\), the fragment of \(\mathcal {ALC_@}\) which uses only \(\exists \), \(\sqcap \), \(\top \) and \(\bot \) in concept expressions. Unfortunately, Theorem 2 shows that \(\mathcal {EL_{@}}\) is ExpTime-complete, even with severe syntactic restrictions. To overcome this source of complexity, we impose a bound on the number of set variables per concept inclusion and exclude \(X.a\):

Theorem 7

Let \(\ell \in \mathbb {N} \). Checking satisfiability of \(\mathcal {EL_{@}}\) ontologies with at most \(\ell \) variables per axiom, and without expressions of the form \(X.a\), is PTime-complete.

Proof

Hardness follows from the PTime-hardness of \(\mathcal {EL}\) [1]. For membership, we polynomially reduce \(\mathcal {EL_{@}}\) satisfiability to \(\mathcal {ELH}\) satisfiability. Indeed, the grounding used in Theorem 4 can be restricted to annotation sets that are described in (ground) specifiers that are found in the ontology, since no new sets can be derived without \(X.a\). The bounded number of variables then ensures that the grounding remains polynomial. Since neither grounding nor renaming introduces negation, the resulting ontology belongs to the \(\mathcal {ELH}\) fragment of \(\mathcal {ALCH}\).    \(\square \)
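The counting argument behind the polynomial bound can be made concrete with a small sketch. It assumes that a list of candidate annotation sets has already been extracted from the ground specifiers of the ontology (this extraction is not shown), and it ignores the compatibility check for assignments, so it only gives an upper bound on the number of instances. The names `ground_assignments` and `candidates` are hypothetical.

```python
from itertools import product


def ground_assignments(variables, annotation_sets):
    """Yield each mapping from set variables to one of the candidate annotation sets."""
    for choice in product(annotation_sets, repeat=len(variables)):
        yield dict(zip(variables, choice))


# Candidate annotation sets, assumed to come from ground specifiers in the ontology.
candidates = [
    frozenset(),
    frozenset({("strengthMagnitude", "500")}),
    frozenset({("tradeName", "PANADOL")}),
]

# An axiom with two set variables has at most len(candidates) ** 2 instances.
instances = list(ground_assignments(["X", "Y"], candidates))
```

With m candidate sets and at most \(\ell \) variables per axiom, each axiom has at most \(m^\ell \) instances, which is polynomial in m for fixed \(\ell \).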

Observe that we can allow some uses of \(X.a\), given that we obey certain restrictions:

Theorem 8

Let \(\ell , k\in \mathbb {N} \). Checking satisfiability of \(\mathcal {EL_{@}}\) ontologies is PTime-complete if all of the following conditions are satisfied:

  (A) axioms contain at most \(\ell \) variables,

  (B) any closed or open specifier contains at most k expressions of the form \(X.a\), and

  (C) if any specifier contains an assignment \(a \mathrel {:}X.b\), then it does not contain any other assignment for attribute a.

Proof

As in the proof of Theorem 7, we can obtain a polynomial grounding, but we may need to consider annotation sets that are not explicitly specified in the original ontology. However, due to condition (C), for any attribute we only need to consider as its set of values one of the polynomially many sets of values given explicitly through ground assignments in specifiers. Considering any combination of these value sets for any of the at most k attributes that use \(X.a\) in assignments results in polynomially many annotation sets.    \(\square \)

We now show that violating any of these conditions makes satisfiability intractable.

Theorem 9

Let \(\textit{KB}\) be an \(\mathcal {EL_{@}}\) ontology and consider conditions (A)–(C) of Theorem 8 with \(\ell = 1\) and \(k = 2\). Then deciding satisfiability of \(\textit{KB}\) is

  (1) ExpTime-hard if \(\textit{KB}\) satisfies only conditions (B) and (C),

  (2) ExpTime-hard if \(\textit{KB}\) satisfies only conditions (A) and (C), and

  (3) PSpace-hard if \(\textit{KB}\) satisfies only conditions (A) and (B).

It is an open question whether the PSpace bound in the third case is tight. Nevertheless, it implies intractability for this case. Finally, we show that \(\mathcal {EL_{@\varvec{+}}}\) (without \(X.a\)) is also intractable (recall that \(\mathcal {EL_{@\varvec{+}}}\) with \(X.a\) is already undecidable by Theorem 1).

Theorem 10

Checking satisfiability of \(\mathcal {EL_{@\varvec{+}}}\) ontologies without expressions of the form \(X.a\) is ExpTime-complete.

Proof

ExpTime-hardness follows from Theorem 9. From the proof of Theorem 5, we obtain an exponentially large grounding, which, together with the PTime complexity of \(\mathcal {ELH}\), yields the ExpTime upper bound. \(\square \)

6 Attributed OWL

In this section, we consider attributed DLs with further expressive features, so that in particular we can cover all of the expressivity of the OWL 2 DL ontology language [17]. The underlying DL is \(\mathcal {SROIQ_{@}}\), which we introduce next by slightly extending our earlier definition of \(\mathcal {ALCH_{@}}\). The set \(\mathbf {R}\) of \(\mathcal {SROIQ_{@}}\) role expressions contains all expressions \(r\varvec{@}S\) and \(r^-\varvec{@}S\) with \(r\in \mathsf {N_{R}} \) and \(S\in \mathbf {S}\). The set \(\mathbf {C}\) of \(\mathcal {SROIQ_{@}}\) concept expressions is defined as follows

(14)

The new features are nominals \(\{c\}\), which denote concepts containing one individual, and number restrictions \(\mathord {\leqslant }n\,R.C\) and \(\mathord {\geqslant }n\,R.C\), which express concepts of elements with at most/at least \(n\ge 0\) R-successors in C. Note that we do not include annotations on nominals. This is no real restriction, since one can use axioms such as \(\{c\}\equiv A_c\varvec{@}\lfloor {}\rfloor \) to introduce a concept name \(A_c\) that may hold such annotations. This allows us to use the same notion of interpretation as for \(\mathcal {ALCH_{@}}\). Assertions, concept and role inclusions are defined as before, based on these extended sets of expressions. In addition, \(\mathcal {SROIQ_{@}}\) supports complex role inclusion axioms of the form

$$\begin{aligned} {X_1}\!\varvec{:}\!{S_1}, \ldots , {X_n}\!\varvec{:}\!{S_n}\quad (R_1\circ \ldots \circ R_\ell \sqsubseteq T), \end{aligned}$$
(15)

where \(R_i,T\in \mathbf {R}\) are \(\mathcal {SROIQ_{@}}\) role expressions, \(S_1,\ldots , S_n\in \mathbf {S}\) are specifiers, and \(X_1,\ldots , X_n\in \mathsf {N_{V}} \) are set variables occurring among \(R_i, T,S_1,\ldots , S_n\). A \(\mathcal {SROIQ_{@}}\) ontology is a set of \(\mathcal {SROIQ_{@}}\) assertions, and role and concept inclusions.

The semantics of these constructs and axioms is defined as usual [10], where the interpretation of roles and concepts takes annotations into account as in Sect. 2. For instance, we may express that any drug, such as a Paracetamol tablet, that contains at most one active ingredient and a certain amount of some such ingredient, such as 500 mg of Acetaminophen, has the same dose:

$$\begin{aligned} {X}\!\varvec{:}\!{\lfloor {}\rfloor } \ \ \mathsf {Drug}\sqcap \mathord {\leqslant }1\,\mathsf {hasActiveIngredient}.\top \sqcap {}\exists \mathsf {hasActiveIngredient}\varvec{@}X.\top \sqsubseteq {}\\ \mathsf {Drug}\varvec{@}\lfloor {\mathsf {strengthMagnitude}\mathrel {:}X.\mathsf {strengthMagnitude}}\rfloor \end{aligned}$$

To ensure decidability of reasoning, \(\mathcal {SROIQ}\) imposes two additional restrictions on ontologies: simplicity and regularity [10]. We adapt them to \(\mathcal {SROIQ_{@}}\) as follows.

Simplicity is defined as in \(\mathcal {SROIQ}\), ignoring the annotations. The set of non-simple roles \(\mathsf {N_R^n} \subseteq \mathsf {N_{R}} \) w.r.t. a \(\mathcal {SROIQ_{@}}\) ontology is defined recursively: \(t\in \mathsf {N_R^n} \) if t occurs on the right of an axiom of form (15) and either (1) \(\ell >1\) or (2) some non-simple role \(s\in \mathsf {N_R^n} \) occurs on the left of the axiom. All other role names are simple. We now require that only simple roles occur in R in number restrictions \(\mathord {\leqslant }n\,R.C\) and \(\mathord {\geqslant }n\,R.C\).
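The recursive definition of non-simple roles amounts to a least fixpoint computation, which can be sketched as follows. The sketch assumes that each role inclusion of form (15) has been reduced to a pair of role names (left-hand side list, right-hand side name), with annotations and inverse markers already stripped; the function name and the example role names are hypothetical.

```python
def non_simple_roles(axioms):
    """Compute the set of non-simple role names as a least fixpoint.

    axioms: iterable of (lhs, rhs), where lhs is a list of role names
    and rhs is a single role name (annotations are ignored).
    """
    axioms = list(axioms)
    non_simple = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in axioms:
            if rhs in non_simple:
                continue
            # Case (1): a proper composition on the left.
            # Case (2): some non-simple role occurs on the left.
            if len(lhs) > 1 or any(role in non_simple for role in lhs):
                non_simple.add(rhs)
                changed = True
    return non_simple


example_axioms = [
    (["partOf", "partOf"], "partOf"),   # transitivity makes partOf non-simple
    (["partOf"], "locatedIn"),          # locatedIn inherits non-simplicity
    (["owns"], "has"),                  # has stays simple
]
result = non_simple_roles(example_axioms)
```

Every role name not returned by the function is simple and may therefore occur in number restrictions.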

A \(\mathcal {SROIQ_{@}}\) ontology is regular if there is a strict partial order \(\prec \) on the set \(\mathsf {N_R^\pm } =\mathsf {N_{R}} \cup \{r^-\mid r\in \mathsf {N_{R}} \}\), such that

  (1) for all \(R\in \mathsf {N_R^\pm } \) and \(s\in \mathsf {N_{R}} \), we have \(s\prec R\) iff \(s^-\prec R\), and

  (2) for all role inclusion axioms of form (15), the inclusion \(R_1\circ \ldots \circ R_\ell \sqsubseteq T\) has one of the following forms:

    $$\begin{aligned} T\varvec{@}S\circ T\varvec{@}S&\sqsubseteq T\varvec{@}S&R_1\circ \ldots \circ R_{\ell -1}\circ T\varvec{@}S&\sqsubseteq T\varvec{@}S&r^-\varvec{@}S\sqsubseteq r\varvec{@}S\\ R_1\circ \ldots \circ R_\ell&\sqsubseteq T\varvec{@}S&T\varvec{@}S\circ R_2\circ \ldots \circ R_\ell&\sqsubseteq T\varvec{@}S \end{aligned}$$

    where \(S\in \mathbf {S}\), \(T\in \mathsf {N_R^\pm } \), \(r\in \mathsf {N_{R}} \), and \(R_1,\ldots ,R_\ell \in \mathbf {R}\) are of form \(R_1\varvec{@}S_1,\ldots ,R_\ell \varvec{@}S_\ell \) such that \(R_i\prec T\) for all \(i\in \{1,\ldots ,\ell \}\).

Note that we adopt the usual conditions from \(\mathcal {SROIQ}\) for (inverted) role names, and further require that cases with the same role T on both sides use the same specifier S. As for \(\mathcal {SROIQ}\), this condition can be verified in polynomial time by computing a minimal relation \(\prec \) that satisfies the conditions and checking if it is a strict partial order.
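The polynomial-time check mentioned above can be sketched as follows. This is one plausible reading under simplifying assumptions, not a definitive procedure: annotations are ignored (so the requirement that both occurrences of T use the same specifier S is not checked), roles are strings, and an inverse role is encoded by a trailing `-`. The helper names `inv`, `required_pairs`, and `is_regular` are hypothetical.

```python
from itertools import product


def inv(role):
    """Inverse of a role; inverses are encoded by a trailing '-'."""
    return role[:-1] if role.endswith("-") else role + "-"


def required_pairs(lhs, rhs):
    """Pairs (R, T) that must satisfy R < T for the axiom lhs ⊑ rhs."""
    if lhs == [rhs, rhs]:
        return set()                      # T∘T ⊑ T
    if not rhs.endswith("-") and lhs == [rhs + "-"]:
        return set()                      # r^- ⊑ r for a role name r
    if len(lhs) > 1 and lhs[-1] == rhs:
        body = lhs[:-1]                   # R1∘…∘R(ℓ-1)∘T ⊑ T
    elif len(lhs) > 1 and lhs[0] == rhs:
        body = lhs[1:]                    # T∘R2∘…∘Rℓ ⊑ T
    else:
        body = lhs                        # R1∘…∘Rℓ ⊑ T
    return {(role, rhs) for role in body}


def is_regular(axioms):
    """Build the least relation forced by the axioms and test irreflexivity."""
    order = set()
    for lhs, rhs in axioms:
        order |= required_pairs(lhs, rhs)
    changed = True
    while changed:
        changed = False
        new = {(inv(s), t) for s, t in order}                  # condition (1)
        new |= {(a, d) for (a, b), (c, d) in product(order, order) if b == c}
        if not new <= order:
            order |= new
            changed = True
    # A transitive relation is a strict partial order iff it is irreflexive.
    return all(a != b for a, b in order)


regular = is_regular([(["partOf", "partOf"], "partOf"), (["hasPart-"], "partOf")])
cyclic = is_regular([(["r", "s"], "s"), (["s", "r"], "r")])
```

In the second example, the two axioms force both r ≺ s and s ≺ r, so the transitive closure contains r ≺ r and the check fails.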

For reasoning, the step from \(\mathcal {ALCH_{@}}\) to \(\mathcal {SROIQ_{@}}\) leads to several difficulties. First, nominals and cardinality restrictions may lead to the entailment of equalities \(a\approx b\), which has consequences for annotation sets (e.g., \(A\varvec{@}\lfloor {c\mathrel {:}a}\rfloor \equiv A\varvec{@}\lfloor {c\mathrel {:}b}\rfloor \) in this case). For obtaining complexity upper bounds by transformation to standard DLs as in Sect. 4, we need to axiomatise such relationships. Second, nominals may be used to restrict the overall size of the domain, e.g., when stating \(\top \sqsubseteq \{a\}\). Besides the entailment of further equalities, this also changes the semantics of open specifiers (e.g., we obtain \(A\varvec{@}\lfloor {a\mathrel {:}a}\rfloor \sqsubseteq A\varvec{@}[{a\mathrel {:}a}]\) in this case). As before, this requires suitable axiomatisation in \(\mathcal {SROIQ}\). Either of these two effects may require exponentially many auxiliary axioms, leading to an N3ExpTime upper bound even for ground \(\mathcal {SROIQ_{@}}\). However, we will show an N2ExpTime upper bound as for \(\mathcal {SROIQ}\), which is tight.

Theorem 11

Satisfiability of ground \(\mathcal {SROIQ_{@}}\) ontologies is in N2ExpTime.

To prove this theorem, we first translate ground \(\mathcal {SROIQ_{@}}\) into an auxiliary DL, called \(\mathcal {SROIQ} _\approx \), and then show how to reason in this DL by an exponential reduction to \(\mathcal {C}^2\), the two-variable fragment with counting [18], which yields the desired N2ExpTime upper bound. The second part of the proof is split over several lemmas.

\(\mathcal {SROIQ} _\approx \), in addition to the usual \(\mathcal {SROIQ}\) axioms, supports concept inclusions of the form \(a\approx b \Rightarrow C\sqsubseteq D\) and role inclusions of the form \(a\approx b \Rightarrow R_1\circ \ldots \circ R_\ell \sqsubseteq T\). An axiom \(a\approx b \Rightarrow \alpha \) is satisfied by interpretation \(\mathcal {I}\) if either \(a^\mathcal {I}\ne b^\mathcal {I}\) or \(\mathcal {I}\models \alpha \).

The translation from a ground \(\mathcal {SROIQ_{@}}\) ontology \(\textit{KB}\) to a \(\mathcal {SROIQ} _\approx \) ontology \(\textit{KB}^\ddagger \) now proceeds as for ground \(\mathcal {ALCH_{@}}\), by replacing annotated concept names \(A\varvec{@}S\) by new names \(A_S\), and likewise for roles. However, we now introduce names \(A_S\in \mathsf {N_{C}} \) and \(r_S\in \mathsf {N_{R}} \) for all possible open and closed ground specifiers over the set of individual names in \(\textit{KB}\), as opposed to only those occurring in \(\textit{KB}\). We then add two families of axioms for capturing the aforementioned effects. First, to handle individual equality, for each \(A \in \mathsf {N_{C}} \) and \(r \in \mathsf {N_{R}} \), we add axioms \(a\approx b \Rightarrow A_S\sqsubseteq A_T\) and \(a\approx b \Rightarrow r_S\sqsubseteq r_T\) for every pair S, T of ground specifiers that are either both open or both closed, and where the sets of pairs in S and T are the same when replacing each occurrence of a by b. Second, to handle bounded domain size, we consider an individual name z not occurring in \(\textit{KB}\). Entailments of the form \(z\approx a\) will be used to detect the bounded domain case. We can formalise this effect by axioms \(z\approx a\Rightarrow \top \sqsubseteq \bigsqcup _{c\in \mathsf {N_I^{\textit{KB}}}} \{c\}\) for all \(a\in \mathsf {N_I^{\textit{KB}}} \), where \(\mathsf {N_I^{\textit{KB}}} \) is the set of individual names occurring in \(\textit{KB}\). To handle specifiers in this situation, we add axioms of the form

$$\begin{aligned} z\approx a \Rightarrow A_S&\sqsubseteq \bigsqcup _{T\supseteq _c S} A_T&\text {for all } A\in \mathsf {N_{C}} \text { in } \textit{KB}\text { and } a\in \mathsf {N_I^{\textit{KB}}} \end{aligned}$$
(16)

where S is a ground open specifier and \(T\supseteq _c S\) holds whenever T is a ground closed specifier that contains all attribute–value pairs in S. We would need an axiom similar to (16) for roles, but this would require disjunctions of arbitrary roles, which are not supported in \(\mathcal {SROIQ}\). However, since these axioms are only necessary when all elements in the domain of interpretation are the interpretation of some individual name in \(\mathsf {N_I^{\textit{KB}}} \), we can instead use concept inclusions as follows:

$$\begin{aligned} z\approx a \Rightarrow \{b\}\sqcap \exists r_S.\{c\}&\sqsubseteq \bigsqcup _{T\supseteq _c S} \exists r_T.\{c\}&\text {for all } r\in \mathsf {N_{R}} \text { in } \textit{KB}\text { and } a,b,c\in \mathsf {N_I^{\textit{KB}}} \end{aligned}$$
(17)

where S and T are as above. Intuitively, this axiom states that any fact \(r_S(b,c)\) entails some fact of the form \(r_T(b,c)\). Finally, as previously for \(\mathcal {ALCH_{@}}\), we also add all axioms of the form (12) and (13). This finishes our construction of \(\textit{KB}^\ddagger \).
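To illustrate the first family of axioms in this construction (those handling individual equality), the following sketch enumerates the pairs of ground specifiers that coincide once a is replaced by b. It assumes a hypothetical encoding in which a ground specifier is a tuple of its kind (`"open"` or `"closed"`) and a `frozenset` of attribute–value pairs over individual names; the function names are likewise hypothetical.

```python
from itertools import permutations


def substitute(pairs, a, b):
    """Replace every occurrence of individual a by b in a set of attribute-value pairs."""
    def swap(name):
        return b if name == a else name
    return frozenset((swap(attr), swap(val)) for attr, val in pairs)


def equality_pairs(specifiers, a, b):
    """Ordered pairs (S, T) of distinct specifiers of the same kind
    whose sets of pairs coincide after replacing a by b."""
    return [
        (s, t)
        for s, t in permutations(specifiers, 2)
        if s[0] == t[0] and substitute(s[1], a, b) == substitute(t[1], a, b)
    ]


closed_a = ("closed", frozenset({("c", "a")}))
closed_b = ("closed", frozenset({("c", "b")}))
open_a = ("open", frozenset({("c", "a")}))

pairs = equality_pairs([closed_a, closed_b, open_a], "a", "b")
```

For each returned pair (S, T) and each concept name A, one would add the axiom \(a\approx b \Rightarrow A_S\sqsubseteq A_T\), and likewise for role names. In the example, only the two closed specifiers are paired, in both directions, since the open specifier has no open counterpart.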

Lemma 1

For any ground \(\mathcal {SROIQ_{@}}\) ontology \(\textit{KB}\), the \(\mathcal {SROIQ} _\approx \) ontology \(\textit{KB}^\ddagger \) is equisatisfiable and can be constructed in exponential time.

The proof is analogous to the proof of Theorem 3 with one exception: when constructing models we do not introduce a fresh, unnamed domain element \(\star \), but rather use \(z^\mathcal {J}\) instead (which may or may not be named).

To complete the proof of Theorem 11, it remains to show that satisfiability checking for the exponentially larger \(\textit{KB}^\ddagger \) can still be done in nondeterministic double exponential time w.r.t. the size of \(\textit{KB}\). To this end, we can define simplicity and regularity for \(\mathcal {SROIQ} _\approx \) as for \(\mathcal {SROIQ_{@}} \), by ignoring the additional \(\approx \)-prefixes and disregarding any condition related to annotations. In particular, we obtain a strict partial order \(\prec \), as before, and, since \(\textit{KB}^\ddagger \) only contains role inclusions translated directly from those in \(\textit{KB}\), it also satisfies the regularity restrictions. We define the \(\circ \)-depth of a regular \(\mathcal {SROIQ} _\approx \) ontology \(\textit{KB}_\approx \) to be the maximal number k for which there is a chain of (inverted) roles \(R_1\prec R_1'\prec \ldots \prec R_k\prec R_k'\), such that \(\textit{KB}_\approx \) contains complex role inclusions with \(R_i\) occurring as one of several roles on the left and \(R_i'\) on the right. Intuitively speaking, the \(\circ \)-depth bounds the number of axioms with \(\circ \) along paths of \(\prec \). Clearly, the \(\circ \)-depth of \(\textit{KB}^\ddagger \) is the same as for \(\textit{KB}\), in spite of the exponential increase in the number of axioms.

Lemma 2

Checking satisfiability of a \(\mathcal {SROIQ} _\approx \) ontology \(\textit{KB}_\approx \) of size s and \(\circ \)-depth d is possible in NTIME \((2^{p(s\cdot 2^{q(d)})})\), where p, q are some fixed polynomial functions.

In particular, if an ontology is of size \(O(2^n)\) but retains a \(\circ \)-depth in O(n), then reasoning is still in N2ExpTime. To show this, we adapt the translation from \(\mathcal {SROIQ}\) to \(\mathcal {SHOIQ}\) as given by Kazakov [12], which is based on representing the effects of complex role inclusion axioms using concept inclusions. As a first step, one constructs, for any non-simple role expression R, a nondeterministic finite automaton \(\mathcal {B}_R\) that describes the regular language of all sequences of roles that entail R [10]. We modify the known construction for \(\mathcal {SROIQ} _\approx \) by allowing transitions in this automaton to be labelled not just by role expressions S, but also by conditional expressions \(a\approx b\Rightarrow S\). The idea is that these transitions are only available if the precondition holds. By a slight adaptation of a similar observation of Horrocks and Sattler [11, Lemma 11], we obtain:

Lemma 3

For a \(\mathcal {SROIQ} _\approx \) ontology \(\textit{KB}_\approx \) and a role expression R, the size of \(\mathcal {B}_R\) is bounded exponentially in the \(\circ \)-depth of \(\textit{KB}_\approx \).

Kazakov considers a normal form of axioms, which we can construct analogously for \(\mathcal {SROIQ} _\approx \) [12, Table 1]. We can ensure that conditions \(a\approx b\) occur in concept inclusions only if they have the form \(a\approx b\Rightarrow A\sqsubseteq B\) with \(A,B\in \mathsf {N_{C}} \). The automaton \(\mathcal {B}(R)\) is then used to replace every axiom of the form \(A\sqsubseteq \forall R.B\) (which never has \(\approx \)-conditions) by the following axioms:

$$\begin{aligned} A&\sqsubseteq A^R_q&q \text { starting state of } \mathcal {B}(R) \end{aligned}$$
(18)
$$\begin{aligned} a\approx b\Rightarrow A^R_{q_1}&\sqsubseteq \forall S.A^R_{q_2}&q_1\mathop {\rightarrow }\limits ^{a\approx b\Rightarrow S}q_2 \text { a transition of } \mathcal {B}(R) \end{aligned}$$
(19)
$$\begin{aligned} A^R_q&\sqsubseteq B&q \text { a final state of } \mathcal {B}(R) \end{aligned}$$
(20)

where the condition \(a\approx b\) in axioms (19) can be omitted if it is not given. The resulting \(\mathcal {SROIQ} _\approx \) ontology still contains axioms with preconditions \(a\approx b\), but no more \(\circ \). Every normalised \(\mathcal {SROIQ} \) axiom \(\alpha \) can be translated into a \(\mathcal {C}^2\) formula \(\mathsf {c2}(\alpha )\) as shown in [12, Table 1]. A \(\mathcal {SROIQ} _\approx \) axiom of the form \(a\approx b\Rightarrow \alpha \) accordingly can be translated as \((\exists ^{=1}x.A_a(x)\wedge A_b(x))\rightarrow \mathsf {c2}(\alpha )\). This completes the proof of Theorem 11.
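The replacement of an axiom \(A\sqsubseteq \forall R.B\) by axioms (18)–(20) can be sketched as a simple generator over the transitions of the automaton. The sketch assumes a hypothetical encoding of the automaton as a start state, a list of final states, and a list of transitions `(q1, condition, S, q2)`, where `condition` is either `None` or a string such as `"a≈b"`; axioms are produced as plain strings for readability.

```python
def rewrite_forall(A, R, B, start, finals, transitions):
    """Produce the axioms replacing A ⊑ ∀R.B, given an automaton for R.

    transitions: list of (q1, condition, S, q2); condition is None
    for unconditional transitions.
    """
    def aux(state):
        # Name of the auxiliary concept for A, R, and an automaton state.
        return f"{A}^{R}_{state}"

    axioms = [f"{A} ⊑ {aux(start)}"]                              # axiom (18)
    for q1, condition, S, q2 in transitions:
        prefix = f"{condition} ⇒ " if condition else ""
        axioms.append(f"{prefix}{aux(q1)} ⊑ ∀{S}.{aux(q2)}")      # axiom (19)
    axioms += [f"{aux(q)} ⊑ {B}" for q in finals]                 # axiom (20)
    return axioms


axioms = rewrite_forall(
    "A", "r", "B",
    start="q0",
    finals=["q1"],
    transitions=[("q0", None, "r", "q1"), ("q1", "a≈b", "s", "q1")],
)
```

The number of generated axioms is linear in the size of the automaton, so by Lemma 3 it is bounded exponentially in the \(\circ \)-depth only.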

We can lift this result to non-ground ontologies without increasing complexity:

Theorem 12

Satisfiability of \(\mathcal {SROIQ_{@}}\) ontologies is N2ExpTime-complete.

Proof

Hardness is immediate given the hardness of \(\mathcal {SROIQ}\). The proof of membership uses the same grounding approach as the proof of Theorem 4, which is easily seen to be correct. This grounded ontology \(\textsf {ground}(\textit{KB})\) is exponentially larger than the input \(\textit{KB}\), but the regularity conditions for \(\mathcal {SROIQ_{@}}\) ensure that it has the same (linearly bounded) \(\circ \)-depth. Moreover, while the transformation used for axiomatising ground \(\mathcal {SROIQ_{@}}\) ontologies is also exponential, it is polynomial in the number of possible ground annotation sets; this number remains single exponential w.r.t. the size of \(\textit{KB}\), even when considering \(\textsf {ground}(\textit{KB})\). Therefore, we find that the auxiliary \(\mathcal {SROIQ} _\approx \) ontology \(\textsf {ground}(\textit{KB})^\ddagger \) is still only exponential w.r.t. \(\textit{KB}\) while having a polynomial \(\circ \)-depth. The claimed complexity therefore follows from Lemma 2.    \(\square \)

7 Conclusion

Current graph-based knowledge representation formalisms suffer from an inability to handle meta-data in the form of sets of attribute–value pairs. These limitations show up even when dealing with purely abstract data and are orthogonal to datatype support in the formalisms. We therefore believe that KR formalisms must urgently take up the challenge of incorporating annotation structures into their expressive repertoire.

Our family of attributed description logics represents a potential solution in the context of DLs, and covers attributed \(\mathcal {SROIQ}\), the DL underlying OWL 2 DL. In contrast to our recent findings on rule-based logics supporting similar annotations, attributed DLs often incur an increased reasoning complexity due to the open-world nature of DLs. We have presented a grounding-based decision procedure and identified the special cases of ground ontologies and structural restrictions on axioms, for which this overhead can be avoided. In particular, this ensures the tractability of attributed \(\mathcal {EL}\).

More work is now needed regarding practical reasoning algorithms in attributed DLs. We believe that similar approaches to those used for reasoning with nominal schemas might be effective here. A related practical issue is the syntactic integration of the new features in OWL. The existing annotation mechanism of OWL 2 [17] can be used to store attribute–value sets, e.g., of assertions, but is not general enough to capture our extended syntax for arbitrary axioms. Finally, there are certainly many further expressive mechanisms related to modelling with annotations that should be considered and investigated in future studies of this new field.