DIG 2.0 Proposal for a Query Interface

This version:
http://www.sts.tu-harburg.de/~al.kaplunova/dig-query-interface.html
Last change: 
2007/01/18 
Authors:
Alissa Kaplunova, Hamburg University of Technology
Ralf Möller, Hamburg University of Technology

Abstract

A proposal for an interface for ABox queries as part of DIG 2.0.


Motivation

In many practical application systems based on DLs, a powerful ABox query language is one of the main requirements.  Answering queries in DL systems goes beyond query answering in relational databases. In databases, query answering amounts to model checking (a database instance is seen as a model of the conceptual schema). Query answering w.r.t. TBoxes and ABoxes must take all models into account, and thus requires deduction. The aim is to define expressive but decidable query languages. 

Well-known classes of queries such as conjunctive queries and unions of conjunctive queries are topics of current investigation in this context. In the literature, two different semantics for these kinds of queries are discussed. In standard conjunctive queries, variables are bound to (possibly anonymous) domain objects. In so-called grounded conjunctive queries, variables are bound to named domain objects (object constants). However, in grounded conjunctive queries the standard semantics can be obtained for so-called tree-shaped queries by using existential restrictions in query atoms.

For an ABox query language as part of DIG 2.0 we do not commit to a certain semantics. The semantics depends on the reasoner. The standard semantics is implemented in the system QuOnto [QuOnto] (for a Description Logic of the DL-Lite family) whereas the grounded semantics is implemented in RacerPro [Wessel and Möller 05], [Wessel and Möller 06], and also KAON2 [KAON2].  For all systems, an interface corresponding to the DIG 2.0 standard will be developed. A possibility to identify the fragment of the query language supported by a reasoner is foreseen in the DIG 2.0 protocol.

DIGDescription Response

If a DIGDescribe request is sent to a DIG server, the server must respond with a DIGDescription that contains the name of the DIG server, its version, an optional identification message, flags for annotations and imports, and a set of names of supported requests. This set can also contain the request element Retrieve if the server provides query answering. Additionally, the supportsQueryLanguage attribute of the DIGDescription response can be specified to identify the fragment of the query language supported by the reasoner. Possible values are, e.g., "cq" (conjunctive queries), "ucq" (unions of conjunctive queries), "focq" (first order conjunctive queries), "ugcq" (unions of grounded conjunctive queries), and so on. An example of a reasoner description:
<DIGDescription 
name="Racer"
version="1.9.5"
message="Racer running on localhost"
supportsLanguage="SHIQ(D)"
supportsAnnotations="true"
supportsImports="true"
supportsQueryLanguage="ugcq">
<SupportedRequest requestName="Retrieve"/>
<SupportedRequest requestName="..."/>
 ...
</DIGDescription>
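A client could inspect such a description before posing queries. The following Python sketch is illustrative only and not part of the proposal: the helper name supports_query_language is hypothetical, the XML is the example above without the elided entries, and the attribute value is compared literally, without modelling containment between query language fragments.

```python
import xml.etree.ElementTree as ET

# The example description from above, without the elided entries.
DESCRIPTION = """
<DIGDescription
    name="Racer"
    version="1.9.5"
    message="Racer running on localhost"
    supportsLanguage="SHIQ(D)"
    supportsAnnotations="true"
    supportsImports="true"
    supportsQueryLanguage="ugcq">
  <SupportedRequest requestName="Retrieve"/>
</DIGDescription>
"""


def supports_query_language(description_xml, fragment):
    """Hypothetical helper: True if the server lists the Retrieve request
    and advertises exactly the given query language fragment."""
    root = ET.fromstring(description_xml.strip())
    requests = {r.get("requestName") for r in root.iter("SupportedRequest")}
    return "Retrieve" in requests and root.get("supportsQueryLanguage") == fragment


print(supports_query_language(DESCRIPTION, "ugcq"))
print(supports_query_language(DESCRIPTION, "cq"))
```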

Ask Language

A query consists of a head and a body. The head lists variables for which the user would like to compute bindings. The body consists of query atoms (see below) in which all variables from the head must be mentioned. If the body contains additional variables, they are seen as existentially quantified. A query answer is a set of tuples representing bindings for the variables mentioned in the head.

Query atoms can be concept query atoms (denoted with ConceptQueryAtom), role query atoms (denoted with RoleQueryAtom), same-as query atoms (denoted with SameAsQueryAtom), different-from query atoms (denoted with DifferentFromQueryAtom) as well as concrete domain query atoms (denoted with ConcreteDomainQueryAtom). The latter are introduced to provide support for querying the concrete domain part of a knowledge base.

The Retrieve request is derived from the RequestToOntology class (which has an ontologyURI parameter). Additionally, it has an attribute queryID as unique identification of the query (important for iterative query answering and query management) and takes a query head (denoted with QueryHead) and a query body (denoted with QueryBody). Within the QueryHead tag, ABox individuals and variables (denoted as QueryVariable) can be used. Variables are bound to those individuals which satisfy the query. For boolean queries, the head must be empty.

Complex queries are built from query atoms using boolean constructs for conjunction (QueryObjectIntersectionOf), union (QueryObjectUnionOf) and negation (QueryObjectComplementOf); for the latter, negation as failure semantics is assumed, for instance. Concept query atoms consist of variables (or individuals) and complex concept expressions. Role query atoms consist of at least two identifiers for variables (or individuals) followed by a role expression.

The following conjunctive query with id q1 asks for all individuals of the concept woman which have female children. The requested knowledge base is identified by means of a URI:

 <Retrieve ontologyURI="myOntology.owl" queryID="q1">
<QueryHead>
<QueryVariable URI="#x"/>
</QueryHead>
<QueryBody>
<QueryObjectIntersectionOf>
<ConceptQueryAtom>
<QueryVariable URI="#x"/>
<owl:ObjectIntersectionOf>
<owl:OWLClass owl:URI="#woman"/>
<owl:ObjectSomeValuesFrom>
<owl:ObjectProperty owl:URI="#hasChild"/>
<owl:OWLClass owl:URI="#female"/>
</owl:ObjectSomeValuesFrom>
</owl:ObjectIntersectionOf>
</ConceptQueryAtom>
</QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
The following query q2 consists of a conjunction of a concept query atom and a role query atom. It returns all mother-child pairs (bound to variables x and y):
<Retrieve ontologyURI="myOntology.owl" queryID="q2">
<QueryHead>
<QueryVariable URI="#x"/>
<QueryVariable URI="#y"/>
</QueryHead>
<QueryBody>
  <QueryObjectIntersectionOf>
<ConceptQueryAtom>
  <QueryVariable URI="#x"/>
<owl:OWLClass owl:URI="#woman"/>
</ConceptQueryAtom>
<RoleQueryAtom>
<QueryVariable URI="#x"/>
<QueryVariable URI="#y"/>
<owl:ObjectProperty owl:URI="#hasChild"/>
</RoleQueryAtom>
</QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
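A request of this shape can also be assembled programmatically. The Python sketch below is an illustration under assumptions, not part of the proposal: the helper name retrieve_pairs and its parameters are hypothetical, and the owl: prefix is written verbatim without a namespace declaration, mirroring the examples in this document.

```python
import xml.etree.ElementTree as ET


def retrieve_pairs(ontology_uri, query_id, concept_uri, role_uri):
    """Hypothetical helper: build a Retrieve request shaped like q2."""
    retrieve = ET.Element("Retrieve", {"ontologyURI": ontology_uri, "queryID": query_id})

    # Head: the two variables for which bindings are requested.
    head = ET.SubElement(retrieve, "QueryHead")
    for var in ("#x", "#y"):
        ET.SubElement(head, "QueryVariable", {"URI": var})

    # Body: conjunction of a concept query atom and a role query atom.
    body = ET.SubElement(retrieve, "QueryBody")
    conj = ET.SubElement(body, "QueryObjectIntersectionOf")

    concept_atom = ET.SubElement(conj, "ConceptQueryAtom")
    ET.SubElement(concept_atom, "QueryVariable", {"URI": "#x"})
    # Prefixed names are emitted as-is; no xmlns:owl declaration is added here.
    ET.SubElement(concept_atom, "owl:OWLClass", {"owl:URI": concept_uri})

    role_atom = ET.SubElement(conj, "RoleQueryAtom")
    ET.SubElement(role_atom, "QueryVariable", {"URI": "#x"})
    ET.SubElement(role_atom, "QueryVariable", {"URI": "#y"})
    ET.SubElement(role_atom, "owl:ObjectProperty", {"owl:URI": role_uri})
    return retrieve


request = retrieve_pairs("myOntology.owl", "q2", "#woman", "#hasChild")
print(ET.tostring(request, encoding="unicode"))
```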
The  boolean query q3 asks if there are any individuals of the concept woman in the current ABox:
 <Retrieve ontologyURI="myOntology.owl" queryID="q3">
<QueryHead>
  </QueryHead>
<QueryBody>
<QueryObjectIntersectionOf>
<ConceptQueryAtom>
<QueryVariable URI="#x"/>
<owl:OWLClass owl:URI="#woman"/>
</ConceptQueryAtom>
</QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
ABox individuals can also be used in a query instead of variables, as illustrated by the following (boolean) query:
<Retrieve ontologyURI="myOntology.owl" queryID="q4">
<QueryHead>
  </QueryHead>
<QueryBody>
<QueryObjectIntersectionOf>
<ConceptQueryAtom>
<owl:Individual owl:URI="#eve"/>
<owl:OWLClass owl:URI="#woman"/>
</ConceptQueryAtom>
</QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
As mentioned above, within the Retrieve statement we can build unions of conjunctive queries using the operator QueryObjectUnionOf in front of the conjunctive queries:
 <QueryObjectUnionOf>
<QueryObjectIntersectionOf>
[...]
</QueryObjectIntersectionOf>
[...]
<QueryObjectIntersectionOf>
[...]
</QueryObjectIntersectionOf>
</QueryObjectUnionOf>
The QueryObjectComplementOf operator can be used in front of query body atoms, conjunctive queries and unions of conjunctive queries. For the semantics, see below.
 <QueryObjectComplementOf> 
<QueryObjectUnionOf>
<QueryObjectComplementOf>
<QueryObjectIntersectionOf>
[...]
<QueryObjectComplementOf>
[...]
</QueryObjectComplementOf>
</QueryObjectIntersectionOf>
</QueryObjectComplementOf>
</QueryObjectUnionOf>
</QueryObjectComplementOf>
If a queried reasoner supports retrieving concrete domain datatype values, this can be done by means of concrete domain query atoms (denoted as ConcreteDomainQueryAtom), which consist of variables (or individuals), concrete domain variables (or values) and attributes. We refer to [DIG 2.0 CD Proposal] for details on the concrete domain interface for DIG. Please note that we distinguish between concrete domain variables used in the query language (denoted as ConcreteDomainQueryVariable) and concrete domain variables (cdvar) used in the tell language. Furthermore, we use the new tag PredicateQueryAtom as well as the tags predicate, lambda and op (introduced in [DIG 2.0 CD Proposal]), or just op, to pose certain constraints between concrete domain objects (values or variables).

If non-functional datatype properties are used instead of attributes, certain restrictions are imposed on the use of concrete domain query variables. In order to avoid ambiguity, we do not allow concrete domain query variables within the PredicateQueryAtom statement and forbid multiple occurrences of concrete domain query variables in different ConcreteDomainQueryAtom statements.

Let age be a concrete domain attribute of type integer. The following query retrieves all adults as well as their ages:
<Retrieve ontologyURI="myOntology.owl" queryID="q5">
<QueryHead>
  <QueryVariable URI="#x"/>
<ConcreteDomainQueryVariable URI="#y"/>
</QueryHead>
<QueryBody>
<QueryObjectIntersectionOf>
<ConcreteDomainQueryAtom>
<QueryVariable URI="#x"/>
<ConcreteDomainQueryVariable URI="#y"/>
<owl:DataProperty owl:URI="#age"/>
</ConcreteDomainQueryAtom>
<PredicateQueryAtom>
<ConcreteDomainQueryVariable URI="#y"/>
<owl:Constant owl:datatypeURI="&xsd;integer">
18
</owl:Constant>
<op name=">"/>
</PredicateQueryAtom>  
</QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
In the following example, we ask for all individuals which have the same age (implicitly expressed by using y twice):
<Retrieve ontologyURI="myOntology.owl" queryID="q6">
<QueryHead>
  <QueryVariable URI="#x"/>
<QueryVariable URI="#z"/>  
<ConcreteDomainQueryVariable URI="#y"/>
</QueryHead>
<QueryBody>
<QueryObjectIntersectionOf>
<ConcreteDomainQueryAtom> 
<QueryVariable URI="#x"/>
  <ConcreteDomainQueryVariable URI="#y"/>
  <owl:DataProperty owl:URI="#age"/>
</ConcreteDomainQueryAtom>
<ConcreteDomainQueryAtom>
<QueryVariable URI="#z"/>
<ConcreteDomainQueryVariable URI="#y"/>
<owl:DataProperty owl:URI="#age"/>
</ConcreteDomainQueryAtom>  
</QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
A SameAsQueryAtom can be used to enforce a binding of a variable (e.g., <QueryVariable URI="#betty"/>) to an individual (e.g., <owl:Individual owl:URI="#betty"/>), or to enforce that two non-injective variables are bound to the same individual. The following query q7 uses it to retrieve all humans who love themselves:
<Retrieve ontologyURI="myOntology.owl" queryID="q7">
<QueryHead>
<QueryVariable URI="#x"/>
</QueryHead>
<QueryBody>
<QueryObjectIntersectionOf>
<RoleQueryAtom>
<QueryVariable URI="#x"/>
<QueryVariable URI="#y"/>
<owl:ObjectProperty owl:URI="#loves"/>
</RoleQueryAtom>
<ConceptQueryAtom>
<QueryVariable URI="#y"/>
<owl:OWLClass owl:URI="#human"/>
</ConceptQueryAtom>
<SameAsQueryAtom>
<QueryVariable URI="#x"/>
<QueryVariable URI="#y"/>
</SameAsQueryAtom>
  </QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
Instead of SameAsQueryAtom, one may use DifferentFromQueryAtom to search for a human who does not love himself.

Sometimes an explicit projection operator in the query body is required in order to reduce the "dimensionality" of an intermediate tuple set while the query answer is being computed. This operator is particularly important in combination with negation (QueryObjectComplementOf). We refer to [Wessel and Möller 05], [Wessel and Möller 06] for the semantics and a motivating example. For DIG we propose a tag QueryProject which can be used in any position within a query body and contains a head and a body.

The following query returns all mothers which do not have a known (i.e. explicitly modeled in an ABox) child:

<Retrieve ontologyURI="myOntology.owl" queryID="q8">
<QueryHead>
<QueryVariable URI="#x"/>
</QueryHead>
<QueryBody>
<QueryObjectIntersectionOf>
<ConceptQueryAtom>
<QueryVariable URI="#x"/>
<owl:OWLClass owl:URI="#mother"/>
</ConceptQueryAtom>
<QueryObjectComplementOf>
<QueryProject>
<QueryHead>
<QueryVariable URI="#x"/>
</QueryHead>
<QueryBody>
<RoleQueryAtom>
<QueryVariable URI="#x"/>
<QueryVariable URI="#y"/>
<owl:ObjectProperty owl:URI="#hasChild"/>
</RoleQueryAtom>
</QueryBody>
</QueryProject>
</QueryObjectComplementOf>
</QueryObjectIntersectionOf>
</QueryBody>
</Retrieve>
If a reasoner has to deal with large result sets for queries, iterative query answering can help to improve performance: result sets can be retrieved iteratively in small chunks of tuples. To support incremental loading of the answer, the Retrieve statement can have an optional attribute ntuples, set to the maximum number of tuples expected to be returned. If ntuples is not specified, all tuples have to be returned. In addition, DIG 2.0 supports instructions to let a query answering engine compute results "proactively" to provide faster retrieval of subsequent chunks of tuples. This can be achieved by setting the optional mode attribute:
<Retrieve ontologyURI="myOntology.owl" queryID="q9" ntuples="10" mode="proactive">
<QueryHead>
[...]
</QueryHead>
<QueryBody> 
[...]
</QueryBody>
</Retrieve>
The statement Retrieve without head and body is used to retrieve the next tuple(s) for a particular query identified with queryID. The value of queryID can be a query id or an answer set id (asID):
<Retrieve ontologyURI="myOntology.owl" queryID="as9999" ntuples="5"/>

Tell Language

In order to tell the reasoner that no more tuples will be requested, we propose the request ReleaseQuery, which has an attribute queryID. The reasoner should respond to it with the Confirmation response.
<ReleaseQuery queryID="q5"/>

Response Syntax

The QueryAnswers response to a Retrieve request has an attribute queryID which corresponds to the id of the submitted query. The answer set id (asID) will be generated by the reasoner. The response contains tuples of bindings for the variables mentioned in the query head. The response head is the same as the head of the corresponding query. The head is inserted just for convenience; the reasoner must not reorder the components of bindings. For example, the following answer is returned for the query with the id q2 posed above:
 <QueryAnswers queryID="q2" asID="asid1000">
<QueryHead>
<QueryVariable URI="#x"/>
<QueryVariable URI="#y"/>
</QueryHead> 
<Binding>
<owl:Individual owl:URI="#mary"/>
<owl:Individual owl:URI="#betty"/>
</Binding>
<Binding>
<owl:Individual owl:URI="#susan"/>
<owl:Individual owl:URI="#peter"/>
</Binding>
</QueryAnswers>
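Because the order of the binding components follows the head, a client can pair them up positionally. The following Python sketch shows one way to do this. It is an assumption-laden illustration: the helper name read_bindings is hypothetical, and an xmlns:owl declaration with a placeholder namespace URI is added so that a standard XML parser accepts the owl: prefix.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI (assumption); the proposal does not state one.
OWL_NS = "http://example.org/owl#"

ANSWER = f"""
<QueryAnswers xmlns:owl="{OWL_NS}" queryID="q2" asID="asid1000">
  <QueryHead>
    <QueryVariable URI="#x"/>
    <QueryVariable URI="#y"/>
  </QueryHead>
  <Binding>
    <owl:Individual owl:URI="#mary"/>
    <owl:Individual owl:URI="#betty"/>
  </Binding>
  <Binding>
    <owl:Individual owl:URI="#susan"/>
    <owl:Individual owl:URI="#peter"/>
  </Binding>
</QueryAnswers>
"""


def read_bindings(answer_xml):
    """Hypothetical helper: map head variables to bound individuals,
    relying on the positional order of the binding components."""
    root = ET.fromstring(answer_xml.strip())
    head = [v.get("URI") for v in root.find("QueryHead")]
    uri_attr = f"{{{OWL_NS}}}URI"  # namespace-qualified attribute name
    rows = []
    for binding in root.findall("Binding"):
        values = [child.get(uri_attr) for child in binding]
        rows.append(dict(zip(head, values)))
    return rows


rows = read_bindings(ANSWER)
for row in rows:
    print(row)
```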
In the case of concrete domain queries, concrete domain variables (and values) can be returned. Therefore, for concrete domain query variables we introduce ConcreteDomainBinding, which consists of a concrete domain variable name and possibly a value (if it can be uniquely determined). For example, the response to the query q5 can be the following:
<QueryAnswers queryID="q5">
<QueryHead>
  <QueryVariable URI="#x"/>
<ConcreteDomainQueryVariable URI="#y"/>
</QueryHead>
<Binding>
  <owl:Individual owl:URI="#mary"/>
<ConcreteDomainBinding>  
<cdvar name="age_mary"/>  
<owl:Constant owl:datatypeURI="&xsd;integer">
30
</owl:Constant>
</ConcreteDomainBinding>
</Binding>
<Binding>
  <owl:Individual owl:URI="#eve"/>
<ConcreteDomainBinding>  
<cdvar name="age_eve"/>
</ConcreteDomainBinding>
</Binding>
</QueryAnswers>
The response to a boolean query is a set containing a single empty binding for true, or an empty set for false.
True:
<QueryAnswers queryID="...">
<QueryHead>
[...]
</QueryHead>
<Binding>
</Binding>
</QueryAnswers>

False:
<QueryAnswers queryID="...">
<QueryHead>
[...]
</QueryHead>
</QueryAnswers>
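Under this convention, a client only needs to check whether a Binding element is present. A minimal Python sketch follows. The helper name boolean_answer is hypothetical, and the elided heads are written as empty QueryHead elements, since boolean queries have an empty head.

```python
import xml.etree.ElementTree as ET

TRUE_ANSWER = """
<QueryAnswers queryID="q3">
  <QueryHead/>
  <Binding>
  </Binding>
</QueryAnswers>
"""

FALSE_ANSWER = """
<QueryAnswers queryID="q3">
  <QueryHead/>
</QueryAnswers>
"""


def boolean_answer(answer_xml):
    """Hypothetical helper: a boolean query holds if and only if the
    response contains at least one (empty) Binding element."""
    root = ET.fromstring(answer_xml.strip())
    return root.find("Binding") is not None


print(boolean_answer(TRUE_ANSWER))
print(boolean_answer(FALSE_ANSWER))
```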

Within the bindings statement, the tag notifier can be used for reasoner-specific messages. For example, in incremental query answering the reasoner can report that the given tuple is the last one:

<QueryAnswers queryID="...">
<QueryHead>
[...]
</QueryHead>
<Binding>
[...]
</Binding>
<notifier message="last tuple"/>
</QueryAnswers>

Other messages are possible. For instance, a reasoner could indicate that subsequent requests for tuples will require considerably more resources. Motivated by this example, a reasoner might decide to return fewer tuples than requested with the attribute ntuples (see above).
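Iterative retrieval and the notifier can be combined in a simple client loop. The Python sketch below is only an illustration under stated assumptions: send is a hypothetical transport function that takes a request string and returns the response string, the fake server and its chunks are made up for the demonstration, and the loop assumes the reasoner signals the end with the "last tuple" message shown above (notifier messages are reasoner-specific).

```python
import xml.etree.ElementTree as ET


def fetch_all(send, ontology_uri, first_request, ntuples):
    """Collect all Binding elements of a query, chunk by chunk."""
    bindings = []
    request = first_request
    while True:
        answer = ET.fromstring(send(request))
        chunk = answer.findall("Binding")
        bindings.extend(chunk)
        notifier = answer.find("notifier")
        if notifier is not None and notifier.get("message") == "last tuple":
            break  # the reasoner reported the last tuple
        if not chunk:
            break  # nothing was delivered, so stop asking
        # Follow-up request without head and body, addressed by answer set id.
        as_id = answer.get("asID")
        request = (f'<Retrieve ontologyURI="{ontology_uri}" '
                   f'queryID="{as_id}" ntuples="{ntuples}"/>')
    return bindings


def make_fake_server(chunks):
    """Stand-in for a DIG server: replays canned responses in order."""
    responses = iter(chunks)
    sent = []

    def send(request):
        sent.append(request)
        return next(responses)

    return send, sent


CHUNKS = [
    '<QueryAnswers queryID="q9" asID="as9999"><QueryHead/>'
    '<Binding/><Binding/></QueryAnswers>',
    '<QueryAnswers queryID="q9" asID="as9999"><QueryHead/>'
    '<Binding/><notifier message="last tuple"/></QueryAnswers>',
]
FIRST = ('<Retrieve ontologyURI="myOntology.owl" queryID="q9" ntuples="2" '
         'mode="proactive"><QueryHead/><QueryBody/></Retrieve>')

send, sent = make_fake_server(CHUNKS)
result = fetch_all(send, "myOntology.owl", FIRST, 2)
print(len(result), "bindings received in", len(sent), "requests")
```

Once the client has what it needs, it could additionally send a ReleaseQuery request so that the reasoner can free the answer set.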


[QuOnto]
A. Acciarri, D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, M. Palmieri, and R. Rosati. QuOnto: Querying Ontologies. In Proc. AAAI, 2005, pp. 1670–1671. QuOnto web page: http://www.dis.uniroma1.it/~quonto/index.htm
[Bechhofer et al. 03]
S. Bechhofer, R. Möller, and P. Crowther. The DIG Description Logic Interface. In Proceedings of the International Workshop on Description Logics (DL-2003), Rome, Italy, September 5-7, 2003.
[DIG1.0]
Sean Bechhofer. The DIG Description Logic Interface: DIG/1.0. http://dl-web.man.ac.uk/dig/2002/10/interface.pdf
[DIG1.1]
Sean Bechhofer. The DIG Description Logic Interface: DIG/1.1. http://dl-web.man.ac.uk/dig/2003/02/interface.pdf
[DIG 2.0 CD Proposal]
Alissa Kaplunova, Ralf Möller. DIG 2.0 Concrete Domain Interface Proposal. http://www.sts.tu-harburg.de/~al.kaplunova/dig-cd-interface.html
[KAON2]
B. Motik and U. Sattler. A Comparison of Reasoning Techniques for Querying Large Description Logic ABoxes. In Miki Hermann and Andrei Voronkov, (editors), Proc. of the 13th Int. Conf. on Logic for Programming Artificial Intelligence and Reasoning (LPAR 2006), (to appear) LNCS, Springer, 2006. KAON2 download page: http://kaon2.semanticweb.org/
[Wessel and Möller 05]
Michael Wessel and Ralf Möller. A High Performance Semantic Web Query Answering Engine. In I. Horrocks, U. Sattler, and F. Wolter, editors, Proc. International Workshop on Description Logics, 2005. http://www.sts.tu-harburg.de/~r.f.moeller/papers/2005/WeMo05a.pdf RacerPro web page: http://www.racer-systems.com
[Wessel and Möller 06]
Michael Wessel and Ralf Möller. A Flexible DL-based Architecture for Deductive Information Systems. In G. Sutcliffe, R. Schmidt, and S. Schulz, editors, Proc. IJCAR-06 Workshop on Empirically Successful Computerized Reasoning (ESCoR), pages 92-111, 2006. http://www.sts.tu-harburg.de/~r.f.moeller/papers/2006/WeMo06.pdf