More on additional semantic information from Enrico Franconi on 2004-07-12 (public-rdf-dawg@w3.org from July to September 2004)

From: Enrico Franconi <franconi@inf.unibz.it>
Date: Mon, 12 Jul 2004 23:53:05 +0200
To: public-rdf-dawg@w3.org
Message-Id: <DAC15AA0-D44D-11D8-B727-000A9575BDDE@inf.unibz.it>
Hi everybody.
I'd like to make my first contribution in the discussion about the 
handling of additional semantic information to a RDF-based query 
language. I don't wont to enter into the debate of the wording of it 
:-) but I want to give a technical contribution, which may help (or may 
not).

First of all, I'd like to point you to my recent paper [1], where we 
try to give a formal unified description of the different notions of 
rules and queries in the ontological world. I came late into the 
discussion of this group since I wanted first to conclude this unifying 
survey of the literature. What is relevant here is the important border 
(Theorem 3) that divides the querying of a pure database (say, a RDF 
graph or a completed RDFS graph) from the querying of a database given 
a background ontology. We have shown that the divide is quite relevant 
in several aspects:

1) In the case of simple query languages (like conjunctive queries), 
the absence of a background ontology makes all the different semantics 
to queries given in the literature equivalent. This is to be contrasted 
to the case of expressive query languages, and/or of a background 
ontology which requires to choose an appropriate semantics to the 
query. There are at least three different semantics to a query which 
give rise to different answers (axiom-based, logic-programming based, 
autoepistemic based).

2) In the case of simple query languages (like conjunctive queries), 
the absence of a background ontology makes query answering computable 
as model checking: i.e., it is possible to check the validity of the 
query over a *unique* form of completion of the database with some sort 
of "matching" procedure; this is not true in the presence of a 
background ontology (see example below).

The points above are rather technical, but the conclusion to me is that 
we should cross that border with extreme care. Basically, if we 
restrict ourself to a simple query language like the one of conjunctive 
queries, allowing for a non-trivial background ontology (such as, e.g., 
an ontology written in OWL-lite) destroys the "minimal model property" 
of the system. This means that a query answering procedure considering 
only such a minimal model (this would be for example the closure of a 
RDFS graph) can not work anymore, and reasoning procedures should 
involve in a clever way all the models (exponentially or possibly 
infinitely many) of the system: that is deduction wrt model checking. 
Of course this is not impossible, but of course the technologies 
involved will be completely different and they can not be based on some 
sort of "matching", like it happens now for the RDF(S) query answering 
systems.

The way I understand it, the reason why all the work that has been done 
in querying RDF(S) databases could be based on some notion of matching, 
is that it is possible to compute the *unique* completion (i.e., the 
deductive closure) over which then the validity of queries is checked. 
This is not true anymore even for slightly more complex frameworks.

Let me make an example (forgive me the sloppyness in the notation, I am 
not yet used to RDF stuff, I will certainly learn it discussion after 
discussion in this mailing list). This example is a very old example 
(at least 13 year sold) in the DL community; it is called "the little 
house" example.

Ontology:
the class WORKER is declared equivalent to the union of the classes 
EMPLOYEE and MANAGER:
WORKER = EMPLOYEE \or MANAGER

Database:
Paul rdf:type WORKER
Andrea rdf:type WORKER
Simon rdf:type EMPLOYEE
Caroline rdf:type MANAGER
Paul ns:has-friend Andrea
Paul ns:has-friend Simon
Simon ns:has-friend Andrea
Andrea ns:has-friend Caroline

The query:
Tell me the workers having a friend which is an employee, which in turn 
should have a friend which is a manager.
[I don't know the syntax you prefer; in prolog-like notation this would 
be:
  q(X) :- worker(X), has-friend(X,Y), employee(Y), has-friend(Y,Z), 
manager(Z).]

The answer is the set {Paul}.

To understand why Paul satisfies the query, you can first have a visual 
representation of the database (the "little house") - all the edges are 
of type "has-friend":

          Paul:WORKER
          /          \
         /             \
        V               V
Andrea:WORKER <- - Simon:EMPLOYEE
        |
        |
        V
Caroline:MANAGER

If you look for a path WORKER-EMPLOYEE-MANAGER, you don't find one. 
However, you can say that in any possible world Andrea is either a 
manager, or an employee, or both (since he/she is a worker, given the 
ontology). Therefore, in each possible world you can find that the node 
with Paul can be the root of a path WORKER-EMPLOYEE-MANAGER. So, Paul 
is in the answer set of the query.

The main thing that this simple example shows is that it is impossible 
to find a unique completion (a unique deductive closure) over which the 
query can be evaluated. There are *three* incompatible completions, 
i.e., none of them is minimal (i.e., none of them is included in all 
the others).
A real reasoning by cases mechanism should be employed to answer this 
query.

I don't know whether this helps the discussion.

cheers
--e.

Enrico Franconi                  - franconi@inf.unibz.it
Free University of Bozen-Bolzano - http://www.inf.unibz.it/~franconi/
Faculty of Computer Science      - Phone: (+39) 0471-016-120
I-39100 Bozen-Bolzano BZ, Italy  - Fax:   (+39) 0471-016-129

[1] Enrico Franconi and Sergio Tessaris. Rules and Queries with 
Ontologies: a Unified Logical Framework. Workshop on Principles and 
Practice of Semantic Web Reasoning (PPSWR-2004), St. Malo, France, 
2004; supported by the REWERSE and the CoLogNet European Networks of 
Excellence.
<http://www.inf.unibz.it/~franconi/papers/ppswr-04.pdf>
Received on Monday, 12 July 2004 17:55:32 UTC