Re: querying higher logic with graph query from Martin Duerst on 2004-01-26 (www-rdf-rules@w3.org from January 2004)

From: Martin Duerst <duerst@w3.org>
Date: Mon, 26 Jan 2004 10:35:40 -0500
To: Ian Horrocks <horrocks@cs.man.ac.uk>
Cc: "Eric Prud'hommeaux" <eric@w3.org>, www-rdf-rules@w3.org
Message-Id: <4.2.0.58.J.20040126101620.02b995e8@localhost>
At 09:40 04/01/26 +0000, Ian Horrocks wrote:

>On January 25, Martin Duerst writes:
> >
> > At 10:58 04/01/22 +0000, Ian Horrocks wrote:

> > >Is it the intention
> > >that the semantics of query answers would depend on the semantics of
> > >the system being queried, and would not be considered by the proposed
> > >working group?
> >
> > I think it may be neither the one nor the other. I think some basic
> > semantics have to be defined, otherwise there is no interoperability.
> > However, I also think that a lot of differences can be absorbed
> > by the fact that we are dealing with querying. The system queried
> > should produce consistent results (as long as there are no changes
> > to the data), but what these results are depends on the data stored.
> > Some systems may e.g. store data in a form that allows them to
> > simulate an infinite number of triples, and so on.
>
>I don't know what you mean by consistent in this context, but this
>already seems contentious if it means, e.g., that a set of results
>must always be returned in the same order.

Good example. That's why I was saying that some basic semantics
need to be defined. If, as in XML, order is important, then
consistency would mean that results get returned in the same
order. If, as in RDF, order is not important, then consistency
would mean that the same set of results is returned, even if
not in the same order.
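
As a rough illustration of the two notions of consistency (a sketch
only; the function names and the sample results are made up for this
example and do not come from any query language):

```python
def consistent_ordered(first, second):
    """Order matters (XML-like): two runs must match item by item."""
    return list(first) == list(second)


def consistent_unordered(first, second):
    """Order does not matter (RDF-like): the same set of results suffices."""
    return set(first) == set(second)


# Two runs of the same query against unchanged data (hypothetical results).
run_a = [("John", "hasParent", "Mary"), ("John", "hasParent", "Peter")]
run_b = [("John", "hasParent", "Peter"), ("John", "hasParent", "Mary")]

print(consistent_ordered(run_a, run_b))
print(consistent_unordered(run_a, run_b))
```

Which of the two checks applies is exactly the kind of basic semantics
that would have to be defined up front.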


> > 'return all the parents of John' I guess would typically be formulated
> > as 'return all the triples where X is either an object or a subject,
> > and X is a parent of John'. In that case, bnodes should do the job.
> > If there is an infinite number of answers, and all of them are
> > inherently interesting, then I don't see the problem of asking for
> > them. Implementations should be able to deal with such things
> > (in the sense at least that they use some implementation-defined
> > limits to avoid what becomes a denial-of-service issue). This
> > is not much different from many other technologies: Web pages
> > can be of 'infinite' length (easy to do with a cgi script),
> > programs can go into infinite loops, big databases can give back
> > virtually 'infinite' results, and so on. Of course, if the question
> > 'how many' is an important use case, then we might get to
> > something that can answer 'an infinite number'. As you say,
> > some systems may be able to deduce that (in finite time),
> > and some users may be interested in such an answer.
>
>The size of the answer isn't the only issue - there is also the
>question of whether it makes sense to return multiple (never mind
>infinite) bnodes in an answer given that the semantics of bnodes means
>that every such answer is in some sense the same.

If it's just the bnodes themselves that are returned, then that
doesn't look like it makes much sense. But my guess is that in
many cases, this would just be the result of a badly thought-through
query, or lack of relevant data, rather than a problem in the overall
system. But maybe you can think of cases where that's not the
case.

On the other hand, returning bnodes with properties and values attached
to them makes a lot of sense and may in many cases be exactly what
is expected. Also, having the same bnode show up twice may be
important, because it shows how the data is connected.
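
To make the last point concrete, here is a small sketch. It assumes a
result is just a list of (subject, predicate, object) tuples and that
bnodes are written with a "_:" prefix; the ex: names and the helper
function are invented for illustration.

```python
from collections import defaultdict

# Hypothetical query result; "_:b1" is a bnode label local to this result.
result = [
    ("ex:John", "ex:hasParent", "_:b1"),
    ("_:b1", "ex:name", "Mary"),
    ("ex:Anne", "ex:hasParent", "_:b1"),
]


def bnode_occurrences(triples):
    """Map each bnode label to the triples in which it appears."""
    seen = defaultdict(list)
    for triple in triples:
        subject, _, obj = triple
        # Use a set so a triple with the same bnode as subject and object
        # is only recorded once.
        for term in {subject, obj}:
            if term.startswith("_:"):
                seen[term].append(triple)
    return dict(seen)


shared = bnode_occurrences(result)
print(len(shared["_:b1"]))
```

Here the bnode alone says nothing, but the fact that the same bnode
appears in all three triples tells us that John and Anne share a
parent, and that this parent is named Mary.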

In addition, bnodes are often used in certain modeling techniques,
e.g. to simulate the lack of properties on literals.


>Moreover, (possibly infinite) anonymous answers is only one example of
>the general problem. Another example is the case where the system can
>deduce that the answer to a query is either "Peter" or "John", but has
>no way to determine which one. The point is that simply returning
>instantiated graphs may be too weak to capture query responses for
>more expressive languages.

Again, very good material for a use case document.

I'm purely guessing here, but my answer would be as follows:
If this 'more expressive language' is expressed in RDF, then
it should be possible to again express the result in an RDF
graph. If you are able, in this language, to express the
'fact' that some property is either "Peter" or "John", then
you should also be able to express that the result of a query
is either "Peter" or "John". The alternative construct in
RDF (rdf:Alt) already does this, or may come close.
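
A minimal sketch of what such a result graph might look like, assuming
the alternative construct meant here is the rdf:Alt container with its
rdf:_1, rdf:_2, ... membership properties. The ex: names, the bnode
label, and the helper function are hypothetical, and whether rdf:Alt
really carries the intended "either/or" meaning is the open question.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def alternatives(subject, predicate, choices, bnode="_:alt"):
    """Build triples saying that `predicate` of `subject` is one of `choices`.

    The choices hang off a bnode typed as rdf:Alt, one per rdf:_n property.
    """
    triples = [
        (subject, predicate, bnode),
        (bnode, RDF + "type", RDF + "Alt"),
    ]
    for index, choice in enumerate(choices, start=1):
        triples.append((bnode, f"{RDF}_{index}", choice))
    return triples


# Hypothetical answer: the system knows it is Peter or John, but not which.
answer = alternatives("ex:query1", "ex:answer", ["Peter", "John"])
for triple in answer:
    print(triple)
```

The result is still an ordinary RDF graph, so it can be returned the
same way as any other answer.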

It may be that there are languages where certain things can
be deduced, but not expressed; if this is the case, it would
be good to get examples.


Regards,  Martin.
Received on Monday, 26 January 2004 10:35:59 UTC