Re: querying higher logic with graph query from Ian Horrocks on 2004-01-26 (www-rdf-rules@w3.org from January 2004)

From: Ian Horrocks <horrocks@cs.man.ac.uk>
Date: Mon, 26 Jan 2004 09:40:49 +0000
To: Martin Duerst <duerst@w3.org>
Cc: "Eric Prud'hommeaux" <eric@w3.org>, www-rdf-rules@w3.org
Message-ID: <16404.57505.775620.589654@merlin.horrocks.net>
On January 25, Martin Duerst writes:
> 
> At 10:58 04/01/22 +0000, Ian Horrocks wrote:
> 
> >1. Maybe I am misunderstanding, but it seems that what is being
> >discussed here is simply a syntax for querying.
> 
> Yes, we discussed syntax for querying, and for the results.
> 
> >Is it the intention
> >that the semantics of query answers would depend on the semantics of
> >the system being queried, and would not be considered by the proposed
> >working group?
> 
> I think it may be neither one nor the other. I think some basic
> semantics have to be defined, otherwise there is no interoperability.
> However, I also think that a lot of differences can be absorbed
> by the fact that we are dealing with querying. The system queried
> should produce consistent results (as long as there are no changes
> to the data), but what these results are depends on the data stored.
> Some systems may e.g. store data in a form that allows them to
> simulate an infinite number of triples, and so on.

I don't know what you mean by consistent in this context, but this
already seems contentious if it means, e.g., that a set of results
must always be returned in the same order.


> >2. I can imagine important kinds of query that don't fit so easily
> >into the structure that you are proposing, in particular aggregation
> >queries such as "how many instances are there of class C?". This
> >cannot, in general, be answered simply by counting the instances of C
> >that are returned by a "type" query - in OWL ontologies, for example,
> >it may be possible to deduce the existence of individuals, and perhaps
> >even to count them, without being able to name some or all of them
> >(not to mention the fact that trying to answer the query in this way
> >could be very costly if the number is large). I'm not saying that one
> >couldn't imagine an RDF based syntax for such queries, just that it may
> >need to be more complex than simply finding bindings for graph
> >fragments.
> 
> Collecting use cases with such examples seems to be the right
> way to move forward. The question "how many" seems to be a
> very good example of a use case.
> 
> 
> >3. There are other related issues such as how to answer queries like
> >"return all the parents of John". We may know that there are two such
> >individuals, and even have additional information about them such as
> >their gender, but we may not be able to name them. Simply giving an
> >empty answer seems to be disingenuous. Some use of bnodes may be
> >possible, but care is required as e.g. an OWL ontology may entail the
> >existence of an infinite number of anonymous individuals.
> 
> 'return all the parents of John' I guess would typically be formulated
> as 'return all the triples where X is either an object or a subject,
> and X is a parent of John'. In that case, bnodes should do the job.
> If there is an infinite number of answers, and all of them are
> inherently interesting, then I don't see the problem of asking for
> them. Implementations should be able to deal with such things
> (in the sense at least that they use some implementation-defined
> limits to avoid what becomes a denial-of-service issue). This
> is not much different from many other technologies: Web pages
> can be of 'infinite' length (easy to do with a cgi script),
> programs can go into infinite loops, big databases can give back
> virtually 'infinite' results, and so on. Of course, if the question
> 'how many' is an important use case, then we might get to
> something that can answer 'an infinite number'. As you say,
> some systems may be able to deduce that (in finite time),
> and some users may be interested in such an answer.

The size of the answer isn't the only issue - there is also the
question of whether it makes sense to return multiple (never mind
infinite) bnodes in an answer given that the semantics of bnodes means
that every such answer is in some sense the same.
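
To make the bnode point concrete, here is a minimal sketch in Python. It
assumes each answer is an isolated single-triple graph and that bnodes are
written with a "_:" prefix; the names is_bnode, canonical and the sample
data are hypothetical, not part of any proposal discussed here. Under
those assumptions, answers that differ only in their bnode labels collapse
to a single answer.

```python
def is_bnode(term):
    """Assumption: bnodes are strings that start with '_:'."""
    return term.startswith("_:")


def canonical(triple):
    """Rename the bnodes of one triple in order of first occurrence.

    Two single-triple answers that differ only in their bnode labels
    get the same canonical form.
    """
    renaming = {}
    out = []
    for term in triple:
        if is_bnode(term):
            # The new label is fixed the first time a bnode is seen.
            renaming.setdefault(term, "_:%d" % len(renaming))
            out.append(renaming[term])
        else:
            out.append(term)
    return tuple(out)


# Hypothetical answers to "return all the parents of John".
answers = [
    ("_:b1", "parentOf", "John"),
    ("_:b2", "parentOf", "John"),
    ("Mary", "parentOf", "John"),
]

distinct = {canonical(t) for t in answers}
```

Here distinct holds only two entries, one named and one anonymous. The
sketch deliberately ignores any further statements about the same bnode
(e.g. gender), which could make two anonymous answers distinguishable.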

Moreover, (possibly infinite) anonymous answers are only one example of
the general problem. Another example is the case where the system can
deduce that the answer to a query is either "Peter" or "John", but has
no way to determine which one. The point is that simply returning
instantiated graphs may be too weak to capture query responses for
more expressive languages.
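
A minimal sketch of the "Peter or John" case, in Python. It assumes the
system's knowledge can be enumerated as a small set of possible worlds
(models); the predicate name "culprit" and the function names are
hypothetical. An answer is certain only if it holds in every model, so
neither name can be returned as a binding even though the disjunction
itself holds in every model.

```python
# Each model is one possible world consistent with what is known:
# the value is Peter or John, but we cannot tell which.
models = [
    {("culprit", "Peter")},
    {("culprit", "John")},
]


def values(model, predicate):
    """Values bound to `predicate` in a single model."""
    return {obj for (pred, obj) in model if pred == predicate}


def certain_answers(models, predicate):
    """Values that hold in every model."""
    return set.intersection(*(values(m, predicate) for m in models))


def possible_answers(models, predicate):
    """Values that hold in at least one model."""
    return set.union(*(values(m, predicate) for m in models))


certain = certain_answers(models, "culprit")
possible = possible_answers(models, "culprit")

# The disjunction "Peter or John" is true in every model.
disjunction_entailed = all(
    values(m, "culprit") & {"Peter", "John"} for m in models
)
```

Here certain is empty while disjunction_entailed is True: a response made
up only of instantiated graphs has no way to carry the second fact.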

Ian

> 
> Regards,   Martin.
Received on Monday, 26 January 2004 04:45:23 UTC