Re: querying higher logic with graph query from Martin Duerst on 2004-01-25 (www-rdf-rules@w3.org from January 2004)

From: Martin Duerst <duerst@w3.org>
Date: Sun, 25 Jan 2004 13:18:59 -0500
To: Ian Horrocks <horrocks@cs.man.ac.uk>
Cc: "Eric Prud'hommeaux" <eric@w3.org>, www-rdf-rules@w3.org
Message-Id: <4.2.0.58.J.20040125125640.076a2f40@localhost>

At 10:58 04/01/22 +0000, Ian Horrocks wrote:

>1. Maybe I am misunderstanding, but it seems that what is being
>discussed here is simply a syntax for querying.

Yes, we discussed syntax for querying, and for the results.

>Is it the intention
>that the semantics of query answers would depend on the semantics of
>the system being queried, and would not be considered by the proposed
>working group?

I think it may be neither one nor the other. I think some basic
semantics have to be defined, otherwise there is no interoperability.
However, I also think that a lot of differences can be absorbed
by the fact that we are dealing with querying. The system queried
should produce consistent results (as long as there are no changes
to the data), but what these results are depends on the data stored.
Some systems may e.g. store data in a form that allows them to
simulate an infinite number of triples, and so on.

>2. I can imagine important kinds of query that don't fit so easily
>into the structure that you are proposing, in particular aggregation
>queries such as "how many instances are there of class C?". This
>cannot, in general, be answered simply by counting the instances of C
>that are returned by a "type" query - in OWL ontologies, for example,
>it may be possible to deduce the existence of individuals, and perhaps
>even to count them, without being able to name some or all of them
>(not to mention the fact that trying to answer the query in this way
>could be very costly if the number is large). I'm not saying that one
>couldn't imagine an RDF based syntax for such queries, just that it may
>need to be more complex than simply finding bindings for graph
>fragments.

Collecting use cases with such examples seems to be the right
way to move forward. The question "how many" seems to be a
very good example of a use case.
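
To make the use case concrete, here is a minimal sketch in Python of why
counting the results of a "type" query can differ from a dedicated
"how many" query. Everything in it is an assumption made for illustration:
`TinyStore`, `instances_of` and `min_count` are hypothetical names, not
part of any existing system, and the deduced lower bound is simply given
as input rather than computed by a reasoner.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class TinyStore:
    """Hypothetical toy store: named instances plus deduced lower bounds."""
    # Instances per class that the store can actually name.
    named: Dict[str, Set[str]] = field(default_factory=dict)
    # Minimum number of instances per class that the system has deduced
    # (assumed to be supplied by some reasoner; not computed here).
    deduced_min: Dict[str, int] = field(default_factory=dict)

    def instances_of(self, cls: str) -> Set[str]:
        """A 'type' query: returns only the instances that have names."""
        return set(self.named.get(cls, set()))

    def min_count(self, cls: str) -> int:
        """A 'how many' query: at least this many instances exist."""
        return max(len(self.instances_of(cls)), self.deduced_min.get(cls, 0))


store = TinyStore(named={"ex:Parent": {"ex:Mary"}},
                  deduced_min={"ex:Parent": 2})
print(len(store.instances_of("ex:Parent")))  # counts named instances only
print(store.min_count("ex:Parent"))          # uses the deduced lower bound
```

In this sketch the two answers differ (one named instance, but at least
two instances known to exist), which is the gap the use case would need
to describe.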

>3. There are other related issues such as how to answer queries like
>"return all the parents of John". We may know that there are two such
>individuals, and even have additional information about them such as
>their gender, but we may not be able to name them. Simply giving an
>empty answer seems to be disingenuous. Some use of bnodes may be
>possible, but care is required as e.g. an OWL ontology may entail the
>existence of an infinite number of anonymous individuals.

'return all the parents of John' I guess would typically be formulated
as 'return all the triples where X is either an object or a subject,
and X is a parent of John'. In that case, bnodes should do the job.
If there is an infinite number of answers, and all of them are
inherently interesting, then I don't see a problem with asking for
them. Implementations should be able to deal with such things
(in the sense at least that they use some implementation-defined
limits to avoid what becomes a denial-of-service issue). This
is not much different from many other technologies: Web pages
can be of 'infinite' length (easy to do with a cgi script),
programs can go into infinite loops, big databases can give back
virtually 'infinite' results, and so on. Of course, if the question
'how many' is an important use case, then we might get to
something that can answer 'an infinite number'. As you say,
some systems may be able to deduce that (in finite time),
and some users may be interested in such an answer.
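
The implementation-defined limit mentioned above could look roughly like
the following Python sketch. It is only an illustration under stated
assumptions: `ancestor_triples` stands in for a hypothetical system that
can produce an unbounded stream of anonymous individuals (labelled in the
style of bnodes), and `answer_with_limit` is an assumed helper name, not
an existing API.

```python
from itertools import count, islice
from typing import Iterator, List, Tuple

Triple = Tuple[str, str, str]


def ancestor_triples(start: str) -> Iterator[Triple]:
    """Endless stream: each individual has a parent that is an anonymous node."""
    child = start
    for i in count():
        parent = f"_:b{i}"  # bnode-style label, hypothetical naming scheme
        yield (parent, "ex:parentOf", child)
        child = parent


def answer_with_limit(results: Iterator[Triple],
                      limit: int) -> Tuple[List[Triple], bool]:
    """Return at most `limit` results, plus a flag saying whether more exist."""
    # Pull one extra item so we can tell "exactly limit" from "more than limit".
    batch = list(islice(results, limit + 1))
    return batch[:limit], len(batch) > limit


triples, truncated = answer_with_limit(ancestor_triples("ex:John"), limit=3)
for t in triples:
    print(t)
print("truncated:", truncated)
```

The flag lets a client distinguish a complete answer from one that was
cut off by the limit, without the server ever having to enumerate an
infinite result.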

Regards,   Martin.

Received on Sunday, 25 January 2004 18:59:11 UTC