RE: Subgraph results counter-example(s) from Seaborne, Andy on 2004-06-09 (public-rdf-dawg@w3.org from April to June 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Wed, 9 Jun 2004 10:29:26 +0100
To: Rob Shearer <Rob.Shearer@networkinference.com>, public-rdf-dawg@w3.org
Message-ID: <E864E95CB35C1C46B72FEA0626A2E80803615421@0-mail-br1.hpl.hp.com>

Rob,

I now understand the matter better - the "they are connected but by unknown
means" is a strong argument.  I agree with removing the text "that the query
matches" that was in the document.

I think we should avoid "explanation" but I didn't think that was the
intention anyway.  Things I think are important are:

1/ that queries can be considered to execute by RDF graph pattern matching.

This is a building block requirement: it helps with the charter item "2.2
RDF Rules".

2/ one return format where part of the larger queried graph is returned.

This is two things: the need for a "tell me about <thing described by
pattern or just URI>" query and the ability to get results as RDF graphs
which are part of the information in the large, remote knowledge base.

The current text is acceptable to me.  The variant is more explicit about
the derived information.

"""
3.4 Subgraph Results
It must be possible for query results to be returned as a subgraph of the
original queried graph.

Status: Pending.

3.4a Subgraph Results (variant)
It must be possible to select an entailed subgraph of a queried graph, in
which case the query results are an RDF graph.
"""

(it removes the "that the query matched" part which I agree is unhelpful).

Discussion in:
http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0484.html

	Andy


More inline:

-------- Original Message --------
> From: Rob Shearer <>
> Date: 8 June 2004 15:48
> 
> The current requirement says to return the subgraph "that the query
> matches", which suggests that any information used to determine whether
> solutions should be returned is part of the result.
> 
> The most trivial problems occur when the query contains disjunction:
> "find everyone who is friends with Eddie or friends with Jane". If
> someone is friends with both of these people, what results should be
> returned? Either a query processor is required to follow all disjunction
> even though it's redundant (and return both the triples if they both
> exist), or the results are nondeterministic.

??

As it asked for "find everyone", it is a variable-bindings result with
projection to one variable:

SELECT ?x WHERE (?x friend (:Eddie OR :Jane))

(or other flavour of query with explicit union)
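
A minimal sketch of that reading, assuming a toy in-memory graph of
(subject, predicate, object) tuples. The data and the helper name
friends_of_any are made up for illustration and are not part of any
proposed syntax. Because the result is a set of bindings for ?x, someone
who is friends with both Eddie and Jane simply appears once:

```python
# Toy graph as (subject, predicate, object) tuples; all names are illustrative.
GRAPH = {
    ("alice", "friend", "Eddie"),
    ("alice", "friend", "Jane"),
    ("bob", "friend", "Jane"),
    ("carol", "friend", "Dave"),
}


def friends_of_any(graph, targets):
    """Bind ?x in (?x friend ?t) for any ?t in targets, then project to ?x."""
    return {s for (s, p, o) in graph if p == "friend" and o in targets}


# alice matches both branches of the disjunction but is reported once.
result = friends_of_any(GRAPH, {"Eddie", "Jane"})
print(sorted(result))
```

Which branch of the disjunction matched is not part of the answer, so the
redundancy question does not come up for this result form.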

I don't view this requirement as having anything to do with explanation so I
wouldn't approach it as a subgraph query.

> 
> The more serious issue is that such a requirement can completely kill
> the use of "supplementary" information about RDF graphs. OWL, SWRL, or
> other languages can encode information which allows you to derive the
> fact that Eddie is friends with Jane. This may be the result of
> interaction between dozens or hundreds of axioms. If some of these
> axioms are encoded in RDF, are we supposed to return them?

But if the query is (Eddie friend ?x) then "Eddie friend Jane" is a
returnable triple.  It's not explanation but derived facts.
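
To illustrate the distinction, here is a rough sketch under stated
assumptions: the "knows" property and the single rule "knows implies
friend" are hypothetical stand-ins for whatever OWL/SWRL axioms are really
in play, and the rule is deliberately held outside the graph. The answer
contains only the derived triple, not the rule that produced it:

```python
# Asserted triples; "knows" and the rule below are hypothetical.
ASSERTED = {("Eddie", "knows", "Jane")}

# Supplementary information kept outside the graph: knows implies friend.
SUBPROPERTY = {"knows": "friend"}


def entailed(asserted, subproperty):
    """Asserted triples plus those derived by one subproperty step."""
    derived = {
        (s, subproperty[p], o) for (s, p, o) in asserted if p in subproperty
    }
    return asserted | derived


def match(graph, subject, predicate):
    """Return the triples matching (subject predicate ?x) as a graph."""
    return {(s, p, o) for (s, p, o) in graph if s == subject and p == predicate}


# The result is the derived fact only; the rule itself is not returned.
answer = match(entailed(ASSERTED, SUBPROPERTY), "Eddie", "friend")
print(answer)
```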

> What about
> the supplementary information which is not in RDF? What about such
> information which does have an RDF representation but also has other
> formats, and one of those other formats was used?
> 
> Worse, such other languages can encode disjunction in the language
> itself: one can assert that someone is friends with either Eddie or Jane
> without specifying which one. Now there isn't even a final "result"
> triple that can be returned in the result; *now* do we have to go back
> and try to figure out what bits of supplementary information need to be
> returned?

This is the argument that, for me, hits the spot.  This is problematic
without bArcs and there may well be worse cases as well where bArcs aren't
enough.

Example query:
	<x> ?p <y>

and we can know they are related but not how (we can't instantiate ?p to an
RDF graph arc, let alone a URIref).

If the query is "are <x> and <y> related?" the answer is "yes"
If the query is "how are <x> and <y> related?" the answer is "unknown"
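
One way to picture this, as a sketch only: model the knowledge base as the
set of graphs ("worlds") consistent with what it asserts, and treat
something as entailed when it holds in every world. The two worlds and the
predicate names p1 and p2 are assumptions for illustration; nothing here
is a proposal for how an implementation should work.

```python
# Each "world" is one graph consistent with what the knowledge base asserts.
# Here the base only says that <x> is linked to <y> by p1 or by p2.
WORLDS = [
    {("x", "p1", "y")},
    {("x", "p2", "y")},
]


def related(worlds, s, o):
    """True if every world has some arc from s to o."""
    return all(any(ts == s and to == o for (ts, _, to) in w) for w in worlds)


def how_related(worlds, s, o):
    """Predicates linking s to o in every world; empty set means unknown."""
    per_world = [{p for (ts, p, to) in w if ts == s and to == o} for w in worlds]
    return set.intersection(*per_world) if per_world else set()


print(related(WORLDS, "x", "y"))      # the "are they related?" question
print(how_related(WORLDS, "x", "y"))  # the "how?" question
```

The first call succeeds while the second yields no binding for ?p, so
there is no triple to put in a subgraph result.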

> 
> This also clashes with a few other objectives, including the querying
> for non-existent results. (Does *every* triple affect this?)
> 
> The problem is that this requirement as written introduces the notion of
> "explanation", which is actually a *very* hard problem in general, and
> is almost always more information than the user really wanted.
Received on Wednesday, 9 June 2004 05:31:52 UTC