Re: test cases for fromUnionQuery, please from Eric Prud'hommeaux on 2005-06-06 (public-rdf-dawg@w3.org from April to June 2005)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Mon, 6 Jun 2005 01:27:26 -0400
To: "Seaborne, Andy" <andy.seaborne@hp.com>
Cc: Dan Connolly <connolly@w3.org>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <20050606052726.GA7441@w3.org>
On Fri, Jun 03, 2005 at 06:02:48PM +0100, Seaborne, Andy wrote:
> 
> 
> 
> Eric Prud'hommeaux wrote:
> >On Wed, Jun 01, 2005 at 03:51:19PM +0100, Seaborne, Andy wrote:
> >
> >>
> >>
> >>Dan Connolly wrote:
> >><snip/>
> >>
> >>>Also, Andy, can you take TimBL's comment
> >>>http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2004Nov/0020.html
> >>>
> >>>and extract a test case that shows the difference
> >>>between the 12 Oct design and what TimBL is asking for?
> >>
> >>In the Oct 12 design, expressed in current language, the default graph 
> >>(Tim's
> >>default KB) is the RDF merge of the named graphs.
> >>
> >>Tim made two points:
> >>
> >>1/ There are many ways to create graphs - merge is just one of them.  His
> >>example 0 shows one that is the OWL closure of the merge.  This seems quite
> >>timely again with possible rules work in the air.
> >>
> >>2/ That the default to believing (Tim's word) all RDF and the merge of 
> >>the RDF
> >>is dangerous.  The example 2 is FOAF based.
> >>
> >>
> >>Example 0:
> >>
> >>G1.ttl:
> >>:mary foaf:phone   "1234" .
> >>
> >>G2.ttl
> >>:mary owl:sameAs :maryJ .
> >>
> >>
> >>then
> >>
> >>   SELECT ?phone WHERE { :maryJ foaf:phone ?phone }
> >>
> >>gives :maryJ has phone "1234" yet no graph made this claim.  G2 can 
> >>be injected by a third party.
> >>
> >>------------------------------
> >>
> >>Suppose we have two graphs that have been got through crawling.  One 
> >>happens to
> >>be out of date but that's the nature of FOAF:
> >>
> >>Named graph <G1>
> >>_:x foaf:name "Alice" .
> >>_:x foaf:mbox <mailto:alice@example.org> .
> >>_:x diet:preference "Vegetarian" .
> >>
> >>Named graph <G2>
> >>_:y foaf:name "Alice" .
> >>_:y foaf:mbox <mailto:alice@example.org> .
> >>_:y diet:preference "Vegan" .
> >>
> >>Let's try checking dietary preferences:
> >>
> >>ASK { [ foaf:name "Alice" ; diet:preference "Vegan" ] }
> >>
> >>=> (Oct 12 design) Yes
> >>
> >>=> (Tim asks for) No
> >
> >
> >Calling them "named graphs" above somewhat pre-decides the answer.
> >I think it's interesting to compare the respective expressivities
> >of both languages.  Both proposals allow you to ask either question.
> 
> Not really.
> 
> The test case is what happens when there is no description of the dataset 
> provided.  Can the publisher make available, as named graphs, some RDF 
> without that always being in the default graph?  The Oct-12 design says "no, 
> you can't".
> 
> The current design can express a wider range of datasets than the Oct-12 
> design. It can express the setup of the Oct-12 design if the publisher 
> wishes to.
> 
> >
> >In Source (12-Oct) parlance:
> >
> >  ASK
> > LOAD <G1> <G2>
> >WHERE { [ foaf:name "Alice" ; diet:preference "Vegan" ] }
> >=> Yes
> >
> >  ASK
> > LOAD <G1> <G2>
> >WHERE { GRAPH ?g { [ foaf:name "Alice" ; diet:preference "Vegan" ] } .
> >        FILTER ?g != <G1> && ?g != <G2> }
> >=> No
> 
> But this isn't the query in the test case because it has added the fact 
> that the client knows that there are named graphs present.

I think this test case is breaking down because it doesn't include the
LOAD and LOAD INTOs in the NamedGraph queries. My interpretation of
why the NamedGraphs <G1> and <G2> were in the dataset may not match
yours.
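
To make the two readings of the Alice test case concrete, here is a minimal
Python sketch. It is a toy model under stated assumptions, not the defined
semantics of either proposal: a graph is a set of (subject, predicate, object)
tuples, blank nodes are strings starting with "_:", and the helper names
merge_default and ask_alice_vegan are hypothetical.

```python
# Toy model: a graph is a set of (subject, predicate, object) tuples.
G1 = {
    ("_:x", "foaf:name", "Alice"),
    ("_:x", "foaf:mbox", "mailto:alice@example.org"),
    ("_:x", "diet:preference", "Vegetarian"),
}
G2 = {
    ("_:y", "foaf:name", "Alice"),
    ("_:y", "foaf:mbox", "mailto:alice@example.org"),
    ("_:y", "diet:preference", "Vegan"),
}
named = {"G1": G1, "G2": G2}


def merge_default(named_graphs):
    """12-Oct reading: the default graph is the RDF merge of the named graphs.

    Blank node subjects are renamed per graph so they stay distinct.
    """
    merged = set()
    for name, graph in named_graphs.items():
        for s, p, o in graph:
            if s.startswith("_:"):
                s = f"{s}@{name}"
            merged.add((s, p, o))
    return merged


def ask_alice_vegan(graph):
    """ASK { [ foaf:name "Alice" ; diet:preference "Vegan" ] }"""
    alices = {s for s, p, o in graph if (p, o) == ("foaf:name", "Alice")}
    return any((s, "diet:preference", "Vegan") in graph for s in alices)


oct12_default = merge_default(named)  # publisher has no say in the matter
publisher_default = set()             # publisher put nothing in the default graph

print(ask_alice_vegan(oct12_default))      # True
print(ask_alice_vegan(publisher_default))  # False
```

The only difference between the two answers is who decides what goes into the
default graph: the merge rule, or the publisher.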

>                                                             It is when a 
> query is sent to a service that is executing against the dataset as 
> provided - does the client have to discover that the information in the 
> default graph is also in a named graph before it can understand the results 
> from a query which uses GRAPH at all.
                                     ?   (assume that was a question)

No more than the client would have to with NamedGraphs.

Using the use cases in
  <http://www.w3.org/mid/20050603095342.GL1967@w3.org>
I've looked at two parameters:
  (TRUSTED) whether the default/aggregate graph is known to be
  trusted.
  (LOAD INTO) whether the query loads untrusted graphs.

-TRUSTED -LOAD INTO:
-TRUSTED +LOAD INTO:
Both NamedGraphs and Source must explicitly list the trusted sources
of info.

+TRUSTED -LOAD INTO:
Both NamedGraphs and Source do a query with no provenance constraints.

+TRUSTED +LOAD INTO:

If there is a single trust domain, NamedGraphs can take advantage of
the trusted default graph while Source must explicitly construct
constraints ruling out the arguments to LOAD INTO.
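
As a rough illustration of the +TRUSTED +LOAD INTO case, the Python sketch
below models quads as (graph, subject, predicate, object) tuples. The graph
names "T" and "U", the sample data, and the function names are all assumptions
made for the example; neither proposal defines them.

```python
# Toy model: quads are (graph_name, subject, predicate, object).
TRUSTED = [
    ("T", "_:a", "foaf:name", "Alice"),
    ("T", "_:a", "diet:preference", "Vegetarian"),
]
UNTRUSTED = [
    ("U", "_:b", "foaf:name", "Alice"),
    ("U", "_:b", "diet:preference", "Vegan"),
]


def preferences(quads, name):
    """Diet preferences of anyone called `name`, matched within one graph."""
    found = set()
    for g, s, p, o in quads:
        if p == "foaf:name" and o == name:
            found |= {o2 for g2, s2, p2, o2 in quads
                      if (g2, s2, p2) == (g, s, "diet:preference")}
    return found


def named_graphs_query(trusted, loaded_into, name):
    # LOAD INTO leaves the default graph alone, so the loaded quads are
    # ignored here and only the trusted data is consulted.
    return preferences(trusted, name)


def source_query(trusted, loaded, name, exclude):
    # LOAD adds everything to the aggregate, so the querier has to rule
    # out the untrusted graphs explicitly.
    aggregate = trusted + loaded
    kept = [q for q in aggregate if q[0] not in exclude]
    return preferences(kept, name)


print(sorted(named_graphs_query(TRUSTED, UNTRUSTED, "Alice")))          # ['Vegetarian']
print(sorted(source_query(TRUSTED, UNTRUSTED, "Alice", exclude=set()))) # ['Vegan', 'Vegetarian']
print(sorted(source_query(TRUSTED, UNTRUSTED, "Alice", exclude={"U"}))) # ['Vegetarian']
```

The exclude argument plays the role of the explicit constraints that Source
has to construct, while the NamedGraphs variant gets the same answer with none.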


These all model the goals as ending when the query ends. That is, a
LOAD should have no persistent side effects* (especially in a trusted
graph). If it does have persistent side effects, it may clarify our
thinking to consider it an update protocol.


* I plan to weasel around this rule in a semantic aggregator app with
the excuse "You can't know that I hadn't already loaded that graph."
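
One way to picture "no persistent side effects" is a service that runs each
query against a private copy of its store. The Python sketch below is only
that picture; the Service class, its query method, and the sample data are
hypothetical and not part of any proposal.

```python
import copy


class Service:
    """Hypothetical query service; `store` maps graph names to sets of triples."""

    def __init__(self, store):
        self.store = store

    def query(self, load, ask):
        # Work on a per-query copy so LOADed graphs vanish when the query ends.
        dataset = copy.deepcopy(self.store)
        for name, triples in load.items():
            dataset.setdefault(name, set()).update(triples)
        return ask(dataset)


def has_vegan(dataset):
    """True if any graph in the dataset holds a triple with object "Vegan"."""
    return any(o == "Vegan" for g in dataset.values() for _, _, o in g)


service = Service({"trusted": {("_:a", "foaf:name", "Alice")}})
extra = {"G2": {("_:y", "diet:preference", "Vegan")}}

print(service.query(extra, has_vegan))  # True: visible during the query
print(service.query({}, has_vegan))     # False: nothing persisted
print(sorted(service.store))            # ['trusted']
```

A service that wrote the loaded graphs back into its store would instead be
behaving like an update protocol.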


> >and in Named Graphs parlance:
> >
> >  ASK
> > LOAD <G1> <G2>
> >WHERE { [ foaf:name "Alice" ; diet:preference "Vegan" ] }
> >=> Yes
> >
> >  ASK
> > LOAD INTO <G1> <G2>
> >WHERE { GRAPH ?g { [ foaf:name "Alice" ; diet:preference "Vegan" ] } }
> >=> No
> >
> >
> >>Tim is asking that this return nothing unless the publisher has decided 
> >>to put that information in the default graph.  The publisher is to be 
> >>made responsible for claims in the default graph just like a plain graph 
> >>and HTTP GET.
> >>
> >>	Andy
> >
> >
> >That's a very interesting point. In Source, the querier needs to be
> >able to specify a trust domain, maybe by enumerating a specific set
> >of inclusions or exclusions, or maybe something more complicated
> >involving data in some trusted document and some logical connection
> >expressible in a FILTER.
> >
> >The effect of having a LOAD INTO that specifically does *not* affect
> >the default graph (contrary to my aggregator needs),
> 
> In the Oct12 design there was no way to have a named graph without the 
> triples also appearing in the default graph.
> 
> 	Andy
> 
> > is that queriers
> >that use no LOAD directives get a default graph that could have some
> >sort of endorsed status. By default, the user is basically protected
> >from the data that they loaded, whereas in Source, they would have to
> >specifically enumerate certain documents in or out.
> >
> >Tim's use case was pretty sketchy; I'm curious about practical use
> >cases of this (the use cases that I have experience with don't need
> >it). Maybe social environment is part of these use cases.
> >
> >
> >><snip/>
> >
> >
> 

-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Monday, 6 June 2005 05:27:26 UTC