Re: New Proposal (6.1) for GRAPHS from Arnaud Le Hors on 2012-04-04 (public-rdf-wg@w3.org from April 2012)

From: Arnaud Le Hors <lehors@us.ibm.com>
Date: Tue, 3 Apr 2012 22:09:31 -0700
To: Sandro Hawke <sandro@w3.org>
Cc: public-rdf-wg <public-rdf-wg@w3.org>
Message-ID: <OF82A71699.7131FDB6-ON882579D6.001AED1D-882579D6.001C571D@us.ibm.com>
Hi Sandro,
I have to say that my expectation was similar to Charles's. I guess it's a 
matter of deciding whether <u1> { <a> <b> <c>  } defines the <u1> graph in 
its entirety, as containing one triple, or merely states that the triple 
<a> <b> <c>  is part of graph <u1>.

I'm not saying it should be the latter rather than the former, just that 
it's not obvious.
See below for more on that.

Sandro Hawke <sandro@w3.org> wrote on 04/02/2012 05:57:13 PM:

> From: Sandro Hawke <sandro@w3.org>
> To: Charles Greer <cgreer@marklogic.com>, 
> Cc: Charles Greer <Charles.Greer@marklogic.com>, public-rdf-wg 
> <public-rdf-wg@w3.org>
> Date: 04/02/2012 05:57 PM
> Subject: Re: New Proposal (6.1) for GRAPHS
> 
> On Mon, 2012-04-02 at 14:00 -0700, Charles Greer wrote:
> > Thanks for responding Sandro.  I think that what I'm finding difficult,
> > or at least a significant departure from RDF as I have understood it in
> > the past, is that this TRIG document
> > 
> > <u1> { <a> <b> <c> . <d> <e> <f> }
> > 
> > is not equivalent to these n-quads:
> > 
> > <a> <b> <c> <u1>.
> > <d> <e> <f> <u1>.
> > 
> > Or rather, you now need a document structure around n-quads as well in
> > order to provide the context in which rdf knows that these triples, and
> > only these triples, constitute the graph <u1>.
> > 
> > I had previously thought that RDF was a data model that didn't need any
> > notion of 'document' to work.  I'm not sure how another assertion that
> > 
> > { <u1> a rdf:Graph }
> > 
> > can assert the boundaries of <u1> unless either the { } syntax does more
> > than it appears to, or the document is a harder scope boundary than I
> > would have expected.  If the document has some relationship to scope, I
> > think that should be made explicit.
> 
> Two main points:
> 
> 1.  That rdf:Graph declaration is a different thing.  It changes how <u1>
> relates to the graph, but in a semantic (not syntactic) way.  It can be
> in a different document, or deduced by the use of some predicates, or
> known a priori by a data consumer.  Knowing it entitles the consumer to
> see that <u1> actually identifies the graph directly, rather than just
> being a label for the graph.     This might matter if we also know <u1>
> dc:licence ...SomeLicensingTerms....   Is it the graph that's licensed,
> or something else?     There are some use cases that suggest this
> distinction is important, but if it turns out not to be, it's not bad,
> people will just not use rdf:Graph declarations much.
> 
> 2.  Whether or not your trig example and your n-quads example are
> equivalent depends on your reading of n-quads.   This extends to your
> reading of SPARQL as well.     My understanding is people are somewhat
> informal about this, but they generally do expect that once they've seen
> the whole trig file, or the whole n-quads file, or searched the whole
> SPARQL end point, that they've seen all the triples in the graph with
> that name/label.
> 
> As a social test case, we could tell people this SPARQL query is run:
> 
>     SELECT ?s ?p ?o
>     WHERE { GRAPH <http://g1.example.org> { ?s ?p ?o } }
> 
> and that we got three result bindings back: 
> 
>     ?s  ?p  ?o
>     === === ===
>     <a> <b> 1.
>     <a> <b> 2.
>     <a> <b> 3.
> 
> Then we ask them: "According to this query, how many triples are in the
> graph known to that endpoint as 'http://g1.example.org' ?"
> 
> What do you think they'll say?
> 
> I think most folks will say, "Three", even if you ask them to think
> again and be pedantically precise.
> 

I agree that's what they would say, but primarily because you said "in the
graph known to that endpoint".
That is a critical element which isn't apparent in a bare statement like:

<u1> { <a> <b> <c> . <d> <e> <f> }

That statement by itself says nothing about where it comes from or whether
it's complete.
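
To make the two readings concrete, here is a rough sketch in Python. It is
only an illustration under my own assumptions: the names merge_partial and
merge_complete are hypothetical, nothing here comes from the proposal text,
and a dataset is simply modeled as a mapping from graph label to a set of
triples. The two documents are the ones from your earlier merge example.

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Dataset = Dict[str, FrozenSet[Triple]]  # graph label -> triples (assumed model)


def merge_partial(d1: Dataset, d2: Dataset) -> Dataset:
    """Partial-graph reading: a document only states that some triples
    are part of the labeled graph, so merging takes the union."""
    merged = dict(d1)
    for label, triples in d2.items():
        merged[label] = merged.get(label, frozenset()) | triples
    return merged


def merge_complete(d1: Dataset, d2: Dataset) -> Dataset:
    """Complete-graph reading: a document defines the labeled graph in
    its entirety, so two different definitions contradict each other."""
    merged = dict(d1)
    for label, triples in d2.items():
        if label in merged and merged[label] != triples:
            raise ValueError(f"contradictory definitions for graph <{label}>")
        merged[label] = triples
    return merged


# <u1> { <a> <b> <c> }  and  <u1> { <a> <b> <d> }
doc1: Dataset = {"u1": frozenset({("a", "b", "c")})}
doc2: Dataset = {"u1": frozenset({("a", "b", "d")})}
```

With these definitions, merge_partial(doc1, doc2) gives a <u1> graph holding
both triples, while merge_complete(doc1, doc2) raises an error. Which of the
two behaviours a reader expects is exactly what the bare TriG statement
leaves open.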

That said, I can get used to having it the way you suggest, especially when
the graph name comes first. If we had:

{ <a> <b> <c> . <d> <e> <f> } <u1>

I would think differently.
--
Arnaud  Le Hors - Software Standards Architect - IBM Software Group


> I think that means they're using the complete-graph semantics I'm
> suggesting.  If they were using partial-graph semantics, they'd have to
> say, "Three or more".
> 
> You see what I'm saying?   When we have a complete protocol interaction,
> via SPARQL, or transmitting trig or n-quads files, I think the usual
> assumption is that *all* the triples in the named graph are being sent,
> not just some of them. 
> 
> I understand sometimes it would be nice to store/transmit just part of
> some named graph.   But, as I discussed in a message a couple of minutes
> ago, I think we have to pick one or the other, and I think the
> complete-graph approach is better.  It's pretty easy to convey partial
> graphs if we define the complete approach.
> 
> (I suppose if we defined the partial-graph approach we could transmit
> complete graphs by transmitting partial graphs and including a
> triple-count as metadata, so you know it's complete.   I guess that
> would work, but it seems to me to be optimizing for the much-less-common
> case.)
> 
> Coming back to:
> 
> > I had previously thought that RDF was a data model that didn't need any
> > notion of 'document' to work.
> 
> Yeah, it depends what you're doing with it.   There's a lot you can do
> with RDF without paying any attention to what documents particular bits
> of RDF were found in, but I think most of the Graphs use cases involve
> situations where you do need to pay attention to these document
> boundaries. 
> 
> > Thanks for your willingness to understand my points --- I'm sure that my
> > formal language will improve over time.
> 
> It's a long process.   :-)    Interestingly, it seems to be helped by
> arguing.
> 
>     -- Sandro
> 
> > 
> > Charles
> > 
> > 
> > 
> > On 04/02/2012 08:36 AM, Sandro Hawke wrote:
> > > On Thu, 2012-03-29 at 09:25 -0700, Charles Greer wrote:
> > >> I really like this solution and it seems to satisfy the use cases
> > >> familiar to me from when I actually worked a lot with RDF in the wild.
> > >>
> > >> One thing I'm tripping over though --  The scope of a TRIG document or
> > >> RDF dataset in effect 'closes the world.'  Is the idea of "merge" only
> > >> within a TRIG document/dataset?
> > >>
> > >> I can only see two ways to really assert a graph literal -- either by
> > >> sanctifying the boundaries of a dataset, thereby making merges with
> > >> external data problematic, or by signing bytes.  Am I missing something,
> > >> as usual?
> > > There's some misunderstanding here, yes.   Maybe you can talk through
> > > some particular thing you imagine doing, involving merging and TriG, and
> > > I'll be able to pick it up.   From what you've written, I'm confused.
> > >
> > > Maybe I can clarify by translating this TriG document:
> > >
> > >          <u1>   {<a>   <b>   <c>  }
> > >
> > > into this English declaration:
> > >
> > >          The URI 'u1' denotes something, and that thing has exactly one
> > >          associated RDF Graph.   That associated RDF graph consists of
> > >          one RDF triple, which we can write in turtle as "<a>  <b>  <c>".
> > >
> > > So, perhaps it's more clear, now.  If you merged that with another TriG
> > > document:
> > >
> > >          <u1>   {<a>   <b>   <d>  }
> > >
> > > Then, trying to accept both documents at once, you'd be saying:
> > >
> > >          The URI 'u1' denotes something, and that thing has exactly one
> > >          associated RDF graph.  In one document that associated graph is
> > >          claimed to be the RDF triple "<a>  <b>  <c>", but in another
> > >          document that graph is claimed to be the RDF triple "<a>  <b>
> > >          <d>".
> > >
> > > So, in this case, you can try to merge the documents, but when you do,
> > > you find there is a contradiction, since there is only allowed to be one
> > > associated graph, but in this case there are two different ones.
> > >
> > >         -- Sandro
> > >
> > >> Charles
> > >>
> > >>
> > >> On 03/27/2012 07:23 PM, Sandro Hawke wrote:
> > >>> I've written up design 6 (originally suggested by Andy) in more
> > >>> detail.  I've called it 6.1 since I've changed/added a few details that
> > >>> Andy might not agree with.  Eric has started writing up how the use
> > >>> cases are addressed by this proposal.
> > >>>
> > >>> This proposal addresses all 15 of our old open issues concerning graphs.
> > >>> (I'm sure it will have its own issues, though.)
> > >>>
> > >>> The basic idea is to use trig syntax, and to support the different
> > >>> desired relationships between labels and their graphs via class
> > >>> information on the labels.  In particular, according to this proposal,
> > >>> in this trig document:
> > >>>
> > >>>      <u1>   {<a>   <b>   <c>   }
> > >>>
> > >>> ... we only know that <u1> is some kind of label for the RDF Graph <a>
> > >>> <b>   <c>, like today.  However, in this trig document:
> > >>>
> > >>>      {<u2>   a rdf:Graph }
> > >>>      <u2>   {<a>   <b>   <c>   }
> > >>>
> > >>> we know that <u2> is an rdf:Graph and, what's more, we know that <u2>
> > >>> actually is the RDF Graph {<a>   <b>   <c>   }.  That is, in this case, we
> > >>> know that URL "u2" is a name we can use in RDF to refer to that g-snap.
> > >>>
> > >>> Details are here: http://www.w3.org/2011/rdf-wg/wiki/Graphs_Design_6.1
> > >>>
> > >>> That page includes answers to all the current GRAPHS issues, including
> > >>> ISSUE-5, ISSUE-14, etc.
> > >>>
> > >>> Eric has started going through Why Graphs and adding the examples as
> > >>> addressed by Proposal 6.1:
> > >>> http://www.w3.org/2011/rdf-wg/wiki/Why_Graphs_6.1
> > >>>
> > >>>        -- Sandro (with Eric nearby)
> > >>>
> > >>>
> > >>
> > >
> > 
> > 
> 
> 
>
Received on Wednesday, 4 April 2012 05:10:08 UTC