- From: Arnaud Le Hors <lehors@us.ibm.com>
- Date: Tue, 3 Apr 2012 22:09:31 -0700
- To: Sandro Hawke <sandro@w3.org>
- Cc: public-rdf-wg <public-rdf-wg@w3.org>
- Message-ID: <OF82A71699.7131FDB6-ON882579D6.001AED1D-882579D6.001C571D@us.ibm.com>
Hi Sandro,

I have to say that my expectation was similar to Charles's. I guess it's
a matter of deciding whether <u1> { <a> <b> <c> } defines the <u1> graph
in its entirety, as containing one triple, or merely states that the
triple <a> <b> <c> is part of graph <u1>. I'm not saying it should be the
latter rather than the former, just that it's not obvious.
See below for more on that.

Sandro Hawke <sandro@w3.org> wrote on 04/02/2012 05:57:13 PM:

> From: Sandro Hawke <sandro@w3.org>
> To: Charles Greer <cgreer@marklogic.com>,
> Cc: Charles Greer <Charles.Greer@marklogic.com>, public-rdf-wg
> <public-rdf-wg@w3.org>
> Date: 04/02/2012 05:57 PM
> Subject: Re: New Proposal (6.1) for GRAPHS
>
> On Mon, 2012-04-02 at 14:00 -0700, Charles Greer wrote:
> > Thanks for responding Sandro.  I think that what I'm finding difficult,
> > or at least a significant departure from RDF as I have understood it in
> > the past, is that this TRIG document
> >
> > <u1> { <a> <b> <c> . <d> <e> <f> }
> >
> > is not equivalent to these n-quads:
> >
> > <a> <b> <c> <u1>.
> > <d> <e> <f> <u1>.
> >
> > Or rather, you now need a document structure around n-quads as well in
> > order to provide the context in which rdf knows that these triples, and
> > only these triples, constitute the graph <u1>.
> >
> > I had previously thought that RDF was a data model that didn't need any
> > notion of 'document' to work.  I'm not sure how another assertion that
> >
> > { <u1> a rdf:Graph }
> >
> > can assert the boundaries of <u1> unless either the { } syntax does more
> > than it appears to, or the document is a harder scope boundary than I
> > would have expected.  If the document has some relationship to scope, I
> > think that should be made explicit.
>
> Two main points:
>
> 1. That rdf:Graph declaration is a different thing.  It changes how <u1>
> relates to the graph, but in a semantic (not syntactic) way.
> It can be in a different document, or deduced by the use of some
> predicates, or known a priori by a data consumer.  Knowing it entitles
> the consumer to see that <u1> actually identifies the graph directly,
> rather than just being a label for the graph.  This might matter if we
> also know <u1> dc:licence ...SomeLicensingTerms....  Is it the graph
> that's licensed, or something else?  There are some use cases that
> suggest this distinction is important, but if it turns out not to be,
> it's not bad, people will just not use rdf:Graph declarations much.
>
> 2. Whether or not your trig example and your n-quads example are
> equivalent depends on your reading of n-quads.  This extends to your
> reading of SPARQL as well.  My understanding is people are somewhat
> informal about this, but they generally do expect that once they've seen
> the whole trig file, or the whole n-quads file, or searched the whole
> SPARQL end point, that they've seen all the triples in the graph with
> that name/label.
>
> As a social test case, we could tell people this SPARQL query is run:
>
>    SELECT ?s ?p ?o
>    WHERE { GRAPH <http://g1.example.org> { ?s ?p ?o } }
>
> and that we got three result bindings back:
>
>    ?s   ?p   ?o
>    ===  ===  ===
>    <a>  <b>  1.
>    <a>  <b>  2.
>    <a>  <b>  3.
>
> Then we ask them: "According to this query, how many triples are in the
> graph known to that endpoint as 'http://g1.example.org' ?"
>
> What do you think they'll say?
>
> I think most folks will say, "Three", even if you ask them to think
> again and be pedantically precise.
>

I agree that's what they would say but primarily because you said:
"in the graph known to that endpoint"
This is a critical element which isn't apparent in a mere statement like:

<u1> { <a> <b> <c> . <d> <e> <f> }

Which doesn't say anything about where it comes from and whether it's
complete or not.

This being said, I can get used to having it the way you suggest.
Especially when the graph name comes first. If we had:

{ <a> <b> <c> .
<d> <e> <f> } <u1>

I would think differently.
--
Arnaud Le Hors - Software Standards Architect - IBM Software Group

> I think that means they're using the complete-graph semantics I'm
> suggesting.  If they were using partial-graph semantics, they'd have to
> say, "Three or more".
>
> You see what I'm saying?  When we have a complete protocol interaction,
> via SPARQL, or transmitting a trig or n-quads file, I think the usual
> assumption is that *all* the triples in the named graph are being sent,
> not just some of them.
>
> I understand sometimes it would be nice to store/transmit just part of
> some named graph.  But, as I discussed in a message a couple of minutes
> ago, I think we have to pick one or the other, and I think the
> complete-graph approach is better.  It's pretty easy to convey partial
> graphs if we define the complete approach.
>
> (I suppose if we defined the partial-graph approach we could transmit
> complete graphs by transmitting partial graphs and including a
> triple-count as metadata, so you know it's complete.  I guess that
> would work, but it seems to me to be optimizing for the much-less-common
> case.)
>
> Coming back to:
>
> > I had previously thought that RDF was a data model that didn't need any
> > notion of 'document' to work.
>
> Yeah, it depends what you're doing with it.  There's a lot you can do
> with RDF without paying any attention to what documents particular bits
> of RDF were found in, but I think most of the Graphs use cases involve
> situations where you do need to pay attention to these document
> boundaries.
>
> > Thanks for your willingness to understand my points --- I'm sure that my
> > formal language will improve over time.
>
> It's a long process.  :-)  Interesting, it seems to be helped by
> arguing.
>
>     -- Sandro
>
> > Charles
> >
> > On 04/02/2012 08:36 AM, Sandro Hawke wrote:
> > > On Thu, 2012-03-29 at 09:25 -0700, Charles Greer wrote:
> > >> I really like this solution and it seems to satisfy the use cases
> > >> familiar to me from when I actually worked a lot with RDF in the wild.
> > >>
> > >> One thing I'm tripping over though -- The scope of a TRIG document or
> > >> RDF dataset in effect 'closes the world.'  Is the idea of "merge" only
> > >> within a TRIG document/dataset?
> > >>
> > >> I can only see two ways to really assert a graph literal -- either by
> > >> sanctifying the boundaries of a dataset, thereby making merges with
> > >> external data problematic, or by signing bytes.  Am I missing something,
> > >> as usual?
> > > There's some misunderstanding here, yes.  Maybe you can talk through
> > > some particular thing you imagine doing, involving merging and TriG, and
> > > I'll be able to pick it up.  From what you've written, I'm confused.
> > >
> > > Maybe I can clarify by translating this TriG document:
> > >
> > >     <u1> {<a> <b> <c> }
> > >
> > > into this English declaration:
> > >
> > >     The URI 'u1' denotes something, and that thing has exactly one
> > >     associated RDF Graph.  That associated RDF graph consists of
> > >     one RDF triple, which we can write in turtle as "<a> <b> <c>".
> > >
> > > So, perhaps it's more clear, now.  If you merged that with another TriG
> > > document:
> > >
> > >     <u1> {<a> <b> <d> }
> > >
> > > Then, trying to accept both documents at once, you'd be saying:
> > >
> > >     The URI 'u1' denotes something, and that thing has exactly one
> > >     associated RDF graph.  In one document that associated graph is
> > >     claimed to be the RDF triple "<a> <b> <c>", but in another
> > >     document that graph is claimed to be the RDF triple "<a> <b>
> > >     <d>".
> > >
> > > So, in this case, you can try to merge the documents, but when you do,
> > > you find there is a contradiction, since there is only allowed to be one
> > > associated graph, but in this case there are two different ones.
> > >
> > >     -- Sandro
> > >
> > >> Charles
> > >>
> > >> On 03/27/2012 07:23 PM, Sandro Hawke wrote:
> > >>> I've written up design 6 (originally suggested by Andy) in more
> > >>> detail.  I've called it 6.1 since I've changed/added a few details that
> > >>> Andy might not agree with.  Eric has started writing up how the use
> > >>> cases are addressed by this proposal.
> > >>>
> > >>> This proposal addresses all 15 of our old open issues concerning graphs.
> > >>> (I'm sure it will have its own issues, though.)
> > >>>
> > >>> The basic idea is to use trig syntax, and to support the different
> > >>> desired relationships between labels and their graphs via class
> > >>> information on the labels.  In particular, according to this proposal,
> > >>> in this trig document:
> > >>>
> > >>>     <u1> {<a> <b> <c> }
> > >>>
> > >>> ... we only know that <u1> is some kind of label for the RDF Graph <a>
> > >>> <b> <c>, like today.  However, in this trig document:
> > >>>
> > >>>     {<u2> a rdf:Graph }
> > >>>     <u2> {<a> <b> <c> }
> > >>>
> > >>> we know that <u2> is an rdf:Graph and, what's more, we know that <u2>
> > >>> actually is the RDF Graph {<a> <b> <c> }.  That is, in this case, we
> > >>> know that URL "u2" is a name we can use in RDF to refer to that g-snap.
> > >>>
> > >>> Details are here: http://www.w3.org/2011/rdf-wg/wiki/Graphs_Design_6.1
> > >>>
> > >>> That page includes answers to all the current GRAPHS issues, including
> > >>> ISSUE-5, ISSUE-14, etc.
> > >>>
> > >>> Eric has started going through Why Graphs and adding the examples as
> > >>> addressed by Proposal 6.1:
> > >>> http://www.w3.org/2011/rdf-wg/wiki/Why_Graphs_6.1
> > >>>
> > >>>     -- Sandro (with Eric nearby)
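
The difference between the two readings discussed in this thread can be
made concrete with a small sketch. It assumes a dataset is modelled as a
plain mapping from a graph label to a set of triples; the names
merge_complete, merge_partial and GraphConflict are hypothetical, chosen
only for illustration, and are not part of Proposal 6.1 or of any RDF
library. Under the complete-graph reading, merging the <u1> {<a> <b> <c>}
and <u1> {<a> <b> <d>} documents is a contradiction; under the
partial-graph reading, the triples are simply unioned.

```python
from typing import Dict, FrozenSet, Tuple

# Illustrative model only: a triple is three strings, a dataset maps a
# graph label to the set of triples stated for that label.
Triple = Tuple[str, str, str]
Dataset = Dict[str, FrozenSet[Triple]]


class GraphConflict(Exception):
    """Two documents give different complete graphs for the same label."""


def merge_complete(a: Dataset, b: Dataset) -> Dataset:
    """Complete-graph reading: each label has exactly one associated graph."""
    merged = dict(a)
    for label, graph in b.items():
        if label in merged and merged[label] != graph:
            raise GraphConflict(f"<{label}> is given two different graphs")
        merged[label] = graph
    return merged


def merge_partial(a: Dataset, b: Dataset) -> Dataset:
    """Partial-graph reading: each document lists only some of the triples."""
    merged = dict(a)
    for label, graph in b.items():
        merged[label] = merged.get(label, frozenset()) | graph
    return merged


# The two TriG documents from the thread, in this simplified model.
doc1: Dataset = {"u1": frozenset({("a", "b", "c")})}
doc2: Dataset = {"u1": frozenset({("a", "b", "d")})}
```

Calling merge_partial(doc1, doc2) yields a two-triple graph for u1, while
merge_complete(doc1, doc2) raises GraphConflict, which mirrors the
contradiction Sandro describes when both documents are accepted at once.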
Received on Wednesday, 4 April 2012 05:10:08 UTC