- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Sat, 08 Oct 2011 17:21:51 +0100
- To: public-rdf-wg@w3.org
On 07/10/11 16:04, Eric Prud'hommeaux wrote:
> * Sandro Hawke <sandro@w3.org> [2011-10-07 10:35-0400]
>> On Fri, 2011-10-07 at 13:48 +0100, Andy Seaborne wrote:
>>>> Okay, that's enough for now. Give me a +1 if you think this is headed
>>>> in a useful direction.
>>>
>>> I like something like this as a pattern of good practice (well, 2
>>> patterns). I don't agree with forcing the 4th column to have a specific
>>> meaning given all the other deployed uses we have now collected.
>>
>> Yeah.... There is a middle ground where some datasets use Web
>> semantics and some don't. I see your point that we can't just force
>> people to change -- we can't say the things they've been saying now mean
>> something else.
>>
>> Maybe we can have a way to flag which datasets are using Web semantics,
>> and allow market pressures to work? Like, where we do a new mime type
>> for a multigraph syntax, we could add this. And maybe it's something
>> we can flag in SPARQL service description.
>>
>>> On one point:
>>>
>>> I don't see why
>>>
>>> <http://example.org> {<s> <p> <o> . }
>>>
>>> should mean it is ONLY that triple rather than CONTAINS that triple. If
>>> the data publisher wants to say "and that's all" then they should say so
>>> as an additional fact. The converse of "it's closed by default" is
>>> harder to see how to allow it to be open sometimes.
>>>
>>> For a large graph, and you only need to talk about a small subset, the
>>> deployment issues. Consider dbpedia.
>>>
>>> (I also want to see the same change in TriG for concatenation of files)
>>
>> It seems to me that it's easy to go from complete to incomplete, just
>> using a subgraph predicate. Let's say we want to say G1 is the graph
>> with only <s> <p> <o> and G2 is a graph with that triple and maybe other
>> stuff. I'd say:
>>
>> G1 {<s> <p> <o>. }
>> { G1 r:subgraphOf G2. }
>>
>> But I don't see how to communicate G1 the way you're talking about. How
>> do you say "and that's all"?
>
> Imagining Trig used for both update and patch, I see it as specified
> by the protocol. CONSTRUCT ?g { ?s ?p ?o } would give me the results
> of a query substituted into a named graph pattern. A reply to a GET
> would give me a complete resource ("and that's all"). A diff
> propagation could look like:
> -<G1> { _:s1 <p> <o0> }
> +<G1> { _:s1 <p> <o1> }
> which means there were already some <G1> triples and we've only
> changed one of them. The use you want to define is, I believe,
> characterized by GET <G1>, but I think the mapping of graph
> names to sets of triples is useful in other places with other
> presumptions of completeness.

SPARQL Update allows various ways of treating a change:

# if you want "replace", clear the destination first:
CLEAR GRAPH <G1> ;
INSERT DATA { GRAPH <G1> { <s> <p> <o> } }

or a change:

DELETE DATA { GRAPH <G1> { <s> <p> <o0> } } ;
INSERT DATA { GRAPH <G1> { <s> <p> <o1> } }

	Andy

>
>
>> -- Sandro
>>
>>
>>> Andy
>>>
>>> On 07/10/11 03:04, Sandro Hawke wrote:
>>>> Here's a proposal for what the fourth column should mean. It's kind of
>>>> obvious, and I think it's how many of us just assumed Named Graphs were
>>>> supposed to work. But I don't think it's been written down in a form
>>>> we can use, so here it is, in a first draft.
>>>>
>>>> I haven't really tried to motivate this, but one thing it does is allow
>>>> folks to refer to a graph using just one URI. As [1] points out rather
>>>> painfully, as things stand now, you need multiple URIs just to identify
>>>> each g-box (and thus g-snap). (That is, you need to say which sparql
>>>> endpoint you're talking about, and then which graph within its
>>>> dataset.)
>>>>
>>>> My starting question was: what is the relationship between the IRI (the
>>>> "graph name") and its associated g-snap in an RDF Dataset? This
>>>> applies to the dataset backing any SPARQL end point, as well as the
>>>> dataset serialized in any multigraph syntax, like TriG or N-Quads.
>>>> Another way to look at it: what does it mean to assert a TriG
>>>> document? If you send me the TriG Document "<a> {<s> <p> <o> }", and
>>>> I trust you, what do I now know?
>>>>
>>>> Richard, I think, has been arguing for a minimalist position,
>>>> answering "nothing", or "it depends on out-of-band agreements". This
>>>> "Web Semantics" proposal is an alternative.
>>>>
>>>> === Proposal
>>>>
>>>> The idea here is to make the relationship between the URI and the
>>>> graph be the standard Web naming relationship, similar to what we all
>>>> use for Web pages. When you dereference the URI, you get the graph.
>>>>
>>>> This has the feature of being, to some extent, observable. Just like
>>>> triples are claims about some domain of discourse, quads become claims
>>>> about idealized Web dereference behavior.
>>>>
>>>> Specifically: Consider a "graph naming" to be the association of a
>>>> graph name N with a graph G. For the graph naming to hold, every
>>>> successful dereference of N yielding an RDF graph must yield G. For a
>>>> dataset D to hold, its default graph must hold (as normal in RDF) and
>>>> every graph naming pair in D must hold.
>>>>
>>>> Example 1: This dataset
>>>>
>>>>    <http://example.org> {<s> <p> <o>. }
>>>>
>>>>    means that if anyone is able to dereference "http://example.org"
>>>>    and obtain an RDF graph serialization, the serialized graph will
>>>>    consist of the single triple, <s> <p> <o>. Failure to dereference
>>>>    does not make the graph naming untrue, but a successful dereference
>>>>    yielding a different graph does.
>>>>
>>>> Example 2: This dataset can never be true:
>>>>
>>>>    <http://example.org> {<s> <p> 1. }
>>>>    <HTTP://example.org> {<s> <p> 2. }
>>>>
>>>>    ... since one cannot get different results dereferencing URIs that
>>>>    differ only in the case of the scheme component (as per RFC 3986).
>>>>
>>>> Example 3: This dataset:
>>>>
>>>>    <tag:hawke.org,2010-10-06:eg1> {<s> <p> <o>. }
>>>>
>>>>    cannot be tested using Web protocols, since the "tag" URI scheme is
>>>>    (by design) not dereferenceable. Whether it is true or false cannot
>>>>    be determined experimentally.
>>>>
>>>> ==== Temporal Context
>>>>
>>>> How can we say:
>>>>
>>>>    <http://example.org> {<s> <p> <o>. }
>>>>
>>>> if we suspect that "http://example.org" might serve some other content
>>>> tomorrow?
>>>>
>>>> The answer is that datasets often need temporal qualification just
>>>> like RDF graphs do. It's just like saying in RDF:
>>>>
>>>>    <http://example.org/Alice> foaf:age 25.
>>>>
>>>> One solution for foaf:age triples is to include triples like:
>>>>
>>>>    <> dc:temporal "2011-10-06"^^xs:dateTime.
>>>>
>>>> and that can be done in datasets as well, using the default graph.
>>>> More work is needed on this, but I'm pretty sure this proposal can use
>>>> whatever solution people come up with for RDF and doesn't make matters
>>>> much worse than they are already.
>>>>
>>>> ==== Practical Deployment Choices
>>>>
>>>> Any system which maintains a dataset (eg a sparql endpoint) or
>>>> generates multigraph documents like TriG has to do one (or more) of
>>>> the following:
>>>>
>>>> 1. Use new non-dereferenceable graph names. These could be tag or
>>>>    uuid URIs, or http URIs in your own name space which you choose to
>>>>    leave 404.
>>>>
>>>> 2. Use your own dereferenceable graph names, perhaps relative to the
>>>>    endpoint or TriG document URI. If you do serve RDF content at
>>>>    those URIs, it MUST be the same content (give or take stated time
>>>>    lag).
>>>>
>>>> 3. Use someone else's graph names. Here, the key thing is temporal
>>>>    metadata. You have to decide what you want (copy once vs
>>>>    synchronize with what accuracy) and (somehow) share that temporal
>>>>    metadata.
>>>>
>>>>
>>>> ...
>>>>
>>>> Okay, that's enough for now. Give me a +1 if you think this is headed
>>>> in a useful direction.
>>>>
>>>> -- Sandro
>>>>
>>>> [1] http://www.w3.org/2011/prov/wiki/Using_named_graphs_to_model_Accounts
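
Editorial sketch (not part of the original message): the two SPARQL Update treatments in the reply above ("replace" versus "change"), and the ONLY-versus-CONTAINS readings discussed in the thread, can be illustrated with a toy in-memory model. Everything here is an assumption made for illustration: the dataset is a plain Python dict from graph name to a set of triples, and the function names (replace_graph, change_triples, holds_closed, holds_open) are hypothetical, not taken from SPARQL or any RDF library.

```python
from typing import Dict, Set, Tuple

# Hypothetical model: a dataset maps graph names to sets of (s, p, o) triples.
Triple = Tuple[str, str, str]
Dataset = Dict[str, Set[Triple]]


def clear(ds: Dataset, name: str) -> None:
    """Rough analogue of CLEAR GRAPH <name>: empty the named graph."""
    ds[name] = set()


def insert_data(ds: Dataset, name: str, triples: Set[Triple]) -> None:
    """Rough analogue of INSERT DATA { GRAPH <name> { ... } }."""
    ds.setdefault(name, set()).update(triples)


def delete_data(ds: Dataset, name: str, triples: Set[Triple]) -> None:
    """Rough analogue of DELETE DATA { GRAPH <name> { ... } }."""
    ds.get(name, set()).difference_update(triples)


def replace_graph(ds: Dataset, name: str, triples: Set[Triple]) -> None:
    """'Replace' treatment: clear the destination first, then insert."""
    clear(ds, name)
    insert_data(ds, name, triples)


def change_triples(ds: Dataset, name: str,
                   old: Set[Triple], new: Set[Triple]) -> None:
    """'Change' treatment: remove some triples, add others, keep the rest."""
    delete_data(ds, name, old)
    insert_data(ds, name, new)


def holds_closed(ds: Dataset, name: str, served: Set[Triple]) -> bool:
    """ONLY reading: the graph obtained for the name is exactly the stated one."""
    return served == ds[name]


def holds_open(ds: Dataset, name: str, served: Set[Triple]) -> bool:
    """CONTAINS reading: the stated triples are a subset of the obtained graph."""
    return ds[name] <= served
```

Under this toy model, a graph that is served with extra triples satisfies holds_open but not holds_closed, which is the distinction the thread is arguing about. It says nothing about how a real store or the Web dereference step would behave.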
Received on Saturday, 8 October 2011 16:22:34 UTC