- From: Guus Schreiber <guus.schreiber@vu.nl>
- Date: Thu, 05 Jan 2012 15:13:32 +0100
- To: Sandro Hawke <sandro@w3.org>
- CC: public-rdf-wg <public-rdf-wg@w3.org>
On 04-01-2012 19:45, Sandro Hawke wrote:
> While it's fresh in my mind, let me write down the view I came to during
> today's telecon.  (And, carry it a bit farther.)  Guus, I don't know if
> you still want to write up your understanding of it, or if this obviates
> your action.

Sandro,

Very clear, thanks for this (and I indeed see no more need for my action).

I understand why you like the third solution. However, it means we have
to come up with a name-to-graph relation vocabulary. I don't think we're
in a position to standardize that. Also, it means we have to introduce
new syntax and thus invalidate/deprecate/mark as archaic/... the current
quad stores out there. That would not be in the spirit of our charter.

The first solution (your TriG/REST) has the advantage of being the most
conservative extension, i.e. providing a hook for explicating
name-to-graph semantics without enforcing it. We can write a
non-normative section/appendix/note with suggested practices for
vocabulary to be used there.

If we can get consensus on (some variant of) the first solution, I see
us moving on quickly.

The path forward would be to continue writing down further use-case
examples. I have asked Antoine Isaac to do this for the Europeana data
model [1].

Guus

[1] http://www.europeana-libraries.eu/web/europeana-project/technicaldocuments/

>
>     * Use Case 1: (presented by cygri at 21 Dec meeting)
>
> Several systems want to use the data gathered by one RDF crawler.  They
> don't need simultaneous access to older versions of the data.
>
> Solution A: use TriG or N-Quads with the fourth column (graph label)
> being the URL the content was fetched from.
>
>     <http://example.org> { ... triples recently fetched from there }
>
>     * Use Case 2: (brought up in questions by sandro at 21 Dec meeting)
>
> Several systems want to use the data gathered by one RDF crawler.  They
> need simultaneous access to older versions of the data.
>
> Solution B: use TriG or N-Quads with the fourth column being some
> identifier created at the time the retrieval was done.  Then, some other
> data connects that identifier with the URL the content was fetched from.
>
>     <http://crawler.example.org/r8571> { ... triples fetched in retrieval 8671 }
>     {
>       <http://crawler.example.org/r8571> eg:source <http://example.org>;
>           eg:date "2011-01-04T00:03:11"^^xs:dateTime
>     }
>
>     * Use Case 3: (suggested by sandro at 4 Jan meeting)
>
> A system wants to convey to another system in RDF that some person
> agrees with or disagrees with certain RDF triples.
>
> Solution C: use TriG or N-Quads with the fourth column being an
> identifier for an RDF Graph (g-snap), so that it can be referred to in
> the default graph.
>
>     { eg:sandro eg:endorses <g1> }
>     <g1> { ... the triples I'm endorsing ... }
>
> ====
>
> So, here we have two different semantics for TriG clearly motivated and
> expressed.  The TriG document:
>
>     g { s p o }
>
> is understood in Solution A to mean (in N3):
>
>     g log:semantics { s p o }.    # TriG "REST" semantics
>
> but in Solution C it means (in N3):
>
>     g owl:sameAs { s p o }.       # TriG "Equality" semantics
>
> ====
>
> It looks like it's possible to solve all three use cases with either
> semantics, although it gets a bit tricky.
>
> With TriG/REST:
>
>     UC1 -- as reported by Richard; the URL used by the crawler is
>     the fourth column URL
>
>     UC2 -- as implemented in Sandro's semwalker code; the crawler
>     makes a new URL in its own web space, mirrors the content there,
>     and puts that URL in the fourth column
>
>     UC3 -- rather than endorsing an RDF Graph, I endorse a Graph
>     Container on the condition that it never changes (or something
>     like that -- needs to be fleshed out more).
>
>          { eg:sandro eg:endorses <g1>.
>            <g1> a rdf:StaticGraphContainer.
>          }
>          <g1> { ... the triples I'm endorsing ...
>          }
>
>
> With TriG/Equality:
>
>     UC1 -- A layer of indirection is needed, as new URIs need to be
>     created for the different RDF Graphs.
>
>          { <http://example.org> rdf:graphState <uuid:nnnnn> }
>          uuid:nnnnn { ... triples fetched from example.org }
>
>     Maybe there's some clever way to do it without this, involving
>     URL mangling or something to eliminate the second lookup.
>
>     I used uuid:nnnnn as a URI for the RDF Graph, but I could just
>     as easily have used a hash of the graph or graph serialization.
>     I *could* use an http URL, I think, but that's likely to lead to
>     confusion and breakage, especially when someone gets the bright
>     idea of changing what triples are served at that address.  (I'm
>     sure it will have seemed like a good idea at the time.)
>
>     UC2 -- Pretty straightforward, since we already have that layer of
>     indirection in UC1.  We can't quite use the r8571 example as is,
>     because graph equality could smoosh the two retrieval operations
>     together.  So we need something like this:
>
>          <uuid:nnnnn> { ... triples fetched in operation 8671 }
>          {
>            [ a eg:Retrieval;
>              eg:gotGraph <uuid:nnnnn>;
>              eg:source <http://example.org>;
>              eg:date "2011-01-04T00:03:11"^^xs:dateTime;
>            ]
>          }
>
>     UC3 -- easy:
>
>          { eg:sandro eg:endorses <uuid:nnnnn>. }
>          <uuid:nnnnn> { ... the triples I'm endorsing ... }
>
>     Here you can see why I want a blank node as the graph
>     label, rather than making up uuids.
>
> Between these two, I have a preference for TriG/REST over
> TriG/Equality, I think.  I think people are too likely to get the
> semantics of TriG/Equality wrong in practice.  Of course, spelling out
> the semantics of TriG/REST will be tricky given it has some
> contextual qualities, as we've discussed.
>
> MEANWHILE, we have a third solution, where we name the relation
> explicitly.  This is the one I prefer.
>
>     UC1
>
>          <http://example.org> rdf:graphState { ...
>          triples recently fetched from there }
>
>     UC2 -- either of the styles given above, depending whether the
>     harvester wants to publish its copies on the web or not.
>
>     UC3
>
>          eg:sandro eg:endorses <uuid:nnnnn>.
>          <uuid:nnnnn> owl:sameAs { ... the triples I'm endorsing ... }
>
>     or, logically:
>
>          eg:sandro eg:endorses { ... the triples I'm endorsing ... }
>
> (Then, I would probably get rid of the curly braces around the default
> graph, so it becomes Turtle with Nesting.)
>
> Do those three solution designs make sense?  Any strong preferences
> among them?  Are there more use cases that people think the group will
> find compelling and which cannot be solved by all three of these
> solutions?  (I think the next use case I'd approach would be "Tracing
> Inference Results", mostly because it motivates shared blank nodes.
> But I'm out of time for today.)
>
>      -- Sandro
>
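
To make the difference between Solution A and Solution B in the quoted message concrete, here is a minimal sketch of how a quad store might behave under each. It is an illustration under stated assumptions, not anything the group has agreed on: the store is a plain Python dict from graph label to a set of triples, `None` stands in for the default graph, and the function names, the retrieval identifiers `r1`/`r2`, the second date and the triple values are all hypothetical. Only `eg:source`, `eg:date` and the first date come from the examples above.

```python
from collections import defaultdict

# Hypothetical stand-in for the unnamed default graph.
DEFAULT_GRAPH = None


def store_by_source(store, source_url, triples):
    """Solution A: the graph label is the URL the content was fetched from.

    A later crawl of the same URL replaces the earlier triples, so older
    versions are not kept.
    """
    store[source_url] = set(triples)


def store_by_retrieval(store, retrieval_id, source_url, date, triples):
    """Solution B: the graph label identifies one retrieval.

    The default graph records which URL that retrieval came from and when.
    """
    store[retrieval_id] = set(triples)
    store[DEFAULT_GRAPH].add((retrieval_id, "eg:source", source_url))
    store[DEFAULT_GRAPH].add((retrieval_id, "eg:date", date))


def retrievals_of(store, source_url):
    """Return the retrieval identifiers recorded for a source URL."""
    return sorted(
        s for (s, p, o) in store[DEFAULT_GRAPH]
        if p == "eg:source" and o == source_url
    )


# Solution A: two crawls of the same URL, only the latest survives.
solution_a = defaultdict(set)
store_by_source(solution_a, "http://example.org", [("eg:s", "eg:p", "old")])
store_by_source(solution_a, "http://example.org", [("eg:s", "eg:p", "new")])

# Solution B: two crawls of the same URL, both remain addressable.
solution_b = defaultdict(set)
store_by_retrieval(solution_b, "http://crawler.example.org/r1",
                   "http://example.org", "2011-01-04T00:03:11",
                   [("eg:s", "eg:p", "old")])
store_by_retrieval(solution_b, "http://crawler.example.org/r2",
                   "http://example.org", "2011-01-05T00:03:11",
                   [("eg:s", "eg:p", "new")])
```

In this sketch the only structural difference is what goes in the fourth column; the name-to-graph relation in Solution B lives in ordinary triples in the default graph, which is roughly the "hook without enforcement" that the TriG/REST reading leaves room for.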
Received on Thursday, 5 January 2012 14:14:01 UTC