- From: Sandro Hawke <sandro@w3.org>
- Date: Wed, 04 Jan 2012 13:45:08 -0500
- To: public-rdf-wg <public-rdf-wg@w3.org>
While it's fresh in my mind, let me write down the view I came to during today's telecon. (And carry it a bit farther.) Guus, I don't know if you still want to write up your understanding of it, or if this obviates your action.

* Use Case 1: (presented by cygri at 21 Dec meeting)

  Several systems want to use the data gathered by one RDF crawler. They don't need simultaneous access to older versions of the data.

  Solution A: use TriG or N-Quads with the fourth column (graph label) being the URL the content was fetched from.

      <http://example.org> { ... triples recently fetched from there }

* Use Case 2: (brought up in questions by sandro at 21 Dec meeting)

  Several systems want to use the data gathered by one RDF crawler. They need simultaneous access to older versions of the data.

  Solution B: use TriG or N-Quads with the fourth column being some identifier created at the time the retrieval was done. Then, some other data connects that identifier with the URL the content was fetched from.

      <http://crawler.example.org/r8571> { ... triples fetched in retrieval 8671 }
      {
        <http://crawler.example.org/r8571>
            eg:source <http://example.org>;
            eg:date "2011-01-04T00:03:11"^^xs:dateTime
      }

* Use Case 3: (suggested by sandro at 4 Jan meeting)

  A system wants to convey to another system in RDF that some person agrees with or disagrees with certain RDF triples.

  Solution C: use TriG or N-Quads with the fourth column being an identifier for an RDF Graph (g-snap), so that it can be referred to in the default graph.

      { eg:sandro eg:endorses <g1> }
      <g1> { ... the triples I'm endorsing ... }

====

So, here we have two different semantics for TriG clearly motivated and expressed. The TriG document:

    g { s p o }

is understood in Solution A to mean (in N3):

    g log:semantics { s p o }.    # TriG "REST" semantics

but in Solution C it means (in N3):

    g owl:sameAs { s p o }.       # TriG "Equality" semantics

====

It looks like it's possible to solve all three use cases with either semantics, although it gets a bit tricky.

With TriG/REST:

UC1 -- as reported by Richard; the URL used by the crawler is the fourth column URL

UC2 -- as implemented in Sandro's semwalker code; the crawler makes a new URL in its own web space, mirrors the content there, and puts that URL in the fourth column

UC3 -- rather than endorsing an RDF Graph, I endorse a Graph Container on the condition that it never changes (or something like that -- needs to be fleshed out more).

    { eg:sandro eg:endorses <g1>.
      <g1> a rdf:StaticGraphContainer. }
    <g1> { ... the triples I'm endorsing ... }

With TriG/Equality:

UC1 -- A layer of indirection is needed, as new URIs need to be created for the different RDF Graphs.

    { <http://example.org> rdf:graphState <uuid:nnnnn> }
    <uuid:nnnnn> { ... triples fetched from example.org }

Maybe there's some clever way to do it without this, involving URL mangling or something to eliminate the second lookup.

I used uuid:nnnnn as a URI for the RDF Graph, but I could just as easily have used a hash of the graph or graph serialization. I *could* use an http URL, I think, but that's likely to lead to confusion and breakage, especially when someone gets the bright idea of changing what triples are served at that address. (I'm sure it will have seemed like a good idea at the time.)

UC2 -- Pretty straightforward, since we already have that layer of indirection in UC1. We can't quite use the r8571 example as is, because graph equality could smoosh the two retrieval operations together. So we need something like this:

    <uuid:nnnnn> { ... triples fetched in operation 8671 }
    { [ a eg:Retrieval;
        eg:gotGraph <uuid:nnnnn>;
        eg:source <http://example.org>;
        eg:date "2011-01-04T00:03:11"^^xs:dateTime;
      ]
    }

UC3 -- easy:

    { eg:sandro eg:endorses <uuid:nnnnn>. }
    <uuid:nnnnn> { ... the triples I'm endorsing ... }

Here you can see why I want a blank node as the graph label, rather than making up uuids.

Between these two, I have a preference for TriG/REST over TriG/Equality, I think. I think people are too likely to get the semantics of TriG/Equality wrong in practice. Of course, spelling out the semantics of TriG/REST will be tricky, given that it has some contextual qualities, as we've discussed.

MEANWHILE, we have a third solution, where we name the relation explicitly. This is the one I prefer.

UC1

    <http://example.org> rdf:graphState { ... triples recently fetched from there }

UC2 -- either of the styles given above, depending on whether the harvester wants to publish its copies on the web or not.

UC3

    eg:sandro eg:endorses <uuid:nnnnn>.
    <uuid:nnnnn> owl:sameAs { ... the triples I'm endorsing ... }

or, logically:

    eg:sandro eg:endorses { ... the triples I'm endorsing ... }

(Then, I would probably get rid of the curly braces around the default graph, so it becomes Turtle with Nesting.)

Do those three solution designs make sense? Any strong preferences among them? Are there more use cases that people think the group will find compelling and which cannot be solved by all three of these solutions?

(I think the next use case I'd approach would be "Tracing Inference Results", mostly because it motivates shared blank nodes. But I'm out of time for today.)

    -- Sandro
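
The difference between Solution A and Solution B can be sketched as N-Quads output from a crawler. This is only an illustrative sketch, not code from semwalker or any other crawler: the function names, the `EG` namespace IRI standing in for the `eg:` prefix, and the assumption that triple terms arrive already serialized in N-Triples syntax are all hypothetical.

```python
from typing import Iterable, List, Tuple

# A triple whose three terms are already serialized in N-Triples syntax
# (assumption made to keep the sketch short).
Triple = Tuple[str, str, str]

EG = "http://example.org/ns#"  # hypothetical IRI standing in for the "eg:" prefix
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def quads_solution_a(source_url: str, triples: Iterable[Triple]) -> List[str]:
    """Solution A: the graph label is the URL the content was fetched from."""
    return [f"{s} {p} {o} <{source_url}> ." for s, p, o in triples]


def quads_solution_b(
    retrieval_iri: str,
    source_url: str,
    fetched_at: str,
    triples: Iterable[Triple],
) -> List[str]:
    """Solution B: the graph label identifies one retrieval.

    Two extra statements in the default graph (no fourth column) connect
    the retrieval identifier to the source URL and the retrieval time.
    """
    lines = [f"{s} {p} {o} <{retrieval_iri}> ." for s, p, o in triples]
    lines.append(f"<{retrieval_iri}> <{EG}source> <{source_url}> .")
    lines.append(
        f'<{retrieval_iri}> <{EG}date> "{fetched_at}"^^<{XSD_DATETIME}> .'
    )
    return lines
```

With Solution A, a second crawl of the same URL reuses the same graph label, so consumers only ever see the latest state. With Solution B, each crawl gets its own label, so older versions can sit side by side in the same dataset.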
Received on Wednesday, 4 January 2012 18:47:18 UTC