- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Thu, 05 Jan 2012 14:49:01 +0000
- To: public-rdf-wg@w3.org
On 05/01/12 11:09, Andy Seaborne wrote:
> On 04/01/12 19:23, David Wood wrote:
>> Thanks, Sandro. That's very helpful.
>>
>> It might be useful to consider augmenting TriG syntax to support your
>> third solution (explicitly naming relations). I'd be quite happy with
>> that.
>
> What would the data model be?
>
>> We could also consider standardizing the existing TriG syntax to be a
>> syntactic shorthand for TriG REST semantics; that is, a lack of
>> explicitly declared relation infers log:semantics.
>
> I think we should not fix a semantics for undeclared relationships.
>
> Otherwise, it invalidates existing TriG documents which don't exactly
> follow the TriG/ABC definition.
>
> Ditto N-Quads - in a quadstore/database dump or extract you don't
> necessarily know the semantics.
>
> - - - - - - - - - -
>
> I find the name TriG/REST confusing because, for me, identifying the
> dereference action is modelling REST which is the other ...

the other, naming equality, style discussed.

<uuid:nnnnn> { ... triples fetched in operation 8671 }

Andy

>
> It's more like "TriG/WebCache" -- only one instance of the graph
> container's state is possible.
>
> Andy
>
>>
>> Regards,
>> Dave
>>
>>
>> On Jan 4, 2012, at 1:45 PM, Sandro Hawke <sandro@w3.org> wrote:
>>
>>> While it's fresh in my mind, let me write down the view I came to during
>>> today's telecon. (And, carry it a bit farther.) Guus, I don't know if
>>> you still want to write up your understanding of it, or if this obviates
>>> your action.
>>>
>>>
>>> * Use Case 1: (presented by cygri at 21 Dec meeting)
>>>
>>> Several systems want to use the data gathered by one RDF crawler. They
>>> don't need simultaneous access to older versions of the data.
>>>
>>> Solution A: use TriG or N-Quads with the fourth column (graph label)
>>> being the URL the content was fetched from.
>>>
>>> <http://example.org> { ...
>>> triples recently fetched from there }
>>>
>>> * Use Case 2: (brought up in questions by sandro at 21 Dec meeting)
>>>
>>> Several systems want to use the data gathered by one RDF crawler. They
>>> need simultaneous access to older versions of the data.
>>>
>>> Solution B: use TriG or N-Quads with the fourth column being some
>>> identifier created at the time the retrieval was done. Then, some other
>>> data connects that identifier with the URL the content was fetched from.
>>>
>>> <http://crawler.example.org/r8571> { ... triples fetched in retrieval
>>> 8671 }
>>> {
>>>   <http://crawler.example.org/r8571> eg:source <http://example.org>;
>>>     eg:date "2011-01-04T00:03:11"^^xs:dateTime
>>> }
>>>
>>> * Use Case 3: (suggested by sandro at 4 Jan meeting)
>>>
>>> A system wants to convey to another system in RDF that some person
>>> agrees with or disagrees with certain RDF triples.
>>>
>>> Solution C: use TriG or N-Quads with the fourth column being an
>>> identifier for an RDF Graph (g-snap), so that it can be referred to in
>>> the default graph.
>>>
>>> { eg:sandro eg:endorses <g1> }
>>> <g1> { ... the triples I'm endorsing ... }
>>>
>>> ====
>>>
>>> So, here we have two different semantics for TriG clearly motivated and
>>> expressed. The TriG document:
>>>
>>> g { s p o }
>>>
>>> is understood in Solution A to mean (in N3):
>>>
>>> g log:semantics { s p o }. # TriG "REST" semantics
>>>
>>> but in Solution C it means (in N3):
>>>
>>> g owl:sameAs { s p o }. # TriG "Equality" semantics
>>>
>>> ====
>>>
>>> It looks like it's possible to solve all three use cases with either
>>> semantics, although it gets a bit tricky.
>>>
>>> With TriG/REST:
>>>
>>> UC1 -- as reported by Richard; the URL used by the crawler is
>>> the fourth column URL
>>>
>>> UC2 -- as implemented in Sandro's semwalker code; the crawler
>>> makes a new URL in its own web space, mirrors the content there,
>>> and puts that URL in the fourth column
>>>
>>> UC3 -- rather than endorsing an RDF Graph, I endorse a Graph
>>> Container on the condition that it never changes (or something
>>> like that -- needs to be fleshed out more).
>>>
>>> { eg:sandro eg:endorses <g1>.
>>>   <g1> a rdf:StaticGraphContainer.
>>> }
>>> <g1> { ... the triples I'm endorsing ... }
>>>
>>>
>>> With TriG/Equality:
>>>
>>> UC1 -- A layer of indirection is needed, as new URIs need to be
>>> created for the different RDF Graphs.
>>>
>>> { <http://example.org> rdf:graphState <uuid:nnnnn> }
>>> uuid:nnnnn { ... triples fetched from example.org }
>>>
>>> Maybe there's some clever way to do it without this, involving
>>> URL mangling or something to eliminate the second lookup.
>>>
>>> I used uuid:nnnnn as a URI for the RDF Graph, but I could just
>>> as easily have used a hash of the graph or graph serialization.
>>> I *could* use an http URL, I think, but that's likely to lead to
>>> confusion and breakage, especially when someone gets the bright
>>> idea of changing what triples are served at that address. (I'm
>>> sure it will have seemed like a good idea at the time.)
>>>
>>> UC2 -- Pretty straightforward, since we already have that layer of
>>> indirection in UC1. We can't quite use the r8571 example as is,
>>> because graph
>>> equality could smoosh the two retrieval operations together. So we
>>> need something like this:
>>>
>>> <uuid:nnnnn> { ... triples fetched in operation 8671 }
>>> {
>>>   [ a eg:Retrieval;
>>>     eg:gotGraph <uuid:nnnnn>;
>>>     eg:source <http://example.org>;
>>>     eg:date "2011-01-04T00:03:11"^^xs:dateTime;
>>>   ]
>>> }
>>>
>>> UC3 -- easy:
>>>
>>> { eg:sandro eg:endorses <uuid:nnnnn>. }
>>> <uuid:nnnnn> { ...
>>> the triples I'm endorsing ... }
>>>
>>> Here you can see why I want a blank node as the graph
>>> label, rather than making up uuids.
>>>
>>> Between these two, I have a preference for TriG/REST over
>>> TriG/Equality, I think. I think people are too likely to get the
>>> semantics of TriG/Equality wrong in practice. Of course, spelling out
>>> the semantics of TriG/REST will be tricky given it has some
>>> contextual qualities, as we've discussed.
>>>
>>> MEANWHILE, we have a third solution, where we name the relation
>>> explicitly. This is the one I prefer.
>>>
>>> UC1
>>>
>>> <http://example.org> rdf:graphState { ... triples recently fetched
>>> from there }
>>>
>>> UC2 -- either of the styles given above, depending whether the
>>> harvester wants to publish its copies on the web or not.
>>>
>>> UC3
>>>
>>> eg:sandro eg:endorses <uuid:nnnnn>.
>>> <uuid:nnnnn> owl:sameAs { ... the triples I'm endorsing ... }
>>>
>>> or, logically:
>>>
>>> eg:sandro eg:endorses { ... the triples I'm endorsing ... }
>>>
>>> (Then, I would probably get rid of the curly braces around the default
>>> graph, so it becomes Turtle with Nesting.)
>>>
>>> Do those three solution designs make sense? Any strong preferences
>>> among them? Are there more use cases that people think the group will
>>> find compelling and which cannot be solved by all three of these
>>> solutions? (I think the next use case I'd approach would be "Tracing
>>> Inference Results", mostly because it motivates shared blank nodes.
>>> But I'm out of time for today.)
>>>
>>> -- Sandro
>>>
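
The Solution B layout quoted above (a per-retrieval graph label, plus
default-graph statements tying that label to the source URL) can be
sketched in a few lines of Python. This is only an illustrative sketch,
not something proposed in the thread: the tuple representation of quads,
the helper names snapshots_for and latest_snapshot, the ex: triples, and
the second retrieval r8572 are all assumptions made up for the example.
Only eg:source and eg:date come from the quoted examples.

```python
from collections import defaultdict

# A quad is (subject, predicate, object, graph_label).
# graph_label None stands for the default graph (assumption of this sketch).
EG_SOURCE = "eg:source"  # property names as used in the quoted examples
EG_DATE = "eg:date"

R1 = "http://crawler.example.org/r8571"
R2 = "http://crawler.example.org/r8572"  # hypothetical second retrieval

QUADS = [
    ("ex:a", "ex:p", "ex:old", R1),
    ("ex:a", "ex:p", "ex:new", R2),
    (R1, EG_SOURCE, "http://example.org", None),
    (R1, EG_DATE, "2011-01-04T00:03:11", None),
    (R2, EG_SOURCE, "http://example.org", None),
    (R2, EG_DATE, "2011-01-05T00:03:11", None),
]


def snapshots_for(quads, source_url):
    """Map each retrieval id for source_url to (date, set of triples)."""
    meta = defaultdict(dict)    # retrieval id -> {property: value}
    graphs = defaultdict(set)   # graph label -> triples in that graph
    for s, p, o, g in quads:
        if g is None:
            if p in (EG_SOURCE, EG_DATE):
                meta[s][p] = o
        else:
            graphs[g].add((s, p, o))
    return {
        rid: (props.get(EG_DATE), graphs.get(rid, set()))
        for rid, props in meta.items()
        if props.get(EG_SOURCE) == source_url
    }


def latest_snapshot(quads, source_url):
    """Return (retrieval id, triples) for the most recent retrieval, or None."""
    snaps = snapshots_for(quads, source_url)
    if not snaps:
        return None
    # ISO 8601 timestamps in the same format sort correctly as strings.
    rid = max(snaps, key=lambda r: snaps[r][0] or "")
    return rid, snaps[rid][1]


if __name__ == "__main__":
    print(latest_snapshot(QUADS, "http://example.org"))
```

Because every retrieval has its own label, older snapshots stay available
next to the newest one, which is the point of Use Case 2. Under Solution A
the source URL itself would be the graph label, so a new fetch would
replace the old triples rather than sit beside them.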
Received on Thursday, 5 January 2012 14:49:38 UTC