- From: Pierre-Antoine Champin <pierre-antoine.champin@liris.cnrs.fr>
- Date: Thu, 05 Jan 2012 17:36:03 +0100
- To: Guus Schreiber <guus.schreiber@vu.nl>
- CC: Sandro Hawke <sandro@w3.org>, public-rdf-wg <public-rdf-wg@w3.org>
On 01/05/2012 03:13 PM, Guus Schreiber wrote:
> On 04-01-2012 19:45, Sandro Hawke wrote:
>> While it's fresh in my mind, let me write down the view I came to during
>> today's telecon. (And, carry it a bit farther.) Guus, I don't know if
>> you still want to write up your understanding of it, or if this obviates
>> your action.
>
> Sandro,
>
> Very clear, thanks for this (and I indeed see no more need for my action).
>
> I understand why you like the third solution. However, it means we have to
> come up with a name-to-graph relation vocabulary.
Why so? We can suggest some useful predicates, as Sandro did in his
example, but in the end, people will use whichever predicate they want.
> I don't think we're in
> a position to standardize that. Also, it means we have to introduce
> new syntax and thus invalidate/deprecate/mark as archaic/... the
> current quad stores out there.
Not if we state how they relate to the new syntax (as syntactic sugar,
as suggested by Sandro).
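To make the "syntactic sugar" reading concrete, here is a rough Python sketch (mine, not taken from Sandro's message). Each quad (s, p, o, g) is grouped under its label g and rewritten as an explicit statement relating the label to the graph. The function name desugar and the default predicate are assumptions for illustration only; rdf:graphState is borrowed from Sandro's UC1 example, and nothing here is meant to be normative.

```python
from collections import defaultdict

# Hypothetical default; the thread does not settle on a predicate.
DEFAULT_RELATION = "rdf:graphState"


def desugar(quads, relation=DEFAULT_RELATION):
    """Rewrite (s, p, o, g) quads as explicit (label, relation, graph)
    statements, where graph is a frozenset of (s, p, o) triples."""
    graphs = defaultdict(set)
    for s, p, o, g in quads:
        graphs[g].add((s, p, o))
    # One statement per graph label, in sorted label order.
    return [(g, relation, frozenset(graphs[g])) for g in sorted(graphs)]


quads = [
    ("eg:a", "eg:p", "eg:b", "<http://example.org>"),
    ("eg:b", "eg:p", "eg:c", "<http://example.org>"),
]
statements = desugar(quads)
```

Under this reading an existing quad store stays valid as is; only the relation passed to desugar would need to be agreed on, or left to whoever publishes the data.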
> That would not be in the spirit of our
> charter.
>
> The first solution (your TriG/REST) has the advantage of being the most
> conservative extension, i.e. providing a hook for explicating
> name-to-graph semantics without enforcing it.
I sympathize with Andy's concern:
> I think we should not fix a semantics for undeclared relationships.
> Otherwise, it invalidates existing TriG documents which don't exactly
> follow the TriG/ABC definition.
and TriG/REST does fix a semantics.
I think the only way out would be to define a very loose predicate
(rdf:hasRelatedGraph?). This makes it impossible to infer much from any
TriG file, but at least it does not break any of them.
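A small Python sketch of what I have in mind, under one loud assumption: that the two readings from Sandro's message (log:semantics for TriG/REST, owl:sameAs for TriG/Equality) can both be treated as specializations of the loose predicate. Nothing in this thread defines such a hierarchy; the table and the weaken function are hypothetical.

```python
# Assumption: both readings discussed in the thread specialize the loose
# predicate. This hierarchy is illustrative, not defined anywhere.
LOOSE = "rdf:hasRelatedGraph"
SUBPROPERTY_OF = {
    "log:semantics": LOOSE,  # TriG/REST reading
    "owl:sameAs": LOOSE,     # TriG/Equality reading
}


def weaken(statement):
    """Map a (label, relation, graph) statement to the loose reading."""
    label, relation, graph = statement
    if relation == LOOSE or SUBPROPERTY_OF.get(relation) == LOOSE:
        return (label, LOOSE, graph)
    raise ValueError(f"unknown relation: {relation}")


graph = frozenset({("s", "p", "o")})
rest = ("g", "log:semantics", graph)
equality = ("g", "owl:sameAs", graph)
```

Both weaken(rest) and weaken(equality) give the same loose statement, which is the point: a document written with either intent remains true under the loose reading, at the price of telling the consumer very little.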
pa
> We can write a
> non-normative section/appendix/note with suggested practices for
> vocabulary to be used there.
>
> If we can get consensus on (some variant of) the first solution, I see
> us moving on quickly. The path forward would be to continue writing down
> further use-case examples.
> I have asked Antoine Isaac to do this for the Europeana data model [1].
>
> Guus
>
> [1]
> http://www.europeana-libraries.eu/web/europeana-project/technicaldocuments/
>
>>
>>
>> * Use Case 1: (presented by cygri at 21 Dec meeting)
>>
>> Several systems want to use the data gathered by one RDF crawler. They
>> don't need simultaneous access to older versions of the data.
>>
>> Solution A: use TriG or N-Quads with the fourth column (graph label)
>> being the URL the content was fetched from.
>>
>> <http://example.org> { ... triples recently fetched from there }
>>
>> * Use Case 2: (brought up in questions by sandro at 21 Dec meeting)
>>
>> Several systems want to use the data gathered by one RDF crawler. They
>> need simultaneous access to older versions of the data.
>>
>> Solution B: use TriG or N-Quads with the fourth column being some
>> identifier created at the time the retrieval was done. Then, some other
>> data connects that identifier with the URL the content was fetched from.
>>
>> <http://crawler.example.org/r8571> { ... triples fetched in retrieval 8571 }
>> {
>> <http://crawler.example.org/r8571> eg:source <http://example.org>;
>> eg:date "2011-01-04T00:03:11"^^xs:dateTime
>> }
>>
>> * Use Case 3: (suggested by sandro at 4 Jan meeting)
>>
>> A system wants to convey to another system in RDF that some person
>> agrees with or disagrees with certain RDF triples.
>>
>> Solution C: use TriG or N-Quads with the fourth column being an
>> identifier for an RDF Graph (g-snap), so that it can be referred to in
>> the default graph.
>>
>> { eg:sandro eg:endorses <g1> }
>> <g1> { ... the triples I'm endorsing ... }
>>
>> ====
>>
>> So, here we have two different semantics for TriG clearly motivated and
>> expressed. The TriG document:
>>
>> g { s p o }
>>
>> is understood in Solution A to mean (in N3):
>>
>> g log:semantics { s p o }. # TriG "REST" semantics
>>
>> but in Solution C it means (in N3):
>>
>> g owl:sameAs { s p o }. # TriG "Equality" semantics
>>
>> ====
>>
>> It looks like it's possible to solve all three use cases with either
>> semantics, although it gets a bit tricky.
>>
>> With TriG/REST:
>>
>> UC1 -- as reported by Richard; the URL used by the crawler is
>> the fourth column URL
>>
>> UC2 -- as implemented in Sandro's semwalker code; the crawler
>> makes a new URL in its own web space, mirrors the content there,
>> and puts that URL in the fourth column
>>
>> UC3 -- rather than endorsing an RDF Graph, I endorse a Graph
>> Container on the condition that it never changes (or something
>> like that -- needs to be fleshed out more).
>>
>> { eg:sandro eg:endorses <g1>.
>> <g1> a rdf:StaticGraphContainer.
>> }
>> <g1> { ... the triples I'm endorsing ... }
>>
>>
>> With TriG/Equality:
>>
>> UC1 -- A layer of indirection is needed, as new URIs need to be
>> created for the different RDF Graphs.
>>
>> { <http://example.org> rdf:graphState <uuid:nnnnn> }
>> <uuid:nnnnn> { ... triples fetched from example.org }
>>
>> Maybe there's some clever way to do it without this, involving
>> URL mangling or something to eliminate the second lookup.
>>
>> I used uuid:nnnnn as a URI for the RDF Graph, but I could just
>> as easily have used a hash of the graph or graph serialization.
>> I *could* use an http URL, I think, but that's likely to lead to
>> confusion and breakage, especially when someone gets the bright
>> idea of changing what triples are served at that address. (I'm
>> sure it will have seemed like a good idea at the time.)
>>
>> UC2 -- Pretty straightforward, since we already have that layer of
>> indirection in UC1. We can't quite use the r8571 example as is, because graph
>> equality could smoosh the two retrieval operations together. So we
>> need something like this:
>>
>> <uuid:nnnnn> { ... triples fetched in operation 8571 }
>> {
>> [ a eg:Retrieval;
>> eg:gotGraph <uuid:nnnnn>;
>> eg:source <http://example.org>;
>> eg:date "2011-01-04T00:03:11"^^xs:dateTime;
>> ]
>> }
>>
>> UC3 -- easy:
>>
>> { eg:sandro eg:endorses <uuid:nnnnn>. }
>> <uuid:nnnnn> { ... the triples I'm endorsing ... }
>>
>> Here you can see why I want a blank node as the graph
>> label, rather than making up uuids.
>>
>> Between these two, I have a preference for TriG/REST over
>> TriG/Equality, I think. I think people are too likely to get the
>> semantics of TriG/Equality wrong in practice. Of course, spelling out
>> the semantics of TriG/REST will be tricky given it has some
>> contextual qualities, as we've discussed.
>>
>> MEANWHILE, we have a third solution, where we name the relation
>> explicitly. This is the one I prefer.
>>
>> UC1
>>
>> <http://example.org> rdf:graphState { ... triples recently fetched from there }
>>
>> UC2 -- either of the styles given above, depending whether the
>> harvester wants to publish its copies on the web or not.
>>
>> UC3
>>
>> eg:sandro eg:endorses <uuid:nnnnn>.
>> <uuid:nnnnn> owl:sameAs { ... the triples I'm endorsing ... }
>>
>> or, logically:
>>
>> eg:sandro eg:endorses { ... the triples I'm endorsing ... }
>>
>> (Then, I would probably get rid of the curly braces around the default
>> graph, so it becomes Turtle with Nesting.)
>>
>> Do those three solution designs make sense? Any strong preferences
>> among them? Are there more use cases that people think the group will
>> find compelling and which cannot be solved by all three of these
>> solutions? (I think the next use case I'd approach would be "Tracing
>> Inference Results", mostly because it motivates shared blank nodes.
>> But I'm out of time for today.)
>>
>> -- Sandro
>>
>
Received on Thursday, 5 January 2012 16:39:13 UTC