Re: Three solution designs to the first three Graphs use cases from David Wood on 2012-01-04 (public-rdf-wg@w3.org from January 2012)

From: David Wood <david@3roundstones.com>
Date: Wed, 4 Jan 2012 14:23:31 -0500
To: Sandro Hawke <sandro@w3.org>
Cc: public-rdf-wg <public-rdf-wg@w3.org>
Message-Id: <177B89C2-3542-4D17-9014-4787F9736671@3roundstones.com>

Thanks, Sandro.  That's very helpful.

It might be useful to consider augmenting TriG syntax to support your third solution (explicitly naming relations). I'd be quite happy with that.

We could also consider standardizing the existing TriG syntax as a syntactic shorthand for TriG REST semantics; that is, the lack of an explicitly declared relation would imply log:semantics.
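
A rough sketch of that shorthand idea, in Python rather than any TriG tooling: a block with no declared relation is read as if log:semantics had been written, while an explicitly named relation is kept as is. The GraphStatement class, the desugar and to_n3 helpers, and the DEFAULT_RELATION constant are hypothetical names made up for illustration; nothing here is existing TriG syntax or an agreed design.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Triple = Tuple[str, str, str]

# Assumption: a bare "g { ... }" block defaults to the REST reading.
DEFAULT_RELATION = "log:semantics"


@dataclass(frozen=True)
class GraphStatement:
    """One labelled graph block; relation=None models today's bare TriG block."""
    label: str
    triples: Tuple[Triple, ...]
    relation: Optional[str] = None


def desugar(stmt: GraphStatement) -> GraphStatement:
    """Return a statement with an explicit relation, filling in the default."""
    if stmt.relation is not None:
        return stmt  # explicitly named relation: leave untouched
    return GraphStatement(stmt.label, stmt.triples, DEFAULT_RELATION)


def to_n3(stmt: GraphStatement) -> str:
    """Render the desugared statement in an N3-like form."""
    explicit = desugar(stmt)
    body = " . ".join(" ".join(t) for t in explicit.triples)
    return f"{explicit.label} {explicit.relation} {{ {body} }} ."
```

Under these assumptions, to_n3(GraphStatement("g", (("s", "p", "o"),))) gives the REST reading, and passing "owl:sameAs" as the relation gives the Equality reading from the same input.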

Regards,
Dave


On Jan 4, 2012, at 1:45 PM, Sandro Hawke <sandro@w3.org> wrote:

> While it's fresh in my mind, let me write down the view I came to during
> today's telecon.   (And, carry it a bit farther.)  Guus, I don't know if
> you still want to write up your understanding of it, or if this obviates
> your action.
> 
> 
> * Use Case 1:   (presented by cygri at 21 Dec meeting)
> 
> Several systems want to use the data gathered by one RDF crawler.  They
> don't need simultaneous access to older versions of the data.
> 
> Solution A: use TriG or N-Quads with the fourth column (graph label)
> being the URL the content was fetched from.
> 
>        <http://example.org> { ... triples recently fetched from there }
> 
> * Use Case 2:   (brought up in questions by sandro at 21 Dec meeting)
> 
> Several systems want to use the data gathered by one RDF crawler.  They
> need simultaneous access to older versions of the data.
> 
> Solution B: use TriG or N-Quads with the fourth column being some
> identifier created at the time the retrieval was done.  Then, some other
> data connects that identifier with the URL the content was fetched from.
> 
>        <http://crawler.example.org/r8571> { ... triples fetched in retrieval 8571 }
>        { 
>           <http://crawler.example.org/r8571> eg:source <http://example.org>;
>                                              eg:date "2011-01-04T00:03:11"^^xs:dateTime
>        }
> 
> * Use Case 3:   (suggested by sandro at 4 Jan meeting)
> 
> A system wants to convey to another system in RDF that some person
> agrees with or disagrees with certain RDF triples.
> 
> Solution C: use TriG or N-Quads with the fourth column being an
> identifier for an RDF Graph (g-snap), so that it can be referred to in
> the default graph.   
> 
>        { eg:sandro eg:endorses <g1> }
>        <g1> { ... the triples I'm endorsing ... }
> 
> ====
> 
> So, here we have two different semantics for TriG clearly motivated and
> expressed.  The TriG document:
> 
>        g { s p o }
> 
> is understood in Solution A to mean (in N3):
> 
>        g log:semantics { s p o }.       # TriG "REST" semantics
> 
> but in Solution C it means (in N3):
> 
>        g owl:sameAs { s p o }.          # TriG "Equality" semantics
> 
> ====
> 
> It looks like it's possible to solve all three use cases with either
> semantics, although it gets a bit tricky.
> 
> With TriG/REST:
> 
>        UC1 -- as reported by Richard; the URL used by the crawler is
>        the fourth column URL
> 
>        UC2 -- as implemented in Sandro's semwalker code; the crawler
>        makes a new URL in its own web space, mirrors the content there,
>        and puts that URL in the fourth column
> 
>        UC3 -- rather than endorsing an RDF Graph, I endorse a Graph
>        Container on the condition that it never changes (or something
>        like that -- needs to be fleshed out more).
> 
>                { eg:sandro eg:endorses <g1>.
>                  <g1> a rdf:StaticGraphContainer.
>                }
>                <g1> { ... the triples I'm endorsing ... }
> 
> 
> With TriG/Equality:
> 
>        UC1 -- A layer of indirection is needed, as new URIs need to be
>        created for the different RDF Graphs.
> 
>                { <http://example.org> rdf:graphState <uuid:nnnnn> }
>                <uuid:nnnnn> { ... triples fetched from example.org }
> 
>        Maybe there's some clever way to do it without this, involving
>        URL mangling or something to eliminate the second lookup.
> 
>        I used uuid:nnnnn as a URI for the RDF Graph, but I could just
>        as easily have used a hash of the graph or graph serialization.
>        I *could* use an http URL, I think, but that's likely to lead to
>        confusion and breakage, especially when someone gets the bright
>        idea of changing what triples are served at that address.  (I'm
>        sure it will have seemed like a good idea at the time.)
> 
>        UC2 -- Pretty straightforward, since we already have that layer of
>        indirection in UC1.  We can't quite use the r8571 example as is, because graph
>        equality could smoosh the two retrieval operations together.  So we
>        need something like this:
> 
>                <uuid:nnnnn> { ... triples fetched in operation 8571 }
>                {
>                  [ a eg:Retrieval;
>                    eg:gotGraph <uuid:nnnnn>;
>                    eg:source <http://example.org>;
>                    eg:date "2011-01-04T00:03:11"^^xs:dateTime;
>                  ]
>                }
> 
>        UC3 -- easy:
> 
>                { eg:sandro eg:endorses <uuid:nnnnn>. }
>                <uuid:nnnnn> { ... the triples I'm endorsing ... }
> 
>                Here you can see why I want a blank node as the graph
>                label, rather than making up uuids.
> 
> Between these two, I have a preference for TriG/REST over
> TriG/Equality, I think.   I think people are too likely to get the
> semantics of TriG/Equality wrong in practice.  Of course, spelling out
> the semantics of TriG/REST will be tricky, given that it has some
> contextual qualities, as we've discussed.
> 
> MEANWHILE, we have a third solution, where we name the relation
> explicitly.   This is the one I prefer.   
> 
>        UC1 
> 
>                <http://example.org> rdf:graphState { ... triples recently fetched from there }
> 
>        UC2 -- either of the styles given above, depending on whether the
>        harvester wants to publish its copies on the web or not.
> 
>        UC3 
> 
>                eg:sandro eg:endorses <uuid:nnnnn>.
>                <uuid:nnnnn> owl:sameAs { ... the triples I'm endorsing ... }
> 
>                or, logically:
> 
>                eg:sandro eg:endorses { ... the triples I'm endorsing ... }
> 
>                (Then, I would probably get rid of the curly braces around the default
>                graph, so it becomes Turtle with Nesting.)
> 
> Do those three solution designs make sense?   Any strong preferences
> among them?  Are there more use cases that people think the group will
> find compelling and which cannot be solved by all three of these
> solutions?  (I think the next use case I'd approach would be "Tracing
> Inference Results", mostly because it motivates shared blank nodes. 
> But I'm out of time for today.)
> 
>    -- Sandro
> 
>
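
Sandro's UC2 point under TriG/Equality, that graph equality could merge two retrieval operations, can be illustrated with a small Python sketch. It assumes ground triples only (no blank nodes), uses a content hash as the graph identifier in the spirit of the "hash of the graph" remark, and keeps one record per retrieval. The graph_id and record_retrieval helpers and the urn:example:sha256: prefix are hypothetical; this is not how any crawler mentioned above is known to work.

```python
import hashlib
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def graph_id(triples: Iterable[Triple]) -> str:
    """Content-derived identifier: equal sets of ground triples get equal ids."""
    lines = sorted(" ".join(t) + " ." for t in set(triples))
    digest = hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()
    return "urn:example:sha256:" + digest  # hypothetical identifier scheme


def record_retrieval(log: List[Dict[str, str]], source: str, date: str,
                     triples: Iterable[Triple]) -> str:
    """Append one retrieval record (the eg:Retrieval node) and return the graph id."""
    gid = graph_id(triples)
    log.append({"gotGraph": gid, "source": source, "date": date})
    return gid


# Two fetches of unchanged content: one graph, but two distinct retrievals.
log: List[Dict[str, str]] = []
page = [("eg:a", "eg:b", "eg:c")]
g1 = record_retrieval(log, "http://example.org", "2011-01-04T00:03:11", page)
g2 = record_retrieval(log, "http://example.org", "2011-01-05T00:03:11", page)
```

Here g1 and g2 are the same identifier, so source and date attached directly to the graph would be conflated; hanging them off a separate record per retrieval keeps the two operations apart.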
Received on Wednesday, 4 January 2012 19:26:43 UTC