Re: defn of Named Graph from Sandro Hawke on 2013-09-19 (www-archive@w3.org from September 2013)

From: Sandro Hawke <sandro@w3.org>
Date: Thu, 19 Sep 2013 12:52:47 -0400
To: Dan Brickley <danbri@danbri.org>, Jeremy J Carroll <jjc@syapse.com>
CC: Gregg Reynolds <dev@mobileink.com>, Pat Hayes <phayes@ihmc.us>, www-archive <www-archive@w3.org>
Message-ID: <523B2BDF.30302@w3.org>
On 09/19/2013 04:01 AM, Dan Brickley wrote:
>
>
>
> On 18 September 2013 19:33, Jeremy J Carroll <jjc@syapse.com> wrote:
>
>
>     Something of an aside …
>
>     On Sep 18, 2013, at 1:29 AM, Gregg Reynolds <dev@mobileink.com> wrote:
>
>>     The suggestion that a pair of mathematical entities with exactly
>>     the same extension are not equal doesn't help - it reads like an
>>     attempt to redefine mathematics. 
>
>
>     Gregg
>
>     I think you misunderstand mathematics ...
>
>     I attach two pictures.
>
>     The first is my copy of Jones' copy of a diagram in a book in the
>     Vatican Library, which is a tenth-century, maybe fifth-generation,
>     copy of a diagram drawn by Pappus of Alexandria in the 4th
>     century, which may in turn have been an (n-th generation) copy of
>     a diagram drawn by Euclid a few hundred years earlier.
>     The copy in the Vatican Library has, according to Jones, got a
>     mistake in it, which he corrected, assuming it to be a copyist's
>     error and not an error of Pappus or Euclid.
>
>     All these copies will have minor variations .. such as angles and
>     distances and sizes being slightly different [...]
>
>
> Thanks - you confirm a hunch I had earlier in this thread. I started a 
> mail but couldn't find a way to make it clear: these distinctions 
> we're drawing around graphs echo very similar concerns people have for 
> bibliographic modeling of intellectual works, their literary 
> expressions, manifestations and tangible representation as items - to 
> use the FRBR terminology. This is not to suggest for a moment that 
> FRBR is the best conceptualization of graph change, state, and 
> versioning; only that perhaps it might help to see this as not a 
> distinctively RDF-oriented problem.
>
> http://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records 
>
>

Yeah, I think that observation was probably made before about g-snaps, 
g-boxes, and g-texts, but it was forgotten.

It's nice to hear, in a way, because it excuses our difficulty in 
solving this problem.  We might say that httpRange-14 and dataset 
semantics are FRBR-complete or FRBR-hard problems: that is, if you 
could solve this, you could make a decent FRBR solution, and/or if there 
were a decent FRBR solution, you could solve this.

My hope is that we could solve this more easily than FRBR, since we have 
running code.

But it's not clear the running code matters very much, alas, for these 
use cases.

In my proposal starting this thread, I handwaved over why you might ever 
want to have metadata about some collection of triples which is 
independent of the triples in it.   For instance, in my head, I have 
no idea whether dc:creator, cc:license, dc:date, etc, make more sense 
applied to the g-box or the g-snap.   (I hope you'll tolerate my using 
those place-holder terms for clarity.)      Until people need to change 
the triples, though, I can't really tell.

I mean, if I copyright a certain 5000 character string, and you 
independently copyright the same 5000 character string, how does the law 
deal with that?  I guess it doesn't care that in a sense they are 
different strings -- the law probably treats them as the same, and the 
chronologically second one is infringing.    So for copyright law, 
cc:license and dc:creator and dc:date are properties of a g-text.   That 
might extend to the g-snap that is essentially the parse-tree of that 
g-text.   It doesn't matter which g-boxes might be involved.

I tried to skirt that using the idea of coincidentally having the same 
triples, but that's maybe too theoretical and unlikely to be helpful.

So, I hereby propose we give up on all this until after we solve the 
change-over-time problem for RDF.    I'm happy for us to talk that out 
amongst ourselves, or to do it in a community group, or...  I dunno.  
But obviously it's not an RDF WG thing.

As a first draft, I might state that problem as:

    Sometimes people write context-sensitive RDF like { :Alice :age 10
    }, instead of decontextualized RDF like { :Alice :born 1852 }.  It
    would be helpful to have a standard way to indicate and reason about
    the intended context of context-sensitive graphs.   (In this case,
    the context of the first graph would have to be 1862 for both graphs
    to be true, give or take time-of-year factors.)

    Meanwhile, even RDF which is not inherently context-sensitive (unlike
    the above graph using the :age predicate) often turns out to be
    context-sensitive because it conveys something about the state of
    the world, and the state of the world sometimes changes. For
    instance, a foaf:name triple might turn out to be true for only
    certain years, if the subject changes their name.   And a foaf:mbox
    triple is true only when the subject has the given email address.

    Finally, even when an RDF graph contains information that in theory
    never changes, like birth dates or molecular weights of chemicals,
    in practice it might change because of errors being corrected or the
    truth becoming known with more precision.     For example, with a
    little historical research we might learn that the girl who inspired
    Alice in Wonderland was 10 in 1862, and put that in an RDF Graph.  
    With more research, we might discover her actual birthdate was 4
    May 1852, and update our RDF database accordingly.

    Aside from these issues of change-over-time, spatial context might
    turn out to be important to track.  Do people want to write graphs
    like { :SanFrancisco a :NearbyCity }, which are true only for an
    observer near San Francisco?

    And, of course, it is vital when gathering RDF data from many
    sources to establish and reason about the trustworthiness of each
    source.

    The challenge here is to provide a general model for how RDF data
    can be managed coming from multiple different sources, with
    different contexts and trustworthiness.  Further, we should, if
    necessary, define vocabulary terms and other mechanisms to improve
    interoperability and functionality of general RDF data exchange.
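
The first case in the statement above (an :age triple that is only true in some year, next to a :born triple that is always true) can be sketched in a few lines of Python. This is only an illustration, not a proposed mechanism: the `ContextualAge` class and the `possible_birth_years` and `consistent` helpers are hypothetical names, and the context is assumed to be a bare calendar year.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextualAge:
    """An :age statement plus the year in which it is asserted to hold."""
    subject: str
    age: int
    context_year: int  # assumption: context is a whole calendar year


def possible_birth_years(stmt):
    # Someone who is `age` at some point in `context_year` was born in
    # context_year - age (birthday already passed that year) or in
    # context_year - age - 1 (birthday still to come that year).
    latest = stmt.context_year - stmt.age
    return {latest - 1, latest}


def consistent(stmt, born_year):
    """True if the contextual :age can hold together with a :born year."""
    return born_year in possible_birth_years(stmt)


alice = ContextualAge(":Alice", age=10, context_year=1862)
print(consistent(alice, 1852))  # the two graphs from the example
print(consistent(alice, 1850))  # a birth year that cannot match
```

The two-year window in `possible_birth_years` is the "give or take time-of-year factors" caveat made explicit.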

Now, of course, I'm thinking about the Dilbert Problem [1] (and [2]).   
My solution would be something like this:

    GRAPH :2011q1 {
    <http://example.com/e-1> <http://example.com/hasCubicle>
    <http://example.com/c-1000> .
    <http://example.com/e-2> <http://example.com/hasCubicle>
    <http://example.com/c-1001> .
    <http://example.com/e-3> <http://example.com/hasCubicle>
    <http://example.com/c-1002> .
    }
    GRAPH :2011q2 {
    <http://example.com/e-1>
    <http://example.com/hasCubicle> <http://example.com/c-1001> .
    <http://example.com/e-2>
    <http://example.com/hasCubicle> <http://example.com/c-1000> .
    <http://example.com/e-3>
    <http://example.com/hasCubicle> <http://example.com/c-1002> .
    }
    :2011q1 dc:temporal [ :begins "2011-01-01"^^xsd:date ;
                          :ends   "2011-03-20"^^xsd:date ] .
    :2011q2 dc:temporal [ :begins "2011-03-20"^^xsd:date ] .
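
To show how a consumer might use such temporal annotations, here is a minimal Python sketch that mirrors the example above with plain dictionaries and sets instead of an RDF library. The helper names `graphs_at` and `cubicle_of` are hypothetical. It also assumes each interval is half-open (the :begins date is included, the :ends date is not), so that 2011-03-20 falls only in :2011q2; the example itself does not say which reading is intended.

```python
from datetime import date

EX = "http://example.com/"

# Named graphs from the example, as sets of (subject, predicate, object).
graphs = {
    ":2011q1": {
        (EX + "e-1", EX + "hasCubicle", EX + "c-1000"),
        (EX + "e-2", EX + "hasCubicle", EX + "c-1001"),
        (EX + "e-3", EX + "hasCubicle", EX + "c-1002"),
    },
    ":2011q2": {
        (EX + "e-1", EX + "hasCubicle", EX + "c-1001"),
        (EX + "e-2", EX + "hasCubicle", EX + "c-1000"),
        (EX + "e-3", EX + "hasCubicle", EX + "c-1002"),
    },
}

# dc:temporal intervals as (begins, ends); None means no :ends was given.
# Assumption: intervals are half-open, [begins, ends).
temporal = {
    ":2011q1": (date(2011, 1, 1), date(2011, 3, 20)),
    ":2011q2": (date(2011, 3, 20), None),
}


def graphs_at(when):
    """Names of the graphs whose temporal context covers `when`."""
    return [
        name
        for name, (begins, ends) in temporal.items()
        if begins <= when and (ends is None or when < ends)
    ]


def cubicle_of(employee, when):
    """Cubicles asserted for `employee` in the graphs valid at `when`."""
    return {
        o
        for name in graphs_at(when)
        for (s, p, o) in graphs[name]
        if s == employee and p == EX + "hasCubicle"
    }


print(cubicle_of(EX + "e-1", date(2011, 2, 1)))
print(cubicle_of(EX + "e-1", date(2011, 3, 20)))
```

A date before 2011-01-01 matches no graph, so the lookup returns an empty set rather than guessing.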


I'm wondering a little about making a Community Group for this.

More immediately, I'm wondering what the RDF WG is supposed to do about 
all this, and what I'll be telling the Director about Jeremy's comment 
at the next Transition Meeting.

      -- Sandro

[1] http://danbri.org/words/2011/11/03/753
[2] http://lists.w3.org/Archives/Public/public-rdf-wg/2011Nov/0019.html


> Dan
>
Received on Thursday, 19 September 2013 16:53:00 UTC