What we mean by "graph" / Named Graphs in SD

One of the downsides of trying to do a lot is that we make mistakes.
Or, at least, I do...

On the Service Description issue, I made the mistake of thinking a
NamedGraph was a Graph.   

Andy made a comment that led me to actually look in Carroll et al 2005
and see that, no, formally speaking "Named Graphs" would be better
called "graph namings".  They are disjoint from RDF graphs; they consist
of at least a "name" and an "rdfgraph".   So when I say it's crazy that
named graphs are named with themselves -- the craziness really rests in
Carroll et al calling something a named graph when it is not a graph.
The NamedGraph may well have a name, but that's distinct from its
"name", which is what it associates with its "rdfgraph".   (Arg!)
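
As a rough sketch of that reading (Python; the class and field names
are illustrative only and are not taken from Carroll et al or the SD
draft), a "named graph" is a pairing of a name with an RDF graph,
rather than a graph itself:

```python
from typing import FrozenSet, NamedTuple, Tuple

# Assumption: a triple is modelled as a plain (subject, predicate, object) tuple.
Triple = Tuple[str, str, str]


class NamedGraph(NamedTuple):
    """Hypothetical model of a 'graph naming': a name paired with an RDF graph."""
    name: str                     # the IRI this pairing associates with the graph
    rdfgraph: FrozenSet[Triple]   # the RDF graph itself: a set of triples


g = frozenset({("ex:a", "ex:b", "ex:c")})
ng = NamedGraph(name="http://example.org/g1", rdfgraph=g)

# The pairing carries a graph, but it is not a set of triples itself.
pairing_is_a_graph = isinstance(ng, frozenset)
```

Here `ng.rdfgraph` is a graph and `ng` is not, which is the
disjointness described above.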

I wonder if there's a way out of this mess....    I guess we can at
least clarify it in the SD document, with some warnings.   Anyone
interested in us getting away from the misleading term "NamedGraph"?
It may seem like it's entrenched in SPARQL 1.0, but this is just
editorial, as long as it keeps the GRAPH and FROM NAMED keywords, which
actually seem fine to me.   (I expect we'll have a new RDF Core WG
working on this issue in a few months; that's when this will really get
entrenched.  I expect to strongly oppose this misleading use of the term
"named graph" during that work, if necessary.)

Aside from that, it would help a little to change sd:name to
sd:graphName, but it should still be a string if it's like that.
(Maybe we should add RIF's pred:iri-string as a builtin, to address
Andy's use case.  Remind me why RIF and SPARQL have their own builtins
and you can't just borrow from the other?)   Alternatively just use the
name as the URI label on the graph node; that seems fine...

Hmmm.

Meanwhile, I've been meaning to send a question about our use of the
term "Graph", which is connected here.

It seems to me there are two different common meanings for the term
"RDF Graph".  To use the AI terms for each of them:

        1. A Knowledge Base (KB); a specific repository or store of RDF
        triples.  As in, "Please update your graph to remove the triple
        <a> <b> <c>."
        
        2. A Formula; a mathematical set of RDF triples.   As in, "Graph
        G1 entails infinitely many other graphs".
        
The most crisp distinction may be around identity.   Two formulas are
identical if and only if they contain the same triples.  Meanwhile, KBs
can have the same triples while remaining distinct.   It also makes
sense to talk about the state of a KB, and a KB changing over time.  It
makes no sense to say such things about a formula; it's just a pure
mathematical set.
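
The identity point can be sketched in a few lines of Python.  This is
only an illustration under assumed names: the `KB` class and its
methods are hypothetical, a formula is modelled as a frozenset of
triples, and a KB as a mutable object holding triples.

```python
# Assumption: a triple is a plain (subject, predicate, object) tuple.
formula_1 = frozenset({("a", "b", "c")})
formula_2 = frozenset({("a", "b", "c")})


class KB:
    """Hypothetical mutable triple store; identity is by object, not by content."""

    def __init__(self, triples=()):
        self.triples = set(triples)

    def remove(self, triple):
        # Changes the state of this KB in place.
        self.triples.discard(triple)

    def state(self):
        # The formula (pure set of triples) this KB holds right now.
        return frozenset(self.triples)


kb_1 = KB(formula_1)
kb_2 = KB(formula_1)

same_formula = formula_1 == formula_2          # equal: same triples
same_state = kb_1.state() == kb_2.state()      # equal: same triples right now
same_kb = kb_1 is kb_2                         # distinct stores

# "Please update your graph to remove the triple <a> <b> <c>."
kb_1.remove(("a", "b", "c"))
```

After the removal, `kb_1` has a new state while `kb_2` and both
formulas are unchanged; there is nothing analogous to "removing a
triple from a formula", only a different formula.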

I think we can agree that formally, technically, only definition 2
(formulas) is correct.  But I think meaning 1 (KBs) is in common use; I
expect most of us use it often.    When I say "graph" in the sense of
definition 1, I mean it as shorthand for "graph storage location",
"graph data structure", or "graph store".   In spoken language, the
context usually makes it clear whether people mean KB or formula.

The distinction also didn't matter much in SPARQL 1, I think, because
it was agnostic on mutability, identity, etc.  I guess this will come
up in the update semantics for 1.1.  And it may come up in the SD
issue, above.

I wonder if we could agree on a standard name for sense 1, and try to
use it.   (Or maybe we already did, and I missed it.)   As long as it's
a term like "graph storage location", then using the keyword "GRAPH" as
we do in the query language seems fine.

    -- Sandro

Received on Tuesday, 20 July 2010 18:22:45 UTC