Re: Graphs and Being and Time from Pierre-Antoine Champin on 2011-02-23 (public-rdf-wg@w3.org from February 2011)

From: Pierre-Antoine Champin <pchampin@liris.cnrs.fr>
Date: Thu, 24 Feb 2011 00:02:32 +0100
To: Pat Hayes <phayes@ihmc.us>
CC: "public-rdf-wg@w3.org" <public-rdf-wg@w3.org>
Message-ID: <4D659208.5080105@liris.cnrs.fr>

Hi Pat,

+1 to make this distinction explicit; and yes, I think that "graph" 
should be used for "graph token", as this is what most people mean when 
using it.

For the "abstract graph", I would propose "graph state" or "graph 
value", as they carry the idea that the graph (token) holds one of them 
at some point of time, but may change over time.

I'm not sure about "graph literal", proposed by Nathan: the term 
"literal" is already used in RDF, so reusing it here may cause much 
confusion, IMHO...

   pa


On 02/23/2011 09:13 PM, Pat Hayes wrote:
> I would like the WG to rectify a basic flaw in the RDF conceptual
> model, or more exactly, how the basic RDF conceptual model is
> described in the RDF specs. This will bring the specs more in line
> with the way that RDF is actually used in practice. It will also not
> change the RDF semantics, though it will simplify their description a
> little. It is along the lines I suggested in my ISWC invited talk a
> while ago. As this will have consequences for the way we talk about
> RDF graphs generally, I would like to raise it now even though it
> does not fall into any of the assigned task topics.
>
> Here is the problem: the RDF specs define an RDF graph to be a
> **set** of triples. But a set is a pure-mathematical Platonic
> abstraction, like a number or an Abelian group. It is not a data
> structure or a document or a text, and it cannot be transmitted by
> HTTP or FTP or any other XXTP. And, worst of all, it has no state, so
> it can't be 'changed'. It simply does not make sense, given the
> definitions in the current RDF spec, to speak of a 'temporal graph'
> or of a graph being 'changed'. If I 'add' an item to a set - say, add
> C to {A, B} to get {A, B, C} - I have not "changed" anything: I
> simply have a new set, different from the previous one: {A, B} =/=
> {A, B, C}. Sets belong in the world of mathematics, not the world of
> computing.
>
> What we need is the notion of a 'graph token' (or some other
> terminology: see below for more on terminology), meaning an actual
> representation of an RDF graph. This  would be an information
> resource, a thing with representations that can be copied and sent
> from place to place using a TP. Put another way, this would have the
> same kind of relationship to an RDF graph that a particular copy of
> Moby Dick has to the literary work with the same name, or that a
> particular token of the letter 'A' has to the first letter of the
> English alphabet; and just as with these cases, there can be many
> tokens of the same RDF graph. I might have my copy and you might have
> yours: same graph, different tokens. And we can make our own rules
> for token identity, so it can make perfect sense for tokens to have a
> state, and a single token to be a token of different RDF graphs as
> changes are made to it, which is what we are actually currently
> talking about when we use the impossible terminology of "changes to a
> graph". To emphasize: we already have these things. Every RDF/XML
> document on the Web, every piece of RDFa, is actually a graph token
> rather than an actual RDF graph (where I am using "RDF graph" here
> strictly according to the RDF specs, of course.) You cannot put an
> actual RDF graph into a digital memory, any more than you can put the
> number three into one. You have to use a numeral to represent a
> number, and you have to use a graph document or a graph data
> structure or some such token-like thing to represent the RDF graph.
> The issue is only that the RDF specs currently don't acknowledge this
> simple fact: they represent a kind of idealized fiction that refuses
> to acknowledge the distinction between a work and a book, or between
> a number and a numeral, or between a graph and an encoding of it. If
> one reads the various specs which mention RDF, they vary in their use
> of the term "RDF graph". Some of them use it to mean a graph token,
> others to mean the Platonic abstraction; and still others seem to be
> kind of muddled. We saw some of this muddle in the IRC log of today's
> telecon, in fact.
>
> In the ISWC talk I invented a completely new terminology of a
> "surface" (think of a piece of paper on which the graph is
> conceptually 'drawn'), which I rather like, but we don't have to go
> that far. In fact, I would propose that we keep the terminology "RDF
> graph" to refer to the tokens (which is already now a common usage)
> but alter the specs so that the current RDF graph - the set of
> triples - is called something like an "RDF abstract graph". Then an
> RDF graph is not a set, but rather something like an RDF "resource"
> (in the REST sense), i.e. an entity which emits a representation of an
> RDF abstract graph when poked. This allows an RDF graph to have a
> state that can change, and it brings the whole business of naming a
> graph with a URI into line with all other kinds of Web naming and
> identification. This is in fact the way the world actually is, of
> course: the change is simply bringing the terminology into line with
> actual practice.
>
> The nice thing is, if we do modify the specs (actually, the RDF
> conceptual model) to be more realistic in this way, by making an
> explicit distinction between the abstract graph and a particular
> graph token (think of a document), then several things get simpler
> and some "issues" go away.
>
> 1. An RDF graph (new sense, i.e. a graph token) is now something that
> can have an identity over time (corresponding to continuing to be
> identified by a cool IRI, in the usual Web-sanctioned way), so this
> whole way of talking now makes sense.
>
> 2.  It is quite sensible to have two RDF graphs (tokens) with
> different names which are the same RDF (abstract) graph. That is, two
> graph tokens which look like (i.e., when poked emit representations
> of) the same RDF abstract graph. This has always been an issue for
> the idea of 'named graphs': how can a name be attached to a
> particular RDF **abstract** graph (as opposed to some document or
> representation of that abstract graph)? And OK, the answer is: it
> can't, and this does not matter, because all we are ever needing to
> identify are graph tokens, not abstract graphs. You name a graph by
> identifying a token of it. But that only gives you power over the
> token, not over the abstraction itself.
>
> 3. There is now a very nice way to handle blank nodes: we simply
> stipulate, as a part of the underlying RDF conceptual model, that
> every blank node can occur in at most one RDF graph token. Blank
> nodes are unique to tokens.  Intuitively, we think of the blank nodes
> in any graph token as belonging to the token itself rather than to
> the abstract graph. This at one stroke fixes the 'scope' of blank
> node identifiers in any RDF surface syntax or notation (whatever is
> the boundary of the RDF graph token according to the rules of that
> syntax, that is where the existential quantifiers are that bind the
> bnode identifiers) and it also eliminates the need to define 'graph
> merging' as opposed to 'graph unioning' in the specifications. (If
> you don't follow this, just believe me. It makes the specs a lot
> easier and quite a bit shorter.)
>
> Anyway, I offer this as an item for the WG to consider. I don't have
> any particular brief for the choice of terminology, but I do think it
> is important for us to agree on the basic conceptual distinction
> (between the 'abstract' idea of an RDF graph as a mathematical set,
> and some more concrete notion of a graph as a data object with a
> state that can be identified by a URL) and agree to use it ourselves.
> If we do, then this will impact at least the language that we use in
> the Graphs TF, and perhaps the way that we actually think about the
> issues.
>
> I hereby volunteer to write drafts, as necessary, of the relevant
> changes to the RDF Concepts and RDF Semantics documents to
> accommodate the necessary changes, if we decide to make them. I think
> that, with care, no changes would be needed to the SPARQL draft
> documents.
>
> Pat
>
>
>
> ------------------------------------------------------------
> IHMC                    (850)434 8903 or (650)494 3973
> 40 South Alcaniz St.    (850)202 4416   office
> Pensacola               (850)202 4440   fax
> FL 32502                (850)291 0667   mobile
> phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes
Received on Wednesday, 23 February 2011 23:03:06 UTC
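
The distinction discussed in this thread can be sketched in code. What follows is a minimal illustration, not anything proposed in the messages themselves: the class names `GraphToken` and `BNode`, the `example.org` IRIs, and the `ex:` terms are all hypothetical. It assumes an abstract graph is modelled as an immutable `frozenset` of triples, a graph token as a named object that holds one such value at a time, and a blank node as a value tagged with the token that owns it.

```python
from dataclasses import dataclass
from itertools import count

_token_ids = count(1)  # hypothetical source of per-token identities


@dataclass(frozen=True)
class BNode:
    """Hypothetical blank node: belongs to exactly one graph token."""
    token_id: int
    label: str


class GraphToken:
    """Hypothetical 'graph token': a named holder of an abstract graph."""

    def __init__(self, iri, triples=()):
        self.iri = iri                    # the name stays fixed over time
        self._id = next(_token_ids)
        self.value = frozenset(triples)   # the abstract graph held right now

    def bnode(self, label):
        # Blank nodes are minted by, and scoped to, this token.
        return BNode(self._id, label)

    def add(self, triple):
        # "Changing the graph" rebinds the token to a different abstract graph.
        self.value = self.value | {triple}


# Abstract graphs are sets: 'adding' a triple yields a different set.
ab = frozenset({("ex:a", "ex:p", "ex:b")})
abc = ab | {("ex:a", "ex:p", "ex:c")}
assert ab != abc and len(ab) == 1

# Two tokens with different names can hold the same abstract graph.
mine = GraphToken("http://example.org/mine", ab)
yours = GraphToken("http://example.org/yours", ab)
assert mine.value == yours.value and mine is not yours

# A token keeps its name while the abstract graph it holds changes.
mine.add(("ex:a", "ex:p", "ex:c"))
assert mine.iri == "http://example.org/mine" and mine.value == abc

# Blank nodes with the same label in different tokens are distinct,
# so a plain set union keeps them apart without any renaming step.
t1 = (mine.bnode("x"), "ex:knows", "ex:b")
t2 = (yours.bnode("x"), "ex:knows", "ex:b")
merged = frozenset({t1}) | frozenset({t2})
assert t1 != t2 and len(merged) == 2
```

In this sketch the `frozenset` plays the role of the "graph state" or "graph value", and the `GraphToken` plays the role of the thing a URI would name. Tagging each blank node with its owning token is only one way to model the "at most one token" rule; it is chosen here because it makes ordinary union behave like a merge.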