Re: [All] Proposal: RDF Graph Identification (definition of "named graph") from Sandro Hawke on 2012-08-16 (public-rdf-wg@w3.org from August 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Thu, 16 Aug 2012 15:52:53 -0400
To: Pat Hayes <phayes@ihmc.us>
CC: public-rdf-wg@w3.org
Message-ID: <502D4F95.5040302@w3.org>

On 08/16/2012 01:45 PM, Pat Hayes wrote:
> 4. "Note that “named graph” is a relation, not a class: we say that something is a named graph of a dataset, not simply that it is a named graph." What does this mean? Is it intended to convey the idea that the naming is local to the dataset? If not, what is it supposed to convey? Put another way, what is wrong with saying that something just is a named graph?

That text is something I wrote a while ago, trying to put forward a
definition of the term "named graph" that was consistent with popular
usage and also made sense to me. Clearly I didn't express it well
enough. Let's see if I can do it better now, not writing in a spec.

As you may remember, the term "named graph" used to bother me a lot.
It bothered me because:

* when I hear "graph" I really do think of RDF Graphs (g-snaps). I
think most people comfortable with the term "named graph" are actually
thinking of a g-box when they hear "graph" in this context.

* when I think of "naming" in RDF, I think of picking an IRI for some
entity and encouraging everyone in the world to use that same IRI for
that same entity, like http://www.w3.org/People/Berners-Lee/card#i for
TimBL. Often people working with "named graphs" are really just
associating a string with the graph, in some local name-binding
relationship.

Given those meanings, making a "named graph" would be kind of silly.
Richard captured this beautifully by pointing out
http://en.wikipedia.org/wiki/Gallery_of_named_graphs . Those are named
graphs, given my understanding of the words "named" and "graph"
(although using non-RDF variants of both those terms). I don't know
how common my understanding here is; I do know it's shared by TimBL, so
it's not just me.

There's also the problem that, under the formal definition provided by
Carroll et al [*] and used by SPARQL, the "named graph" is not a graph at
all, but rather a pair of a name and a graph. When I hear people use
the term "named graph", they don't seem to be referring to a pair.
I've managed to find a few cases in English that use this kind of
construct: the weight of a clothed person is the sum of the weight of
the clothing and the clothed person (that is, the person who happens to
be clothed). So, yes, a named graph can be both the pair of a name and
a graph *and* the graph that is paired with the name. But that seems
really awkward.

The way I hear the term "Named Graph" used, in diverse settings
including the SPARQL WG and intro-to-semweb classes, is as a place
within a collection of RDF triples where some triples are set aside,
kept somehow separate. Someone has a bunch of triples and to help
manage them better they subdivide the collection into "Named Graphs".
Often, but not always, this is a mutable collection -- certain triples
are added to certain named graphs from time to time, as circumstances
change.

I'm pretty sure the term is always used in the context of a larger
dataset/graphstore. People don't refer to a single file or web page of
RDF triples as a Named Graph. It's my understanding that when
people said they really wanted Named Graphs [1] [2], this is what they
were talking about -- the ability to segment or subdivide a triplestore,
to help with various kinds of data management, including managing
changes and provenance. In a sense, it might better be called a
"subdivision", or a "named subgraph".

So, back to spec text.

SPARQL formally defines a /named graph/, to be any of the (name,
graph) pairs in a dataset
<http://www.w3.org/2012/08/RDFNG.html#dfn-dataset>.

True. And I wish we could propose a transition path to a less confusing
terminology. I think those things should be called "name-graph
pairs". Hard to change now, I know.

In practice, the term is often used to refer to the graph part of
those pairs. This is the usage we follow in this document, saying
that a graph is a named graph in some dataset if and only if it
appears as the graph part of a (name, graph) pair in that dataset.

I'm still happy with that definition, and comfortable using the term
"Named Graph" when defined this way. The "graph" part is an RDF Graph.
The name denotes some object in the normal (global) RDF way, and that
denoted object is associated with the graph in a dataset-local
name-graph binding pair. Nearly all of the value of a particular dataset
is its name-graph pairs, so of course they're local.

Note that “named graph” is a relation, not a class: we say that
something is a named graph /of a dataset/, not simply that it is a
named graph.

It seems to me that it's nonsense to ask whether the graph { <a> <b> 1 }
is a named graph. There is no class of named graphs. Instead we'd
have to ask whether the graph { <a> <b> 1 } is a named graph of some
particular dataset.
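
To make the relation-versus-class point concrete, here is a minimal
sketch. It assumes a dataset can be modelled as a plain mapping from
name strings to graphs, with a graph as an immutable set of triples;
the names Graph, Dataset, and is_named_graph_of, and the example.org
IRIs, are made up for illustration and do not come from any spec or
library.

```python
from typing import Dict, FrozenSet, Tuple

# Illustrative types only (assumptions, not spec definitions).
Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]    # an RDF graph as an immutable set of triples
Dataset = Dict[str, Graph]   # the (name, graph) pairs; default graph omitted


def is_named_graph_of(graph: Graph, dataset: Dataset) -> bool:
    """True iff `graph` is the graph part of some (name, graph) pair in `dataset`."""
    return any(paired == graph for paired in dataset.values())


# The graph { <a> <b> 1 } from the discussion above.
g: Graph = frozenset({("<a>", "<b>", "1")})

# Two hypothetical datasets: only the first pairs a name with g.
ds1: Dataset = {"http://example.org/g1": g}
ds2: Dataset = {"http://example.org/g2": frozenset({("<c>", "<d>", "2")})}

print(is_named_graph_of(g, ds1))  # g is a named graph of ds1
print(is_named_graph_of(g, ds2))  # g is not a named graph of ds2
```

Note that the function needs two arguments. There is no one-argument
is_named_graph(g) to write, because the answer depends on which
dataset is being asked about.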

Linguistically, the term "named graph" seems like the name of a class of
things, like "red car". But it's more like "descendant", "friend", and
"neighbor". If I say Joe is a descendant, ... well, that doesn't make
much sense. Instead, a complete sentence would have to be more like:
Joe is a descendant of Irish Immigrants.

It's like saying "7 is a prime factor" instead of "7 is a prime factor
of 8638".

You asked:

> What does this mean? Is it intended to convey the idea that the naming is local to the dataset? If not, what is it supposed to convey? Put another way, what is wrong with saying that something just is a named graph?

Have I answered that, now?

-- Sandro

[*] I think I've heard both you and Jeremy express that you don't think
we should stick to that any more.
[1] http://www.w3.org/2010/06/rdf-work-items/table
[2] http://www.w3.org/2002/09/wbs/1/rdf-2010/results

And assigning IRIs to g-snaps in a global name mapping -- the way we
name is kind of a silly thing, in general.

but I eventually found a way of thinking about it that I was
comfortable with. I was trying to capture that, but I clearly failed to
express it clearly.

One of the problems with the term is that linguistically it looks like a
named graph should be a kind of graph,

Received on Thursday, 16 August 2012 19:53:07 UTC