Re: Comment on the Dataset proposal (syntax) from Sandro Hawke on 2012-04-26 (public-rdf-wg@w3.org from April 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Thu, 26 Apr 2012 12:13:01 -0400
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: Richard Cyganiak <richard@cyganiak.de>, RDF WG <public-rdf-wg@w3.org>
Message-ID: <1335456781.9663.481.camel@waldron>

On Thu, 2012-04-26 at 17:30 +0200, Antoine Zimmermann wrote:
> Hi,
> 
> 
> (This email is mostly for Richard's attention)
> 
> Putting aside the discussion on dataset semantics, I have a few comments 
> on the way the dataset proposal is described in terms of syntax:
> 
> 
> "The RDF data model expresses information as graphs consisting of 
> triples with subject, predicate and object."
> 
> The word "graph", in the RDF specifications, should never appear alone 
> like this. It is well known that a graph is a pair (V,E) where V is a 
> set of vertices and E is a set of edges. This is not what RDF Graphs 
> are. RDF Graphs are not graphs, in any of the accepted mathematical 
> definitions of the term. 

Aren't RDF Graphs a kind of graph?   The restrictions, I think, are that
there are no unconnected vertices, the edges are directed and labeled
with an IRI, and the nodes may be labeled with an IRI or a datatype
expression.   If this is true, that every RDF Graph is a graph, then I
think linguistically it's okay to sometimes use the term "graph" if it
makes the text read better and doesn't introduce too much ambiguity.
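
To make that reading concrete, here is a rough sketch (mine, not anything
from the specs). It assumes each triple is read as one directed edge from
subject to object, labeled with the predicate. The names
as_labeled_digraph and has_isolated_vertex are made up for illustration,
and plain strings stand in for IRIs and literals.

```python
from typing import FrozenSet, Set, Tuple

# (subject, predicate, object); strings are stand-ins for IRIs and literals.
Triple = Tuple[str, str, str]
# (source vertex, target vertex, edge label)
Edge = Tuple[str, str, str]


def as_labeled_digraph(triples: FrozenSet[Triple]) -> Tuple[Set[str], Set[Edge]]:
    """Read a set of triples as a pair (V, E) of vertices and labeled edges."""
    vertices: Set[str] = set()
    edges: Set[Edge] = set()
    for s, p, o in triples:
        vertices.add(s)
        vertices.add(o)
        edges.add((s, o, p))  # directed edge s -> o, labeled with p
    return vertices, edges


def has_isolated_vertex(vertices: Set[str], edges: Set[Edge]) -> bool:
    """True if some vertex is not an endpoint of any edge."""
    touched = {s for s, _, _ in edges} | {o for _, o, _ in edges}
    return bool(vertices - touched)


example = frozenset({
    ("ex:a", "ex:knows", "ex:b"),
    ("ex:b", "ex:name", '"Bob"'),
})
V, E = as_labeled_digraph(example)
```

Under this reading, every vertex comes from some triple, so a (V, E) built
this way cannot have an unconnected vertex. Whether that is the right
reading of "graph" is exactly the point under discussion.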

> We already agreed that the word "graph" alone is 
> ambiguous and we resolved to use the phrase "RDF Graph" whenever we talk 
> about sets of triples.
> 
> SUGGESTION:
> "The RDF data model expresses information as RDF Graphs consisting of a 
> set of triples with subject, predicate and object."
> 
> -----
> 
> "Often, one wants to hold multiple RDF graphs and record information 
> about each graph, allowing an application to work with datasets that 
> involve information from more than one graph."
> 
> SUGGESTION:
> "... each RDF Graph, ... than one RDF Graph."
> 
> To sound less redundant, "hold multiple RDF graphs and record 
> information about each one, ..."
> 
> -----
> 
> "An RDF Dataset represents a collection of graphs. An RDF Dataset 
> comprises one graph, the default graph, which does not have a name, and 
> zero or more named graphs, where each named graph is identified by an IRI."
> 
> Maybe say "distinguished RDF Graph":
> 
> SUGGESTION:
> "An RDF Dataset comprises one distinguished RDF Graph, the /default 
> graph/, which does not have a name, ..."
> 
> Moreover, the word "identified" may be misinterpreted.
> 
> SUGGESTION:
> "..., where each named graph associates an IRI with an RDF Graph."
> 
> -----
> 
> "An RDF Dataset may contain zero named graphs; an RDF Dataset always 
> contains one default graph."
> 
> SUGGESTION:
> add "The default graph MAY be empty."
> 
> -----
> 
> Maybe a definition for "named graph" could be given before the formal 
> definition of RDF Dataset:
> 
> SUGGESTION:
> "A /named graph/ is a pair (n,g) where n is an IRI called the /graph 
> name/ and g is an RDF Graph."
> 
> -----
> 
> "Formally, an RDF dataset is a set:
> 
> { G, (<u1>, G1), (<u2>, G2), . . . (<un>, Gn) }
> 
> where G and each Gi are graphs, and each <ui> is an IRI. Each <ui> is 
> distinct."
> 
> "... are RDF Graphs, ..."
> 
> ----
> 
> "G is called the default graph. The pairs (<ui>, Gi) are called named 
> graphs."
> 
> If "named graph" is defined before, it could look like this:
> 
> SUGGESTION:
> "G is called the default graph. The pairs (<ui>, Gi) are named graphs."

I have to say (again) that I'm not okay with calling something a "named
graph", especially formally, when it isn't named and isn't a graph (or
RDF Graph).   If we have to use the terms "name" and "graph", then the
pair (ui, Gi) is a name-graph pair, and Gi is the named graph.
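
A minimal sketch of that terminology, assuming the quoted formal
definition (one default graph plus pairs with distinct IRIs). The class
and method names here (NameGraphPair, Dataset, add, named_graph) are
hypothetical, chosen only to show where the word "named graph" would
land, and plain strings stand in for IRIs.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]   # (subject, predicate, object)
RDFGraph = FrozenSet[Triple]    # an RDF Graph is a set of triples


@dataclass(frozen=True)
class NameGraphPair:
    """The pair (ui, Gi): a name together with a graph."""
    name: str        # the IRI ui
    graph: RDFGraph  # Gi, the graph that is being named


@dataclass
class Dataset:
    """One default graph G plus zero or more name-graph pairs."""
    default_graph: RDFGraph = frozenset()  # the default graph may be empty
    _by_name: Dict[str, RDFGraph] = field(default_factory=dict)

    def add(self, pair: NameGraphPair) -> None:
        # Each ui is distinct, so reject a second pair with the same name.
        if pair.name in self._by_name:
            raise ValueError(f"duplicate graph name: {pair.name}")
        self._by_name[pair.name] = pair.graph

    def named_graph(self, name: str) -> RDFGraph:
        """Return Gi, the graph named by the IRI, not the pair."""
        return self._by_name[name]


g1: RDFGraph = frozenset({("ex:a", "ex:knows", "ex:b")})
ds = Dataset()
ds.add(NameGraphPair("ex:g1", g1))
```

Here named_graph returns an RDFGraph, so the thing called "named graph"
really is a graph, and the pair keeps its own, separate name.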

I don't think wordsmithing this section will be productive until/unless we
have a shared understanding of what we actually want to say, though.

    -- Sandro
Received on Thursday, 26 April 2012 16:13:17 UTC