Re: RDF Model Theory Working Draft: Comment from Pat Hayes on 2001-10-08 (www-rdf-comments@w3.org from October to December 2001)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Mon, 8 Oct 2001 12:24:08 -0500
To: aray@nyct.net
Cc: www-rdf-comments@w3.org
Message-Id: <p05101006b7e7820c9951@[205.160.76.193]>

As author of the RDF model theory draft, I must bear the 
responsibility for any misunderstandings arising from the text.  I 
was not aware of the earlier discussions on this topic in this forum, 
or I would have taken more care in my choice of words. The current 
text does indeed contain some errors, for which I apologize and which 
will be corrected in the next version.  Please allow me to explain the 
intent of the 'graph syntax' and clarify several points which have 
been discussed.

First, RDF graphs are in fact multigraphs, speaking in strict 
mathematical language, since they allow more than one arc between two 
nodes. Also, the current wording of the MT document does not make it 
clear that nodes in 'subject position', ie from which a directed edge 
emerges, must not be labelled with a literal (this is part of the RDF 
syntax description). As to the issue you raise, I confess that this 
interpretation simply had not occurred to me, but I agree that the 
wording you cite does invite such an interpretation. The problem is 
in the wording, however, rather than in the pictures. The intention 
is *not* that all three of a triple's terms should be 'nodal'. The 
intention is that a triple

a b c .

shall map to a subgraph of the RDF graph consisting precisely of two 
nodes corresponding to a and c, together with an edge, directed from 
the former to the latter, labelled with b (as shown in figure 11 of 
http://www.w3.org/TR/REC-rdf-syntax/). There need not be any node 
labelled with 'b'; indeed, if this triple were the entire document, 
there would be no such node. The graph contains precisely one edge 
for each triple, and one node for each 'nodal' label in the N-triples 
document, ie each label which occurs in a subject or object position 
in the document somewhere. (However, a URI label which occurs only as 
a property name in the document will not be a node label anywhere in 
the graph; that is where the text you cited is incorrect, and needs 
to be rewritten more carefully.) Note that such a nodal label may 
also occur in a property position, and it will in that case be both a 
node label and an arc label in the graph. The labelling function on a 
tidy graph is required to be 1:1 only on nodes: the same label may 
occur on many edges and also on a (single) node.
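
As a rough illustration of the mapping just described, here is a minimal
Python sketch. It is not taken from the draft: the names Triple and
build_graph are hypothetical, labels are plain strings, and duplicate
triples are assumed to collapse into a single edge.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    prop: str
    obj: str


def build_graph(triples):
    """Return (nodes, edges) for a collection of (s, p, o) triples.

    nodes: labels that occur in subject or object position somewhere.
    edges: one labelled, directed edge per distinct triple.
    """
    # Assumption: repeated triples count once, so edges form a set.
    edges = {Triple(*t) for t in triples}
    nodes = {t.subject for t in edges} | {t.obj for t in edges}
    return nodes, edges


# The single triple "a b c ." gives two nodes and one edge; 'b' is not a node.
nodes, edges = build_graph([("a", "b", "c")])

# Here 'b' also occurs in subject position, so it is both a node label
# and an edge label in the same graph.
nodes2, edges2 = build_graph([("a", "b", "c"), ("b", "d", "a")])
```

In the first call the label 'b' appears only on the edge; in the second it
labels a single node as well as an edge, which is the case the tidiness
condition allows.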

With this correspondence, an Ntriples document specifies a unique 
RDF graph, and vice versa, up to reordering the triples and 
renaming the bNodes.
(In fact it might be good mathematical style to define an equivalence 
relation on Ntriples documents (of reordering and bNode-renaming) and 
regard RDF graphs as members of the quotient space under this 
equivalence.  Another way to define the Ntriples-to-RDF graph mapping 
is as follows: define the set of nodes to be the set of 'nodal' 
identifiers, define edges to be triples, and then delete all bNode 
labels from the graph, ie replace them all with a null label.)
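
The equivalence suggested above can be sketched in Python as a brute-force
check, suitable only for small documents. Everything here is an
assumption made for illustration: triples are (s, p, o) tuples of strings,
bNode names are taken to start with "_:", and the helper names are
hypothetical.

```python
from itertools import permutations


def is_bnode(label):
    # Assumption: blank node names are written with a leading "_:".
    return label.startswith("_:")


def bnodes(triples):
    """Sorted list of bNode names in subject or object position."""
    return sorted({x for (s, _, o) in triples for x in (s, o) if is_bnode(x)})


def rename(triples, mapping):
    """Apply a bNode renaming to every subject and object at once."""
    return {(mapping.get(s, s), p, mapping.get(o, o)) for (s, p, o) in triples}


def equivalent(doc1, doc2):
    """True if the documents differ only by triple order and bNode renaming."""
    t1, t2 = set(doc1), set(doc2)  # sets ignore the order of triples
    b1, b2 = bnodes(t1), bnodes(t2)
    if len(t1) != len(t2) or len(b1) != len(b2):
        return False
    # Try every bijection from the bNodes of doc1 to those of doc2.
    return any(
        rename(t1, dict(zip(b1, perm))) == t2 for perm in permutations(b2)
    )


doc_a = [("_:x", "p", "c"), ("c", "q", "_:y")]
doc_b = [("c", "q", "_:n1"), ("_:n2", "p", "c")]  # reordered and renamed
doc_c = [("_:n", "p", "c"), ("c", "q", "_:n")]    # merges the two bNodes
```

Under these assumptions doc_a and doc_b fall in the same equivalence
class, while doc_c does not. Trying every bijection is exponential in the
number of bNodes, so this is a statement of the definition rather than a
practical algorithm.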

I hope this helps to clarify the intended meaning of the graph 
syntax. In the next version of the document we will include rigorous 
mathematical definitions in an appendix, to provide some security 
against the vagaries of English.

It may be germane to point out that with the current MT, nothing 
would be gained, speaking semantically, by including a node for each 
edge label, since a label denotes the same thing whether it is used 
as a node label or an edge label. The primary semantics for both node 
and edge labels is identical; in fact, an interpretation is defined 
simply on a set of labels, ie a vocabulary, and makes no distinction 
between edge and node labels. In order to say anything useful, the 
entity I(e) denoted by an edge label e needs to also have a nonempty 
extension IEXT(I(e)), and it is that extension which determines the 
truth or falsity of the triple (labelled edge); but I(e) is the same 
*thing* whether e is used as a node or an edge label.
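
To make this concrete, here is a small Python sketch of an interpretation
over a toy vocabulary. The entities (the integers 1 to 3) and the function
name triple_is_true are hypothetical, and the sketch covers only triples
without bNodes or literals: such a triple s p o is taken to be true when
the pair <I(s), I(o)> is in IEXT(I(p)).

```python
def triple_is_true(triple, I, IEXT):
    """Evaluate a triple with no bNodes or literals under (I, IEXT)."""
    s, p, o = triple
    # I(p) is looked up in exactly the same mapping as I(s) and I(o);
    # an entity with no recorded extension gets the empty one.
    return (I[s], I[o]) in IEXT.get(I[p], set())


# One mapping for the whole vocabulary: no node/edge distinction.
I = {"a": 1, "b": 2, "c": 3}

# Only the entity denoted by 'b' has a nonempty extension.
IEXT = {2: {(1, 3), (2, 3)}}

r1 = triple_is_true(("a", "b", "c"), I, IEXT)  # 'b' used as an edge label
r2 = triple_is_true(("b", "b", "c"), I, IEXT)  # 'b' as node and edge label
r3 = triple_is_true(("a", "c", "b"), I, IEXT)  # I('c') has empty extension
```

In the second triple 'b' denotes the entity 2 in both positions. The third
triple comes out false because the entity denoted by its edge label has an
empty extension, which is why an edge label needs a nonempty IEXT to say
anything useful.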


Pat Hayes

PS. You wrote:
[I would claim, on the basis of my "gestalt" of RDF, that a ntriple 
should be a node itself.  The "basic" RDF graph representation would 
have four nodes, for the triple and its three terms; the arcs would 
be labelled by "subject", "verb" and "object" - or their equivalents 
du jour; anonymous nodes would just be missing outbound subject arcs; 
nodes with only inbound arcs would be the natural places to locate 
all atomic/primitive terms (literals, urirefs, symbols, whatever); 
and "reification" would seem to be unnecessary.  What this means for 
Model Theory I haven't the faintest idea.]

The reason that reification would be unnecessary in this scheme is 
that this *is* reification. The model theory can be extended to 
reification in several different ways; it was omitted from the 
current version because the proper way to handle reification is still 
under discussion.
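
For comparison with the graph syntax above, the four-node scheme quoted in
this PS can be sketched as follows. This is only one reading of the quoted
proposal: the function name reify and the generated triple-node names
("_:t1", "_:t2", ...) are hypothetical, and the arc labels are the
"subject", "verb" and "object" of the quoted text.

```python
from itertools import count


def reify(triples):
    """Encode each triple as its own node with three labelled outbound arcs."""
    fresh = count(1)
    nodes, arcs = set(), set()
    for s, p, o in triples:
        t = f"_:t{next(fresh)}"  # hypothetical name for the triple node
        nodes |= {t, s, p, o}
        arcs |= {(t, "subject", s), (t, "verb", p), (t, "object", o)}
    return nodes, arcs


# One triple yields four nodes (the triple and its three terms) and three arcs.
r_nodes, r_arcs = reify([("a", "b", "c")])
```

Here the property label 'b' becomes a node, in contrast to the graph
syntax described earlier, where it labels only an edge.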


PPS. There *are* some places where the correspondence between graph 
syntax and XML-RDF becomes rather blurry, notably concerning the 
syntactic ordering of assertions about membership in an unordered 
collection, ie a bag. These have very little, if any, bearing on the 
meaning as determined by the model theory, but they are somewhat 
troubling and the working group hopes to get them clarified 
eventually.
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Monday, 8 October 2001 13:24:30 UTC