Re: a new way of thinking about RDF and RDF Schema

From: Brian McBride <bwm@hplb.hpl.hp.com>
Subject: Re: a new way of thinking about RDF and RDF Schema
Date: Sun, 21 Oct 2001 18:23:05 +0100

> Peter F. Patel-Schneider wrote:
> 
> [...]
> 
> >>theory in that style.  The resulting RDF model theory is necessarily more 
> >>complicated than one for a graph or triple based syntax, since in effect, the 
> >>model theory has an RDF parser built into it.  :(
> >>
> > 
> > I'm not sure what you are getting at here.  
> 
> Where a model theory based on a graph as syntax can say:
> 
> (****ing BT:  I'm offline can't quote the exact words)
> 
>    IS : node -> IR       # i.e. there is a simple function from nodes to
>                            the things they denote
> 
> this model theory has in section 4
> 
> [[[
> 
> 1. 
> for each n in N an element node,
> 	    M(n) in IR  and  M(n) in CEXT(IS(name(n)))
> 	    if n has an attribute with name rdf:ID and string-value u
> 	       then M(n) = IS(UTS(u))
> 	    if n has an attribute with name rdf:about and string-value u
> 	       then M(n) = IS(UTS(u))
> 	    if n has an attribute with name rdf:resource and string-value u
> 	       < M(n), IS(UTS(u)) > in EXT
> 	    for each element, attribute, or text node child, n', of n
> 		     except for attribute nodes with name
> 		     rdf:ID, rdf:about, rdf:resource, or xsi:type
> 		< M(n) , M(n') > in EXT
> 	    if n has a simple type, d
> 	       then for each child, n', of n that is a text node
> 		    M(n') = DTS(d)(string-value(n'))
> ]]]
> 
> which struck me rather as representing information normally handled in an RDF 
> parser, in the model theory itself.

Yes, you do have to get the ``ids'' out of the attributes, but that is not
parsing, or at least not much of parsing.  The only difference between this
and Pat's model theory is that in Pat's model theory there is one
syntactical thing that carries resource identity, namely the URI of a node, whereas
in the XQuery 1.0 Data Model there are three.
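
As a rough illustration of how little work is involved, here is a minimal Python sketch that pulls the identity-carrying attributes (rdf:ID, rdf:about, rdf:resource) off element nodes. It uses the standard xml.etree.ElementTree module as a stand-in for an XQuery 1.0 Data Model instance, which is an assumption made only for this example. The function name identity_attributes, the ex: namespace, and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def identity_attributes(element):
    """Return the string values of rdf:ID, rdf:about and rdf:resource, if present.

    Hypothetical helper: it only reads attributes off an already-built tree
    and does not interpret the values in any way.
    """
    found = {}
    for local in ("ID", "about", "resource"):
        value = element.get("{%s}%s" % (RDF_NS, local))
        if value is not None:
            found["rdf:" + local] = value
    return found

# Illustrative document; the ex: vocabulary is made up for this sketch.
doc = ET.fromstring(
    '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'xmlns:ex="http://example.org/ns#">'
    '<ex:Thing rdf:about="http://example.org/a">'
    '<ex:link rdf:resource="http://example.org/b"/>'
    '<ex:Thing/>'
    '</ex:Thing>'
    '</rdf:RDF>'
)

# One (local element name, identity attributes) pair per element node.
summary = [(el.tag.split("}")[1], identity_attributes(el)) for el in doc.iter()]
```

The inner ex:Thing element carries none of the three attributes, so nothing constrains what it denotes; it plays the role of an anonymous node.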

> > However, one of the things that I am trying to do here is to eliminate the
> > need for an RDF parser.  Parsers take some surface syntax (usually a linear
> > sequence of bits) and produce an abstract syntax structure.  Pat uses a
> > graph as his abstract syntax structure.  I am using the XQuery 1.0 Data
> > Model (well, actually a forest of fragments in that data model) as my
> > abstract syntax structure.
> 
> Right, but then one has to do the work normally handled by the parser, to 
> interpret the XQuery data model as RDF.  There ain't no free lunch.
> 
> > One very big (at least to my mind) advantage of my approach is that there
> > are (or soon will be) programs that produce my abstract syntax
> > structures from arbitrary XML. 
> 
> Most RDF parsers are built on a standard XML parser.  This is true of SiRPAC, 
> RDFFilter, ARP and Redland's parser.

How can this be?  RDF is not XML.  How do they handle parseType?  An RDF
system that handles parseType has to get its hands on the raw bits, before
an XML parser sees them.

[...]

> >>Your ML compiler should have issued a warning here.  This does not cover all the 
> >>cases; bnodes are not handled, i.e. what do you do about nodes with no ID, about 
> >>or resource attribute.  
> > 
> > Nodes with no rdf:ID or rdf:about are handled fine.  There is just no
> > restriction on M(n) corresponding to the rdf:ID or rdf:about, i.e., they
> > are anonymous nodes.  
> 
> ok.  I'm struggling to get my head around this way of looking at it.  I was 
> going to say that surely there must be a restriction that different anonymous 
> nodes map to different resources.  But that is not true of anonymous typed 
> nodes.  It seems it ought to be true of property element nodes though.

Nope.  They can map to the same ``resource''.  There is no particular harm,
as long as you can't tell the difference.  :-)  (In fact this is a feature
of a true model theory.  If you can't tell the difference (as in via
entailment) then it doesn't matter.)

[...]

> >>So let me add a third.  I don't see where you are handling typed nodes; e.g. the 
> >>first element above should add:
> >>
> >>    g = G()
> >>   <G(), g>
> >>   <rdf:type, g>
> >>   <g, a:b>
> >>
> >>to IEXT.  But one can't add these for all <a:b> elements, only those that are 
> >>typed nodes in the grammar.  This is a case where striping must be handled.
> >>
[...]
> > 
> > It is certainly the case that you need to link the nodes to their ``type''
> > through an rdf:type resource.  
> > 
> > This can be done via
> > 
> > 	IS >= { <a:b,a>, <rdf:type,t> }
> > 	CEXT >= { <a,{x}>, <t,{y,z}> }
> > 	IEXT >= { <x,x>,	# link node with ``type'' a:b back to itself
> > 	          <x,y>,<y,a>,	# link node with ``type'' a:b to a:b 
> > 		  <y,z>,<z,t>,	# give the link a type of rdf:type
> > 		  <z,z> }	# provide a ``recursive'' typing for the link
> > 
> > My revised message gives a larger example of this.
> 
> Yes it does.  My point was that any model of RDF/XML with a typed node must have 
> this stuff and I couldn't figure out where that constraint was specified.

It is a consequence of the relationships between CEXT and rdf:type in
interpretations.
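
To make this concrete, here is a small Python sketch that encodes the example interpretation quoted above and checks one possible reading of the constraint: every member o of CEXT(c) has some link l in CEXT(IS(rdf:type)) with <o,l> and <l,c> both in IEXT. That reading is an assumption made for illustration, not the wording of the model theory, and the function names are hypothetical.

```python
# The example interpretation from the quoted message, as plain Python data.
IS = {"a:b": "a", "rdf:type": "t"}
CEXT = {"a": {"x"}, "t": {"y", "z"}}
IEXT = {("x", "x"), ("x", "y"), ("y", "a"),
        ("y", "z"), ("z", "t"), ("z", "z")}

def has_type_link(obj, cls, iext):
    """True if some member of CEXT(IS(rdf:type)) links obj to cls in iext."""
    type_links = CEXT[IS["rdf:type"]]
    return any((obj, link) in iext and (link, cls) in iext
               for link in type_links)

def untyped_members(iext):
    """List the (object, class) pairs in CEXT that lack a type link (assumed reading)."""
    return [(obj, cls)
            for cls, members in sorted(CEXT.items())
            for obj in sorted(members)
            if not has_type_link(obj, cls, iext)]

# With the full IEXT from the example, every class member has its link.
missing = untyped_members(IEXT)

# Dropping <x,y> leaves x in CEXT(a) without a type link.
missing_without_xy = untyped_members(IEXT - {("x", "y")})
```

Under this reading the pair <x,x> is not needed for the check; it is kept only because it appears in the original example.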

> [...]
> 
> > <rdf:Description rdf:type="a:b"/> is not valid RDF because a:b is not a
> > literal, not (just) because QNames are not allowed as attrib values. 
> 
> This is a nit, but the rdf:type attribute is a special case in RDF.  The value 
> of the attribute is the URI of a resource, not a literal.  You may wish to add 
> that to your list of mistakes in RDF.

Precisely.  Therefore RDF can't use the abbreviation you wanted.  (I don't
consider this to be a big mistake, if it is one at all.)

[...]

> > What do you lose?  Well you do lose 
> > 1/ rdf:parseType (unrecoverable),
> > 2/ part of the difference between classes and properties,
> > 3/ bagID, 
> > 4/ the strange use of rdf:ID on property elements (unrecoverable),
> > 5/ the strange part of the second syntax abbreviation (unrecoverable), 
> > 6/ the special treatment of rdf:li, 
> > 7/ aboutEach, 
> > 8/ and the type and number restrictions on statements.
> > 
> > I maintain that most of the above are mistakes in RDF.  (Some of them are not handled in
> > Pat's model theory.  Some of them are causing controversy in the RDF Core WG.)
> 
> Losing all that though, puts it beyond anything RDFCore can use within its 
> current charter.

Probably, but so be it.  Some of this can be recovered, including,
probably, all of the stuff not marked above as unrecoverable.

> Brian

peter

PS:  My goal here is not to come up with something that the RDF Core WG can
use, but is instead much more like coming up with something that makes RDF
palatable to XML people.

Received on Sunday, 21 October 2001 15:41:27 UTC