Surface vs. Abstract Syntax, was: RE: What do the ontologists want from Jonathan Borden on 2001-05-18 (www-rdf-logic@w3.org from May 2001)

From: Jonathan Borden <jborden@mediaone.net>
Date: Fri, 18 May 2001 07:56:07 -0400
To: "Peter Crowther" <peter.crowther@networkinference.com>, "'Sergey Melnik'" <melnik@db.stanford.edu>
Cc: <www-rdf-logic@w3.org>
Message-ID: <001601c0df91$85e8c390$0201a8c0@ne.mediaone.net>
Peter Crowther wrote:

>
> > From: Sergey Melnik [mailto:melnik@db.stanford.edu]
> [...]
> > can anyone criticizing reification suggest a more
> > suitable mechanism
> > for handling the aforementioned features that makes both
> > programmers and logicians happy?
>
> Ditch RDF and layer a logic directly on XML?  Just a thought...
> the problem
> is that it loses a lot of the work currently being put into the
> Semantic Web
> and being described using RDF, unless there's a well-defined
> migration path.
> But it would give much more flexible structures and a far simpler way of
> denoting what has formally defined semantics versus what is simply a data
> structure.

A lot is being said in this thread (this message is just a convenient one to
respond to) about dissatisfaction with the RDF syntax, followed by responses
and responses to responses, so I want to remind people about the difference
between the _surface syntax_, as Pat Hayes describes what is termed the
"RDF (XML) syntax", and the _abstract syntax_, which is termed the
"RDF Model" in RDF 1.0 M & S.

There have been several suggestions to modify the surface RDF syntax,
notably N3. Sergey and I have also proposed simplified RDF syntaxes, e.g.
http://www.openhealth.org/RDF/rdf_Syntax_and_Names.htm with an
implementation http://www.openhealth.org/RDF/rdfExtractify.xsl which has the
property of interpreting _arbitrary_ XML as RDF triples.

What, then, is the relationship between the abstract syntax of XML and the
abstract syntax of RDF? This sort of topic has long been discussed in the
markup world as "Groves" (for an intro see:
http://www.prescod.net/groves/shorttut/ , for the full XML grove see:
http://www.openhealth.org/XSet).

Essentially, the 'full' grove of a document describes every niggling detail,
including whitespace, whether attributes are quoted using single or double
quotes, and what the order of attributes is. These details are deemed
unimportant to XML applications. The XML Infoset
http://www.w3.org/TR/xml-infoset/ serves to specify a common _subset_ of the
full XML abstract syntax that is useful to most XML software.

The XML Infoset (read: XML abstract syntax) represents the order of child
elements (but not the order of attributes), whether an information item
arises from an attribute or an element, etc.
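
As a rough illustration of which details survive at this level, here is a
minimal Python sketch using the standard library's ElementTree. The helper
name infoset_view and the sample documents are assumptions made up for this
example, and the comparison is only an approximation of the Infoset, not an
implementation of the specification: attributes are compared as an unordered
mapping, children as an ordered list, and quoting style and whitespace inside
tags are already discarded by the parser.

```python
import xml.etree.ElementTree as ET

def infoset_view(elem):
    """Reduce an element to the details this sketch treats as significant."""
    return (
        elem.tag,
        dict(elem.attrib),                        # attributes: order ignored
        elem.text or "",                          # character data before first child
        elem.tail or "",                          # character data after the element
        [infoset_view(child) for child in elem],  # children: order kept
    )

# Same content; differs only in quote style, attribute order, in-tag whitespace.
a = ET.fromstring("<doc x='1' y=\"2\"><p>one</p><q>two</q></doc>")
b = ET.fromstring('<doc   y="2" x="1"><p>one</p><q>two</q></doc>')
# Same attributes as a, but the child elements are swapped.
c = ET.fromstring('<doc x="1" y="2"><q>two</q><p>one</p></doc>')

attrs_reordered_equal = infoset_view(a) == infoset_view(b)
children_reordered_equal = infoset_view(a) == infoset_view(c)

print(attrs_reordered_equal)     # True
print(children_reordered_equal)  # False
```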

The RDF abstract syntax can be viewed as a further, barer subset of the XML
infoset in which element order is not represented. This allows an RDF abstract
syntax to be represented by a single relational table (p,s,o), whereas an XML
infoset is represented by a more complex structure, e.g. a DOM. So the
advantage of RDF is the simplicity of storage, yet to represent element
order additional constructs (containers) must be added, and these make life
difficult. In XML every element is naturally a container, and
containment is a very natural part of the abstract (and surface) syntax.
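
To make the trade-off concrete, here is a small Python sketch. The blank node
label "_:items", the member values, and the helper seq_members are
hypothetical names chosen for the example; rdf:Seq and the rdf:_1, rdf:_2
membership properties are the container vocabulary from RDF M & S. The
triples live in an unordered set standing in for the (p,s,o) table, so order
has to be recovered from the property names, while the XML tree simply keeps
its children in document order.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The abstract syntax as one table of (p, s, o) rows; row order means nothing.
triples = {
    (RDF + "type", "_:items", RDF + "Seq"),
    (RDF + "_2", "_:items", "second"),
    (RDF + "_1", "_:items", "first"),
}

def seq_members(table, container):
    """Recover container order from the rdf:_n membership properties."""
    prefix = RDF + "_"
    numbered = []
    for p, s, o in table:
        index = p[len(prefix):]
        if s == container and p.startswith(prefix) and index.isdigit():
            numbered.append((int(index), o))
    return [o for _, o in sorted(numbered)]

rdf_order = seq_members(triples, "_:items")

# In XML the same ordering is implicit in the element structure.
xml_doc = ET.fromstring("<items><item>first</item><item>second</item></items>")
xml_order = [child.text for child in xml_doc]

print(rdf_order)  # ['first', 'second']
print(xml_order)  # ['first', 'second']
```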

How does N3 relate? N3 is simply an alternate _surface syntax_ that is
easier for humans to write; it does not change the _abstract syntax_.
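
A toy Python sketch of that point follows. Both parsers are deliberately
minimal and are assumptions made for this example, not real RDF/XML or N3
parsers: the first handles only rdf:Description elements with rdf:about and
literal-valued property elements, the second only statements of the form
<subject> <predicate> "literal" . with full URIs. The example.org resource
and its values are made up. Under those restrictions, two different surface
forms yield the same set of (p,s,o) triples.

```python
import re
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://example.org/doc">
    <dc:title>An example</dc:title>
    <dc:creator>Somebody</dc:creator>
  </rdf:Description>
</rdf:RDF>
"""

N3_TEXT = """
<http://example.org/doc> <http://purl.org/dc/elements/1.1/creator> "Somebody" .
<http://example.org/doc> <http://purl.org/dc/elements/1.1/title> "An example" .
"""

def triples_from_rdfxml(text):
    """Toy reader for a tiny RDF/XML subset; returns a set of (p, s, o)."""
    found = set()
    for desc in ET.fromstring(text).findall("{%s}Description" % RDF_NS):
        subject = desc.get("{%s}about" % RDF_NS)
        for prop in desc:
            # ElementTree spells qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            found.add((namespace + local, subject, prop.text))
    return found

def triples_from_n3(text):
    """Toy reader for '<s> <p> "o" .' statements; returns a set of (p, s, o)."""
    pattern = re.compile(r'<([^>]*)>\s+<([^>]*)>\s+"([^"]*)"\s*\.')
    return {(p, s, o) for s, p, o in pattern.findall(text)}

same_abstract_syntax = triples_from_rdfxml(RDF_XML) == triples_from_n3(N3_TEXT)
print(same_abstract_syntax)  # True
```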

>
> The alternative appears to be to accept that RDF will be used as a very,
> very verbose encoding of LISP cons cells; and that some part of those
> structures might be used to represent something formal, but that a large
> part will be straight data structure, or be glue that could be encoded and
> processed more easily using a richer syntax.
>
> 		- Peter
>
> [insert back view of Peter running down infinitely long corridor
> towards the
> end marked "RDF Logic, Holy Grail, World Peace and Emergency Exit" pursued
> by mixed crowd of logicians and RDF enthusiasts waving pitchforks and
> torches]

Yes, using triples merely to represent lists would be very, very painful.
One possibility is to introduce containers in a more natural fashion,
perhaps using a canonical triple ordering (e.g.
http://www.openhealth.org/RDF/RDFmediatype.html). Another would be to bite
the bullet and make containment and ordering a natural feature of the RDF
abstract syntax. The current container mechanism _is_ painful.

Jonathan Borden
The Open Healthcare Group
http://www.openhealth.org
Received on Friday, 18 May 2001 07:57:01 UTC