Re: Abstract syntax: an attempt from Graham Klyne on 2001-06-26 (w3c-rdfcore-wg@w3.org from June 2001)

From: Graham Klyne <Graham.Klyne@Baltimore.com>
Date: Tue, 26 Jun 2001 07:25:31 +0100
To: pat hayes <phayes@ai.uwf.edu>
Cc: w3c-rdfcore-wg@w3.org
Message-Id: <5.0.2.1.2.20010626065241.00a3b860@joy.songbird.com>
At 11:22 PM 6/23/01 -0500, pat hayes wrote:
>>What I am trying to do here is present an abstract syntax that is not 
>>tied to any surface syntax, and which I think could be a basis for 
>>hanging some formal semantics onto.
>
>Maybe Im not following the local usage of 'abstract', but in the sense 
>with which I am familiar, one can't give an *abstract* syntax using BNF. 
>What you describe is a simple surface syntax (not a bad thing to have, 
>mind you  :-)

You're right... I seem to have misappropriated the term.  Maybe "simplified 
syntax" would have been less confusing.  (I have a vague recollection of 
"abstract syntax" being used in the sense I used it from some lectures, 
many years ago, on programming language semantics;  but on re-checking my 
sources I don't find that expression actually used.)

I meant "abstract syntax" in the following senses:
- not tied to single specific surface syntax
- not intended to be used as recipe for parsing a sentence
- intended to capture the just the "meaningful" structure of a sentence, 
with a minimum of lexical artefact.

It is in the nature of BNF that it defines parse tree structures for a 
linear sequence of symbols, so I suppose my usage is not "abstract" in the 
sense that it effectively defines a linear form.

As stated, the syntax is ambiguous;  there may be multiple possible parse 
trees for some given sequences of terminal symbols.  For my purposes, this 
is not a problem as it is the parse tree that is the interesting 
structure.  One can easily add punctuation symbols to make the parse 
unambiguous.

>>As presented, this syntax aims to capture RDF facts and reifications, and 
>>nothing else:  this is what I would regard as the RDF "core" syntax.  I 
>>would anticipate RDF "extensions" being presented as alternative 
>>productions for 'S'.
>>
>>NOTE:  "reification" is deliberately called out as a distinct syntax 
>>production, so that there is a place to hang the semantics that 
>>distinguish it from any other collection of facts.
>
>Hmm. Surely those would be hung on rdf:subject, rdf:object and so on, 
>rather than on the reification itself? After all, part of the point of 
>reifications is that they *arent* a distinct syntax, but in fact fit 
>naturally into the RDF triples model. In your BNF, any reification can be 
>parsed as a G of As; and that shouldn't be an error, seems to me.

A point of this was to test that assumption about "reification":  to 
present it as something that has a distinguished place in the overall 
language of RDF.  I wanted to explore the idea that treating "reification" 
as a distinct syntactic class, it might be possible to define a semantics 
that allowed for extensibility.

>>There is some syntactic ambiguity here that needs to be resolved at some 
>>level;  e.g. adjusting the abstract syntax so that rdf:subject, 
>>rdf:object, rdf:predicate can appear *only* in a production for R (and 
>>not for A).
>
>Is that in fact a requirement in the M&S ? I thought that partial 
>reifications were kosher.

Well, I think so too.

One approach would be to require a complete quad if reification semantics 
(whatever they may be) are to be applicable.  Individual elements of a quad 
would be "mere" statements.

>>Terminal symbols:
>>
>>  N : Nodes      (may be represented by Qnames or URIs)
>>  L : Literals   (may be represented by strings, data: URIs or arbitrary 
>> XML elements)
>>  P : Properties (may be represented by Qnames or URIs)
>
>Is there any requirement that N and P be disjoint?

Not as stated.

>>  rdf:type       (distinguished member of Properties)
>>  rdf:subject    (distinguished member of Properties)
>>  rdf:object     (distinguished member of Properties)
>>  rdf:predicate  (distinguished member of Properties)
>>
>>  rdf:Statement  (distinguished member of Nodes)
>>
>>  [ ]            ("punctuation" literals)
>>
>>
>>Nonterminal symbols:
>>
>>  G : Graphs             (distinguished symbol of this syntax)
>
>Slightly misleading, since BNF gives the triples a 'lexical' ordering that 
>wouldnt be in the actual graph. Might be better to use a term like D for 
>'document' to avoid confusion.

Yes.

>>  S : Simple expressions (currently: "triple" or "reification")
>>  R : Reifications       (description of a Node that denotes a statement)
>>  A : Assertions         (generic ground fact, expressed as triple)
>>  V : Values             (nodes or literals)
>>
>>
>>Productions:
>>
>>::=    denotes a production in the syntax metalanguage,
>>|      denotes alternative productions in the syntax metalanguage,
>><NULL> is a placeholder for an empty sequence of symbols
>>
>>
>>  G ::= S | G G | <NULL>
>
>Should that be ::= S | S G |<NULL> ? Your way allows a set of N triples to 
>have O(N|2) alternative parsings

I was not concerned about the parsing ambiguity here (see above).  On what 
I've looked at so far it makes no difference to the semantics which parse 
tree one chooses.

>>  S ::= R | A
>>
>>  R ::= [ N rdf:type rdf:Statement ]
>>        [ N rdf:predicate P ]
>>        [ N rdf:subject N ]
>>        [ N rdf:object V ]
>>
>>  A ::= [ N P V ]
>>
>>  V ::= N | L
>
>I'd suggest a simpler BNF along the same lines, but without using R:
>
>D ::= A*
>A ::= [ N P V ]
>V ::= N | L
>
>and then if you want R, define it as a subcase of D.

Well, prompted by an earlier exchange with you, I was trying to express a 
*syntax* of RDF that was richer than the mere *triple structure*.  Maybe I 
misunderstood what you were saying, but it made sense to me (and is also 
present in the model theoretic semantics for DAML+OIL) to recognize 
specific collections or patterns of triples as distinguished syntactic 
elements.

My goal here was to apply such an approach to the very simple case of RDF 
assertions + "reification".

I think I am trying to present "reification" here as something different 
from its normal meaning in logical systems;  more a device for "nesting" 
than "quotation".

#g


------------------------------------------------------------
Graham Klyne                    Baltimore Technologies
Strategic Research              Content Security Group
<Graham.Klyne@Baltimore.com>    <http://www.mimesweeper.com>
                                 <http://www.baltimore.com>
------------------------------------------------------------
Received on Tuesday, 26 June 2001 07:15:44 UTC