BNF (and XML) grammar extensibility

Following up on today's discussion at the telecon, let me explain what I
think is a fundamental issue with respect to the BNF/XML syntax and why I
am against a quick-and-dirty fix to the current BNF/XML grammars.

We need to make sure that the XML syntax of BLD is compatible with the
future possible dialects that extend BLD, i.e., that BLD XML docs will be
valid rulesets in, say, some logic programming dialect that extends
BLD. The only sure way to achieve this is to define XML for FLD and show
how XML for BLD can be obtained by restriction.

Since XML syntax is a bit too unwieldy, I think BNF is a good way to try
the idea.

Jos' proposal was a good start, but it had problems: Some productions were
redundant and the names of the nonterminals were inappropriate (and some
redefined in BLD).

Since we derive XML from BNF (a good thing IMO), capitalized names of the
nonterminals (like "Rule") will end up in XML tags. XML tags should be
mnemonically natural for the dialect at hand. So, in BLD we should have
"Rule" and "Ruleset", while in FLD (and a future FOL) it should be
"Formula" and "Formulaset". In some dialects there are constraints, which
cannot be properly called "rules".  The question is how to reconcile
these requirements.

The following rough sketch of my idea borrows a trick from how subtypes are
commonly defined and introduces this concept into the definition of BNF
(under the name of "specialization").
I am not a hacker of XML Schema, but I think that in XML this subtyping can
be implemented using XSD import and restrictions.

Back to BNF. Let's say that a nonterminal NT' is a specialization of another
nonterminal, NT, defined by a production NT := Alt_1 | ... | Alt_n, if

  a. NT' is one of Alt_1, ..., Alt_n; or
  b. NT' is defined by a production

       NT'  := Alt'_1 | ... | Alt'_k

     and for each Alt'_j there is an Alt_i such that Alt'_j is a
     specialization of Alt_i or Alt'_j = Alt_i.
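
To make the definition concrete, here is a minimal Python sketch of a
specialization check. It is an illustration only, under assumptions that the
definition above does not spell out: an alternative is a flat sequence of
symbols, two alternatives are compared position by position, and a pair of
nonterminals that is already under test is taken to hold (so that recursive
productions terminate). The toy FLD/BLD fragments and the function names are
hypothetical and much smaller than the real grammars.

```python
# Grammar: nonterminal -> list of alternatives; an alternative is a tuple of
# symbols. Any symbol that is not a key of the grammar counts as a terminal.
FLD = {
    "Formula": [("'Formula('", "FORMULACONTENT", "')'")],
    "FORMULACONTENT": [
        ("'And'", "'('", "FORMULACONTENT", "')'"),
        ("'Or'", "'('", "FORMULACONTENT", "')'"),
        ("'Neg'", "FORMULACONTENT"),
        ("ATOMIC",),
    ],
    "ATOMIC": [("Predicate",), ("Equal",)],
}

BLD = {
    "Rule": [("'Rule('", "CONDITION", "')'")],
    "CONDITION": [
        ("'And'", "'('", "CONDITION", "')'"),
        ("'Or'", "'('", "CONDITION", "')'"),
        ("BLDATOMIC",),
    ],
    "BLDATOMIC": [("Predicate",), ("Equal",)],
}


def is_specialization(sub, sup, g_sub, g_sup, _assumed=None):
    """Return True if nonterminal `sub` (in g_sub) specializes `sup` (in g_sup)."""
    assumed = _assumed if _assumed is not None else set()
    if (sub, sup) in assumed:
        # Assumption: a pair already being tested is treated as holding,
        # which lets recursive productions such as CONDITION terminate.
        return True
    if sup not in g_sup:
        return False  # `sup` is a terminal or undefined
    sup_alts = g_sup[sup]
    # Case (a): `sub` is itself one of the alternatives of `sup`.
    if (sub,) in sup_alts:
        return True
    # Case (b): every alternative of `sub` is matched by some alternative of `sup`.
    if sub not in g_sub:
        return False
    assumed = assumed | {(sub, sup)}
    return all(
        any(alt_matches(a, b, g_sub, g_sup, assumed) for b in sup_alts)
        for a in g_sub[sub]
    )


def alt_matches(a_sub, a_sup, g_sub, g_sup, assumed):
    """An alternative matches if it is equal, or specializes symbol by symbol."""
    if a_sub == a_sup:
        return True
    if len(a_sub) != len(a_sup):
        return False
    return all(
        s == t or is_specialization(s, t, g_sub, g_sup, assumed)
        for s, t in zip(a_sub, a_sup)
    )


print(is_specialization("CONDITION", "FORMULACONTENT", BLD, FLD))  # True
print(is_specialization("ATOMIC", "FORMULACONTENT", FLD, FLD))     # True, by case (a)
print(is_specialization("Rule", "Formula", BLD, FLD))              # False
```

The last call fails only because the terminals 'Rule(' and 'Formula(' differ;
a purely syntactic check of this kind is strict about terminals unless they
are made to agree or are exempted explicitly.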

I attempted to sketch a grammar that is based on this idea in the
attachment. This may need some refinement. For example, Rule is not quite a
specialization of Formula because 'Rule(' and 'Formula(' do not match.
But this is the only small problem, I think. It can be worked out by either
changing 'Rule(' to 'Formula(' or by changing the definitions. (These
'Formula(' and 'Rule(' will not show up in XML anyway.)

Here is a modification of Jos' modification of Harold's based on the above
ideas. The purpose is to illustrate the idea -- I am not married to the
particular names or style, and there might be a simpler way.


FLD Grammar:

   Formulaset     ::= 'RIFSet(' absolute-IRI? Metadata* Formula* ')'
   Formula        ::= 'Formula(' FORMULACONTENT ')'

   FORMULACONTENT ::= 'And' '(' FORMULACONTENT* ')' |
                      'Or' '(' FORMULACONTENT* ')' |
                      FORMULACONTENT ':-' FORMULACONTENT |
                      'Exists' Var+ '(' FORMULACONTENT ')' |
                      'Forall' Var+ '(' FORMULACONTENT ')' |
                      'Neg' FORMULACONTENT |
                      'Naf' FORMULACONTENT |
                      ATOMIC

   ATOMIC         ::= Predicate | Equal | Member | Subclass | Frame
   Predicate      ::= UNITERM | 'Builtin ( ' UNITERM ' ) '
   Equal          ::= TERM '=' TERM
   Member         ::= TERM '#' TERM
   Subclass       ::= TERM '##' TERM
   Frame          ::= TERM '[' (TERM '->' TERM)* ']'

   TERM           ::= Const | Var | Function | Equal | Member | Subclass | Frame
   Function       ::= UNITERM | 'Builtin ( ' UNITERM ' ) '

   UNITERM        ::= TERM '(' (TERM* | (LITERAL '->' TERM)*) ')'

   Metadata       ::= ' Metadata ( ' METADATALIST ' ) '
   METADATALIST   ::= absolute-IRI METADATAVALUE | METADATALIST ' ; ' METADATALIST
   METADATAVALUE  ::= Const | ' [] ' | ' [ ' METADATALIST ' ] '

   Const          ::= '"' LITERAL '"^^' SYMSPACE
   Var            ::= '?' LITERAL
   SYMSPACE       ::= absolute-IRI




BLD grammar:

  CONDITION and RULECONTENT are specializations of FORMULACONTENT
  (ATOMIC is also a specialization of FORMULACONTENT)
  Rule is a specialization of Formula
  Ruleset is a specialization of Formulaset
  BLDTERM is a specialization of TERM
  BLDUNITERM is a specialization of UNITERM
  BLDATOMIC is a specialization of ATOMIC
  BLDPredicate, BLDEqual, BLDMember, BLDSubclass, BLDFrame are specializations
            of Predicate, Equal, Member, Subclass, Frame, respectively.

  "Specialization" here means the following. A nonterminal NT' is a
  specialization of another nonterminal, NT, if

    NT := Alt_1 | ... | Alt_n

  and either NT' is one of the Alt_i's or

    NT'  := Alt'_1 | ... | Alt'_k

  and for each Alt'_j there is Alt_i such that Alt'_j is a
  specialization of Alt_i or Alt'_j = Alt_i.
  This is similar to the definition of subtyping.


   Ruleset        ::= 'RIFSet(' absolute-IRI? Metadata* Rule* ')'
   Rule           ::= 'Rule(' absolute-IRI? Metadata* RULECONTENT ' ) '
   RULECONTENT    ::= 'Forall' Var+ '(' BLDATOMIC (':-' CONDITION)? ')' | BLDATOMIC (':-' CONDITION)?

   CONDITION      ::= 'And' '(' CONDITION* ')' |
                      'Or' '(' CONDITION* ')' |
                      'Exists' Var+ '(' CONDITION ')' |
                      BLDATOMIC

   BLDATOMIC      ::= BLDPredicate | BLDEqual | BLDMember | BLDSubclass | BLDFrame
   BLDPredicate   ::= BLDUNITERM | 'Builtin ( ' BLDUNITERM ' ) '
   BLDEqual       ::= BLDTERM '=' BLDTERM
   BLDMember      ::= BLDTERM '#' BLDTERM
   BLDSubclass    ::= BLDTERM '##' BLDTERM
   BLDFrame       ::= BLDTERM '[' (BLDTERM '->' BLDTERM)* ']'

   BLDTERM        ::= Const | Var | BLDFunction
   BLDFunction    ::= BLDUNITERM | 'Builtin ( ' BLDUNITERM ' ) '

   BLDUNITERM     ::= Const '(' (BLDTERM* | (LITERAL '->' BLDTERM)*) ')'

   // from here on the grammar is the same as in FLD
   Metadata       ::= ' Metadata ( ' METADATALIST ' ) '
   METADATALIST   ::= absolute-IRI METADATAVALUE | METADATALIST ' ; ' METADATALIST
   METADATAVALUE  ::= Const | ' [] ' | ' [ ' METADATALIST ' ] '

   Const          ::= '"' LITERAL '"^^' SYMSPACE
   Var            ::= '?' LITERAL
   SYMSPACE       ::= absolute-IRI

Received on Tuesday, 4 March 2008 18:29:52 UTC