- From: James Fuller <jim@webcomposite.com>
- Date: Mon, 24 Sep 2012 14:43:39 +0200
- To: James Clark <jjc@jclark.com>
- Cc: public-microxml@w3.org
On Mon, Sep 24, 2012 at 2:38 PM, James Clark <jjc@jclark.com> wrote:
> REx looks quite cool. Did you have to modify the grammar in the spec at all
> to get REx to accept it?

Only slightly, and I rearranged things to fit what REx requires (though I think I need to understand a bit more about how REx handles the whitespace definition).

> There are very few requirements that aren't expressed in the syntax:
>
> - name in end-tag must match name in start-tag
> - no duplicate attributes
> - referent of a numeric character ref must match char production

Good points!

> What difference exactly in the behaviour of the parser does <?TOKENS?> make?

The part preceding <?TOKENS?> is the syntax (parser rules), which is subject to LL(k) parser generation. The part following <?TOKENS?> is the 'lexer definition', which goes into a DFA construction.

The following constructs are allowed in the lexer definition:

- character codes, e.g. #xFEFF
- character sets, e.g. [0-9a-fA-F]
- "subtraction", e.g. char - ('<'|'&'|'>')
- lexical lookahead ("&" and "\\" operators)
- token preference definitions ("<<" and ">>" operators)

According to Gunther, the lexer definition does not support:

- recursive rules (because of the DFA)
- the "ordered choice" operator: "/"

Yes, REx is cool …

Jim Fuller
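
The three requirements quoted above that the grammar cannot express could be checked in a pass over the parser's output. What follows is only a rough sketch under stated assumptions: the event tuples, `check_events` and `placeholder_is_char` are hypothetical names made up for illustration, not anything REx generates, and the character test is a stand-in that should be replaced by the real char production from the spec.

```python
def placeholder_is_char(cp):
    """Stand-in character test. ASSUMPTION: this is NOT the spec's char
    production, only an approximation so that the sketch runs."""
    if cp in (0x9, 0xA, 0xD):
        return True
    if 0xD800 <= cp <= 0xDFFF or cp in (0xFFFE, 0xFFFF):
        return False
    return 0x20 <= cp <= 0x10FFFF


def check_events(events, is_char=placeholder_is_char):
    """Check the constraints that the grammar itself does not enforce.

    events is an iterable of hypothetical tuples:
      ("start", name, [attribute names])
      ("end", name)
      ("charref", code_point)
    Returns a list of error messages (empty if all checks pass).
    """
    errors = []
    open_elements = []  # stack of start-tag names still awaiting an end-tag
    for event in events:
        kind = event[0]
        if kind == "start":
            _, name, attributes = event
            seen = set()
            for attribute in attributes:
                # no duplicate attributes
                if attribute in seen:
                    errors.append(f"duplicate attribute {attribute!r} on <{name}>")
                seen.add(attribute)
            open_elements.append(name)
        elif kind == "end":
            name = event[1]
            if not open_elements:
                errors.append(f"end-tag </{name}> has no start-tag")
            else:
                # name in end-tag must match name in start-tag
                expected = open_elements.pop()
                if expected != name:
                    errors.append(
                        f"end-tag </{name}> does not match start-tag <{expected}>"
                    )
        elif kind == "charref":
            # referent of a numeric character ref must match char
            code_point = event[1]
            if not is_char(code_point):
                errors.append(f"character reference &#x{code_point:X}; is not a char")
    return errors
```

For example, `check_events([("start", "a", ["x", "x"]), ("end", "b")])` reports one duplicate attribute and one mismatched end-tag. Keeping these checks in a separate pass leaves the grammar itself purely syntactic, which seems to be what the LL(k) and DFA split wants anyway.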
Received on Monday, 24 September 2012 12:44:12 UTC