ebnf2re

Hi,

  The attached code converts an EBNF grammar into a regular expression.
It uses code from http://www.w3.org/1999/02/26-modules/User/Yacker and
its dependencies. The original grammar had

  begin_value_list ::= begin_value ( S* ';' S* begin_value_list )?

which is equivalent to

  begin_value_list ::= begin_value ( S* ';' S* begin_value )*
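
A rough sketch of how that replacement could be automated, in Python rather than in the code the tool is built on. The tuple-based tree and the names `mentions`, `unroll_tail_recursion` and `show` are assumptions made up for this illustration, not anything Yacker provides. It only handles the one shape shown above, `A ::= head ( sep A )?`, and leaves every other rule untouched.

```python
# Hypothetical mini-AST (not Yacker's): every node is a (kind, value) tuple.
#   ("sym", name)  ("lit", text)  ("seq", [nodes])  ("opt", node)  ("star", node)

def mentions(node, name):
    """Return True if the symbol `name` occurs anywhere inside `node`."""
    kind, value = node
    if kind == "sym":
        return value == name
    if kind == "lit":
        return False
    if kind == "seq":
        return any(mentions(child, name) for child in value)
    return mentions(value, name)  # "opt" and "star" wrap a single node


def unroll_tail_recursion(name, body):
    """Rewrite  name ::= head ( sep name )?  as  head ( sep head )*.

    Any rule that does not have exactly this shape is returned unchanged.
    """
    kind, items = body
    if kind != "seq" or len(items) < 2:
        return body
    head, last = items[:-1], items[-1]
    if last[0] != "opt" or last[1][0] != "seq":
        return body
    inner = last[1][1]
    if not inner or inner[-1] != ("sym", name):
        return body
    sep = inner[:-1]
    # The recursion must sit only at the very end of the rule.
    if any(mentions(node, name) for node in head + sep):
        return body
    return ("seq", head + [("star", ("seq", sep + head))])


def show(node):
    """Print a node back in the EBNF notation used above."""
    kind, value = node
    if kind == "sym":
        return value
    if kind == "lit":
        return "'" + value + "'"
    if kind == "seq":
        return " ".join(show(child) for child in value)
    suffix = "?" if kind == "opt" else "*"
    if value[0] == "seq":
        return "( " + show(value) + " )" + suffix
    return show(value) + suffix


rule = ("seq", [
    ("sym", "begin_value"),
    ("opt", ("seq", [
        ("star", ("sym", "S")), ("lit", ";"), ("star", ("sym", "S")),
        ("sym", "begin_value_list"),
    ])),
])
print("begin_value_list ::=", show(unroll_tail_recursion("begin_value_list", rule)))
```

The rewrite is safe here because the recursive reference is the last item of the optional group, so the rule is right-linear in itself.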

There should be some way to perform this replacement automatically. Also,
the @terminals gizmo should not be in the grammar, and neither should the
fake symbol (which is needed because @terminals is not allowed at the
top...). The tool currently simply converts the second symbol in the
grammar; it should probably find the start symbol automatically and offer
some option to select the start symbol you are interested in. It does
not detect circular references either...
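
Both missing pieces could be sketched along these lines, again in Python and under stated assumptions: the grammar is reduced to a hypothetical dict `refs` that maps each rule name to the set of symbols its right-hand side references, and `find_start_symbols` and `find_cycle` are names invented for this example. A start symbol is taken to be any rule that no other rule references; if the whole grammar is one big cycle, no candidate is found and the symbol would still have to be chosen by hand.

```python
def find_start_symbols(refs):
    """Return the rules that no *other* rule references, in definition order."""
    referenced = set()
    for name, targets in refs.items():
        # A rule referencing itself does not stop it from being a start symbol.
        referenced.update(t for t in targets if t != name)
    return [name for name in refs if name not in referenced]


def find_cycle(refs, start):
    """Depth-first search from `start`; return one cycle as a list, or None."""
    state = {}   # symbol -> "active" (on the current path) or "done"
    path = []

    def visit(name):
        state[name] = "active"
        path.append(name)
        for target in sorted(refs.get(name, ())):
            if state.get(target) == "active":
                # Found a back edge: the cycle is the path from target onwards.
                return path[path.index(target):] + [target]
            if target not in state:
                cycle = visit(target)
                if cycle:
                    return cycle
        path.pop()
        state[name] = "done"
        return None

    return visit(start)


# The original, recursive form of the sample rule (simplified dependencies).
recursive = {
    "begin_value_list": {"begin_value", "S", "begin_value_list"},
    "begin_value": {"S"},
    "S": set(),
}
# The same grammar after the recursion has been replaced by ( ... )*.
unrolled = dict(recursive, begin_value_list={"begin_value", "S"})

print(find_start_symbols(recursive))
print(find_cycle(recursive, "begin_value_list"))
print(find_cycle(unrolled, "begin_value_list"))
```

A grammar for which `find_cycle` returns `None` can be converted by plain substitution; anything else needs a rewrite such as the one for the list rule first.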

In the end it would be good if it could approximate the regex, for
instance when the grammar uses middle recursion, which cannot be
expressed in regular expressions. This could be handled by replacing the
middle recursion with e.g. OR'd terms or something.
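
One possible reading of "OR'd terms" is to expand the recursion up to a fixed nesting depth, so that each level contributes one alternative. The sketch below does that for a rule of the form `P ::= open P close | base`; the function name `approximate` and the parenthesis example are assumptions chosen for illustration, and the result is by construction only an approximation that rejects anything nested deeper than `depth`.

```python
import re


def approximate(open_re, base_re, close_re, depth):
    """Regex for  P ::= open P close | base  limited to `depth` nestings."""
    pattern = "(?:%s)" % base_re
    for _ in range(depth):
        # Each round adds one more allowed level of nesting as an alternative.
        pattern = "(?:(?:%s)|(?:%s)%s(?:%s))" % (base_re, open_re, pattern, close_re)
    return pattern


# Balanced parentheses around a single 'x', nested at most twice.
nested = re.compile(approximate(r"\(", "x", r"\)", 2))

for text in ["x", "(x)", "((x))", "(((x)))", "((x)"]:
    print(text, bool(nested.fullmatch(text)))
```

The pattern grows linearly with `depth` for this rule shape, so a generous limit stays manageable, but the limit has to be picked by whoever runs the conversion.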

The sample grammar is the begin-value-list for the begin="" attribute on
the SVG 1.1 animation elements, except that some things are derived from
SMIL 2.1 rather than SMIL Animation and some bugs in the specs had to be
fixed... Also, it uses the XML 1.1 name productions as those are much
shorter than the XML 1.0 variants.

The other attached file has the resulting regular expression after
processing with regex-opt plus my experimental Unicode support for the
tool (which shrinks the regex to 1/3 in size).

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Monday, 9 January 2006 18:34:49 UTC