- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Mon, 09 Jan 2006 19:35:14 +0100
- To: www-archive@w3.org
- Message-ID: <uga5s1tsudg983g5ru41bn3b9nhnha197k@hive.bjoern.hoehrmann.de>
Hi,

The attached code converts an EBNF grammar into a regular expression. It uses code from http://www.w3.org/1999/02/26-modules/User/Yacker and its dependencies.

The original grammar had

  begin_value_list ::= begin_value ( S* ';' S* begin_value_list )?

which is equivalent to

  begin_value_list ::= begin_value ( S* ';' S* begin_value )*

The tool should have some way to make this replacement automatically.

Also, the @terminals gizmo should not be in the grammar, and neither should the fake symbol (which is needed because @terminals is not allowed at the top...). The tool currently just takes the second symbol from the grammar and converts that; it should probably find the start symbol automatically and offer an option to select the start symbol you are interested in. It does not detect circular references either...

In the end it would be good if it could approximate the regex, for example when the grammar uses middle recursion, which cannot be expressed in regular expressions. This could be handled by replacing the middle recursion with e.g. OR'd terms or something.

The sample grammar is the begin-value-list for the begin="" attribute on the SVG 1.1 animation elements, except that some things are derived from SMIL 2.1 rather than SMIL Animation, and some bugs in the specs had to be fixed... Also, it uses the XML 1.1 name productions, as those are much shorter than the XML 1.0 variants.

The other attached file has the resulting regular expression after processing with regex-opt plus my experimental Unicode support for the tool (which shrinks the regex to 1/3 in size).

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
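As a rough illustration of two of the points above (rewriting the right-recursive begin_value_list rule as a repetition, and detecting circular references while expanding), here is a minimal Python sketch. It is not the attached Yacker-based code. The `<name>` placeholder notation, the function names `unroll_tail_recursion` and `expand`, and the toy rule bodies for begin_value and S are all assumptions made up for this example; the real productions are much longer.

```python
import re

# Nonterminal references are written as <name> inside otherwise ordinary
# regex text. This notation is an assumption of this sketch, not Yacker's,
# and it would clash with regex syntax such as (?P<name>...).
REF = re.compile(r"<(\w+)>")

def unroll_tail_recursion(name, body):
    """Rewrite  name ::= <head>(?:sep<name>)?  as  <head>(?:sep<head>)*.

    Only this exact shape is handled: a single <head> reference followed by
    one optional group without nested parentheses that ends in <name>.
    Any other body is returned unchanged.
    """
    m = re.fullmatch(rf"(<\w+>)\(\?:([^()]*)<{re.escape(name)}>\)\?", body)
    if m is None:
        return body
    head, sep = m.groups()
    self_ref = f"<{name}>"
    if head == self_ref or self_ref in sep:
        return body  # not simple right recursion; leave it alone
    return f"{head}(?:{sep}{head})*"

def expand(grammar, symbol, stack=()):
    """Substitute references recursively; raise on a circular reference."""
    if symbol in stack:
        path = " -> ".join(stack[stack.index(symbol):] + (symbol,))
        raise ValueError(f"circular reference: {path}")
    inner = stack + (symbol,)
    return REF.sub(
        lambda m: "(?:" + expand(grammar, m.group(1), inner) + ")",
        grammar[symbol],
    )

# Toy grammar: begin_value and S are hypothetical stand-ins, not the
# productions from the SVG or SMIL specifications.
TOY = {
    "begin_value_list": "<begin_value>(?:<S>*;<S>*<begin_value_list>)?",
    "begin_value": r"[0-9]+s",
    "S": r"[ \t\r\n]",
}

# Apply the rewrite to the recursive rule, then expand to a plain regex.
FIXED = dict(TOY)
FIXED["begin_value_list"] = unroll_tail_recursion(
    "begin_value_list", TOY["begin_value_list"]
)
BEGIN_VALUE_LIST_RE = re.compile(expand(FIXED, "begin_value_list"))
```

Calling `expand(TOY, "begin_value_list")` on the unmodified toy grammar raises `ValueError`, since the rule refers to itself. After the rewrite, the compiled pattern can be used with `BEGIN_VALUE_LIST_RE.fullmatch(...)`. Middle recursion is deliberately out of scope here; approximating it would need a different approach.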
Attachments
- text/plain attachment: yacker-ebnf2re.txt
- application/octet-stream attachment: begin-value-optimized.txt
Received on Monday, 9 January 2006 18:34:49 UTC