- From: John Lumley <john@saxonica.com>
- Date: Wed, 23 Jul 2025 15:06:46 +0100
- To: ixml <public-ixml@w3.org>
- Message-ID: <3a8fc79f-5ab5-409d-95ef-267afbdb6210@saxonica.com>
During some discussion yesterday about the possibility of running an Earley parser in a reverse direction across the input string, I suggested that by 'reversing' an iXML grammar and running across a reversal of the input text, you should get a parse tree which is a mirror image of the normal parse tree. Using the following steps on an iXML grammar: 1. Children of only an *alt *are reversed in order 2. the characters of *@string* of a *literal* are reversed produces a new grammar. E.g. for the *address* sample: address: person, lf, street, lf, postcode, city, lf, country, lf; person, lf, street, lf, city, postcode, lf, country, lf. person: (title, S?)?, (initials; given, S), surname, S?. title: "Mr."; "Mrs."; "Dr."; "Ms.". initials: initial+. initial: LETTER, ".", S?. surname: name. given: name. -name: LETTER, letters. street: no, S?, streetname; streetname, S?, no, S?. streetname: name, S; name, S, name. city: name, S; name, S, name, S. country: name, S?; name, S, name, S?. postcode: digits, S, LETTER, LETTER, S?; LETTER, LETTER, digits, S, digit, LETTER, LETTER, S?. no: digits. -LETTER: ["A"-"Z"]. -letters: ["a"-"z"]*. -digit: ["0"-"9"]. -digits: ["0"-"9"]+. -S: " "+. -lf: -#a | -#d, -#a . we get a new grammar: address: lf, country, lf, city, postcode, lf, street, lf, person | lf, country, lf, postcode, city, lf, street, lf, person. person: S?, surname, (initials; S, given), (S?, title)?. title: ".rM"; ".srM"; ".rD"; ".sM". initials: initial+. initial: S?, ".", LETTER. surname: name. given: name. -name: letters, LETTER. street: streetname, S?, no | S?, no, S?, streetname. streetname: S, name | name, S, name. city: S, name | S, name, S, name. country: S?, name | S?, name, S, name. postcode: S?, LETTER, LETTER, S, digits | S?, LETTER, LETTER, digit, S, digits, LETTER, LETTER. no: digits. -LETTER: ["A"-"Z"]. -letters: ["a"-"z"]*. -digit: ["0"-"9"]. -digits: ["0"-"9"]+. -S: " "+. -lf: -#a | -#a, -#d. Running /backwards /over the sample: Steven Pemberton 21 Sandridge Road St Albans AL1 4BY United Kingdom Produces a tree: <address> <country>modgniK detinU</country> <postcode>YB4 1LA</postcode> <city> snablA tS</city> <street> <streetname>daoR egdirdnaS</streetname> <no>12</no> </street> <person> <surname>notrebmeP</surname> <given>nevetS</given> </person> </address> which is an exact mirror image (document order, modulo attribute order, reversed *text()|attribute()*) of the normal result: <address> <person> <given>Steven</given> <surname>Pemberton</surname> </person> <street> <no>21</no> <streetname>Sandridge Road</streetname> </street> <city>St Albans </city> <postcode>AL1 4BY</postcode> <country>United Kingdom</country> </address> This suggests that programmatic exploration of a bidirectional Earley parser (working from both ends towards a join somewhere in the middle) could be possible. -- *John Lumley* MA PhD CEng FIEE john@saxonica.com
Received on Wednesday, 23 July 2025 14:07:02 UTC