Re: The Science of Insecurity

I've been trying to persuade myself that the iXML specification grammar describes a deterministic language, but I haven't quite been able to do so yet. I'm trying to produce a version of the grammar which will work with an LR1 parser, but there are some pretty awkward corners to deal with. 


Can anyone with more knowledge of these things (Gunther? Fredrik?) put me out of my misery and tell me that they know the specification grammar does (or doesn't) recognise a deterministic language?

The grammar of ixml was originally designed to be as close as possible to LL1 *at the character level*, in order to make bootstrapping easier, and to make parsing faster, even for Earley-style parsers.


Some compromise was necessary because of the desire for ixml names to be close to XML names, a constraint that in hindsight I would have relaxed even more than we currently do, especially since we now have renaming. That desire is one of the reasons the RS rule is necessary, whereas the original design had been to make all spaces irrelevant except in strings; it also means you have to look ahead one more character.


The version rule, which we hurriedly added at a late stage without proper contemplation, completely messed this up, and is one of the reasons I would like to fix this while we still can.


Anyway, ignoring the version rule, ixml is as close to LL1 as we could get it.
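
For readers less familiar with the term, "LL1 at the character level" roughly means that at every choice point the next single character is enough to pick the alternative. The following is a minimal sketch of that check in Python, under loudly stated assumptions: the grammar is a made-up toy (it is not the ixml specification grammar), no alternative is empty, and only FIRST/FIRST overlaps are tested, so it is not a complete LL(1) test. The names GRAMMAR, AMBIGUOUS, first and first_conflicts are all hypothetical.

```python
from itertools import combinations

# Toy character-level grammar (hypothetical, NOT the ixml specification grammar).
# Each nonterminal maps to a list of alternatives; each alternative is a
# non-empty list of symbols. A symbol that is a key of the grammar is a
# nonterminal; any other symbol is a single terminal character.
GRAMMAR = {
    "item": [["name"], ["string"], ["group"]],
    "name": [["a"], ["b"]],
    "string": [['"', "a", '"']],
    "group": [["(", "item", ")"]],
}

# Same grammar plus a "keyword" alternative that starts like a name.
AMBIGUOUS = dict(
    GRAMMAR,
    item=[["name"], ["string"], ["group"], ["keyword"]],
    keyword=[["a", "b"]],
)


def first(symbol, grammar, seen=frozenset()):
    """Return the set of characters that can start `symbol`."""
    if symbol not in grammar:
        return {symbol}  # a terminal starts with itself
    if symbol in seen:
        raise ValueError(f"left recursion through {symbol!r}")
    result = set()
    for alt in grammar[symbol]:
        # Alternatives are assumed non-empty, so the first symbol decides.
        result |= first(alt[0], grammar, seen | {symbol})
    return result


def first_conflicts(grammar):
    """List (rule, alt_i, alt_j, shared_chars) where one character cannot decide."""
    conflicts = []
    for nonterminal, alts in grammar.items():
        firsts = [first(alt[0], grammar) for alt in alts]
        for (i, a), (j, b) in combinations(enumerate(firsts), 2):
            overlap = a & b
            if overlap:
                conflicts.append((nonterminal, i, j, overlap))
    return conflicts


print(first_conflicts(GRAMMAR))
print(first_conflicts(AMBIGUOUS))
```

In the second toy grammar, a parser that sees "a" cannot tell a name from a keyword without looking one character further, which is the kind of extra lookahead described above.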


Best wishes,


Steven




BTW




****************************************************

Dr. Bethan Tovey-Walsh

linguacelta.com <http://linguacelta.com/>

Golygydd | Editor geirfan.cymru <http://geirfan.cymru/>

Croeso i chi ysgrifennu ataf yn y Gymraeg (You are welcome to write to me in Welsh)



On 31 Jan 2025, at 15:30, Norm Tovey-Walsh <norm@saxonica.com> wrote:


Steven Pemberton <steven.pemberton@cwi.nl> writes:

Similarly, if I process a nondeterministic language with ixml, it is the language that is nondeterministic; ixml doesn't add to the nondeterminism.


Perhaps I can see what you mean. (But I might be wrong.)

I think many users who feed a grammar and an input into a processor and get some output think that “iXML did that”. If there was nondeterminism in their grammar, it was iXML that did the nondeterminism.

You might be arguing: “No, no, no, that’s not the case. The iXML processor uses the specification grammar (which we believe to be deterministic) to parse the user’s input grammar. What the user does with that output and how their implementation might use that to parse some other input string is none of our concern. That’s not iXML.”

I dunno.

                                       Be seeing you,
                                         norm

--
Norm Tovey-Walsh
Saxonica

Received on Monday, 3 February 2025 12:27:17 UTC