- From: Steven Pemberton <steven.pemberton@cwi.nl>
- Date: Tue, 16 Apr 2024 12:00:45 +0000
- To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Cc: public-ixml@w3.org
Ooh, I'm sorry Michael. I know my comments on documents can be a bit terse.
I hope they didn't sound unfriendly; it wasn't the intention.
I saw your email come in just before I switched off for bed. I'm glad I
didn't read it then. Maybe I won't read it at all...
Steven
On Tuesday 16 April 2024 02:21:21 (+02:00), C. M. Sperberg-McQueen wrote:
> My comments, in turn, elicit a further remark or two.
>
> Re-reading what I sent earlier today, I am a little disappointed by the
> tetchiness of my response to Steven. My apologies -- I was on edge when
> I wrote, for other reasons, and it seems to have had an unfortunate
> effect.
>
> It's clear that the current framing of the document struck Steven as
> slightly off and maybe a little off-putting -- good reason to re-think
> the framing of the document. The same is true for the details Steven
> commented on. That is, when viewed correctly, very helpful feedback.
> I'll need to think about how to address the comments, but that's not a
> reason to wave them off.
>
> Thank you, Steven. I appreciate the detailed and helpful response. I
> apologize for my earlier defensiveness.
>
> Michael
>
>
> "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com> writes:
>
> > Steven's comments are welcome; they elicit a few comments in return.
> >
> > Steven Pemberton <steven.pemberton@cwi.nl> writes:
> >
> >> https://github.com/invisibleXML/ixml/blob/master/misc/ebnf-to-bnf.md
> >>
> >> General comments:
> >>
> >> I appreciate the approach of this article, though it does reflect a
> >> different idea to mine of what is behind rewriting rules.
> >>
> >> Parsing algorithms don't need to deal with EBNF extensions, because
> >> all the extensions are only syntactic differences intended to make
> >> life easier for the author and reader, and all grammars written in
> >> EBNF can be rewritten, as this article shows, to equivalent BNF
forms.
> >
> > This all seems true enough; I'm not quite sure whether it reflects a
> > different idea about what is behind rewriting rules or not.
> >
> > What would it mean to say that parsing algorithms "need to" deal with
> > EBNF? Published algorithms assume grammars in a particular form,
> > typically BNF or something isomorphic to it. Nothing we can say or do
> > will make the editors of those journals return the manuacripts to the
> > authors asking that they revise the article to account for a different
> > input format for grammars.
> >
> > For a potential implementer of ixml, on the other hand, it seems
> > passably clear (a) what it means to "deal with" EBNF, and (b) that
they
> > "need" to do so. Rewriting the input grammar into BNF and using a
> > published algorithm for parsing BNF is one way they can perform their
> > task. Perhaps I am unnecessarily pedantic on this point, but
rewriting
> > to BNF is not the only possible approach to implementing ixml.
> >
> >> Accordingly, I consider the rewriting as equivalent to code
> >> generation, and is indeed how I treat it in my implementation.
> >
> > I'm puzzled here in part because in past discussions of ambiguity you
> > have denied that you rewrite the grammar, or that the way in which you
> > parse is equivalent to parsing against a particular BNF equivalent to
> > the input grammar.
> >
> >> So
> >> parse algorithms don't need to deal with EBNF constructs for the same
> >> reason that machine opcodes don't have to deal with WHILE loops. The
> >> parse algorithms remain simple, while the grammar is written in a
> >> higher-level language.
> >
> >> So I consider the rewrite rules purely as an implementation
> >> detail.
> >
> > Agreed there. The only requirement imposed by the ixml spec is that
the
> > parser produce the correct results. (And since the meaning of ixml
> > constructs is defined only by rather informal prose, and there is no
> > normative translation to BNF to connect it to any more formal accounts
> > of parsing, it is not always obvious what a phrase like "the correct
> > results" means in particular cases.)
> >
> >> However, the real point of this article I think is in the
> >> conclusion, that some rewrites are faster than others, and
> >> implementers should experiment.
> >
> > Yes. If it were not for the fact that the choice of rewrite rules has
> > pragmatic consequences which have proved non-obvious to multiple
> > implementers, I don't think this document would exist.
> >
> >> Two points that make the article harder to read:
> >>
> >> 1) The "|" separator for alternatives was an addition made to BNF
from
> >> Chomsky's notation, and has by definition no inherent
> >> ordering. Alternative rewrites such as A: B | C. and A: C |
> >> B. confused me since they are equivalent.
> >
> > What does "equivalent" mean here?
> >
> > They recognize the same language, but they do not produce the same
BNF.
> >
> > All of the rewrites described here recognize the same language; each
of
> > them produces a different BNF.
> >
> > At the level of abstraction sometimes adopted in automata theory,
where
> > the right-hand sides of a rule form a set, not a sequence, the two
rules
> > in question are the same. But at that level of abstraction the rule
"A:
> > B | B | C." is not ambiguous (unless "A: B|C." also is), though I
> > suspect most ixml grammars will say that it is.
> >
> >> If there is a reason to give
> >> them, it should be explained, and it would be far better right at the
> >> beginning to say that there is such an equivalence, which can be
> >> applied everywhere. This would make a lot of the examples easier to
> >> read because there would be no need for this repetition.
> >
> > Is it clear that a BNF produced using the rewrite "A: B|C." and one
> > produced using "A: C|B." will always have the same pragmatic
> > characteristics? In particular: time to first parse? time to final
> > output?
> >
> > It's not clear to me.
> >
> > For a backtracking parser, the time to first parse may be quite
> > different for the two rewrites. (Or so I speculate, on the basis of a
> > little experience with backtracking parsers.) For an Earley parser,
> > probably not.
> >
> >> 2) There is a implication (for instance in (4a) that bracketing is
BNF
> >> ("The following BNF grammar ... S = ('a' | -S), 'b'.) However,
> >> bracketing is EBNF, and later examples of how to deal with it are
> >> given, so any rewrites using bracketing are only EBNF -> EBNF, and
> >> this should be made clearer in the text.
> >
> > I think you mean O4a (with a capital O, not a zero), and thank you,
good
> > catch. I think the example should probably be rewritten as
> >
> > S = 'a', 'b' | -S, 'b'.
> >
> >> Detailed Comments
> >
> >> "textual insertions" This is an odd thing, because ixml grammars
> >> specify both input and output, and I don't consider insertions, or
> >> marks for that matter, as part of the grammar, but part of the output
> >> specification.
> >
> > I think that in an ixml context, the term 'grammar' will for many
> > readers mean whatever is in the input grammar submitted to the
> > processor, and not just the subset of it that determines whether input
> > is accepted or not. Your sentence illustrates the ambiguity: the
first
> > occurrence of 'grammar' and the second occurrence have different
senses.
> >
> >> Therefore I don't think of them as extensions to EBNF
> >> in the sense suggested here. However, I do agree that repetition with
> >> separators are an addition to EBNF. "Nested choices". I think this is
> >> what I mean by bracketing, but in any case an example would help
here.
> >>
> >> (O1a)(O1b) This is an example of giving the same expansion twice as
if
> >> | isn't commutative. (PP2) The sentence is missing something.
> >>
> >> Steven
>
>
Received on Tuesday, 16 April 2024 12:00:58 UTC