what I learned from today's discussion of delimiters from C. M. Sperberg-McQueen on 2022-01-26 (public-ixml@w3.org from January 2022)

From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
Date: Tue, 25 Jan 2022 21:26:50 -0700
To: public-ixml@w3.org
Message-ID: <87mtjjf7g5.fsf@blackmesatech.com>

Reflecting on today's discussion of delimiters, it occurs to me that it
might be worth identifying some of the things that I think became clear
today.

(1) First, I think everyone agreed more or less clearly on the principle
    that pragmas and comments should be grammatically distinct (as they
    are both in the pragmas proposal Tom Hillman and I put forward and
    in Steven Pemberton's 'strawman' sketch).  That's worth something.

(2) From that, it follows that by any plausible measure of complexity or
    work, the choice of delimiters for pragmas will have no appreciable
    effect on the complexity of the ixml grammar or the amount of work a
    parser will do in parsing an ixml grammar.

The grammar will not be shorter or longer, simpler or more complex,
based on our choice of delimiters.  The size of the grammar will be
affected by our choices on the internal structure of pragmas and on
where pragmas fit into the larger structures of the ixml grammar for
ixml grammars, but not by our choice of delimiters.

Our choice of delimiters will for that reason not make the task of
parsing an ixml grammar simpler or more complex for an ixml processor:
it will have essentially the same amount of work to do regardless of the
delimiters chosen, assuming that we have the common sense to refrain
from choosing delimiters which introduce either ambiguity or garden-path
structures into the grammar.

We mentioned briefly today the idea that a processor that does not
implement any pragmas could work internally with a modified version of
the ixml grammar that did treat pragmas as a kind of comment.  This idea
does not, I think, stand up under closer examination.  To the extent
that pragmas have any internal structure, any conforming processor must
check that internal structure, since our spec requires that
nonconforming grammars be rejected.  To the extent that pragmas have no
internal structure, or a very simple one, the gain in simplicity from
using a modified grammar would be negligible.

The proposition that things will be simpler if pragmas visually resemble
comments (using braces as the first character in their start delimiter
and the last character in their end delimiter) may be true for some
human readers, but it will not be true for the ixml grammar or for
conforming ixml processors.

(3) Neither using braces in pragma delimiters nor avoiding them will
    have any major technical effect on the spec.

As just discussed, neither choice makes a difference to the complexity
of the grammar or the parsing task.

Steven, as the leader of the pro-braces camp, declined in today's
discussion to claim that any particular technical problems would arise
if the delimiters did not use braces.

On the other side, the only technical arguments brought forward were (a)
the observation that under the braces option a typo in a pragma might
lead to its being misread as a comment, and (b) the observation that
single-character delimiters will make pragmas slightly more lightweight
and thus make them easier to use in situations that require numerous
pragmas (e.g. multiple annotations on individual symbols in the
right-hand side of a grammar rule).

Observation (a) is correct, I think, but applies only in a very small
number of cases.  Under the TM proposal, for example, a pragma might
take the form

    [a:x b c]

or (using different delimiters, since no one is holding out for
brackets)

    ¿a:x b c?

Under the SP strawman proposal, an analogous pragma might take
the form

    {*ax b c}

Here the omission of the asterisk or its replacement with another
character (say 8) will, as suggested, lead to {8ax b c} being misread as
a comment and not flagged as an error.  But if the same errors (omission
or replacement) are made in either brace, the error will be detected
regardless. And if errors are made in the pragma body (e.g. the omission
of the blank between b and c, or the replacement of either with some
other character), the error will be detected only by a processor that
understands the pragma in question (and possibly not even then).  So
although I admire the ingenuity of the argument, I don't think anyone
will argue that it's conclusive.
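
To make observation (a) concrete, here is a minimal Python sketch.  It
is not an ixml parser: the function names classify_braces and
classify_brackets are hypothetical, comments are assumed not to nest,
and the pragma body is never inspected.  It only shows which delimiter
typos would be caught under each of the two conventions.

```python
def classify_braces(text: str) -> str:
    """Braces convention (assumed): {*...} is a pragma, {...} is a comment."""
    if len(text) >= 2 and text.startswith("{") and text.endswith("}"):
        # Only the character after the opening brace separates the two,
        # so losing or changing the asterisk silently yields a comment.
        return "pragma" if text.startswith("{*") else "comment"
    return "error"


def classify_brackets(text: str) -> str:
    """Distinct-delimiter convention (assumed): [...] is a pragma, {...} a comment."""
    if len(text) >= 2 and text.startswith("[") and text.endswith("]"):
        return "pragma"
    if len(text) >= 2 and text.startswith("{") and text.endswith("}"):
        return "comment"
    # Mismatched or missing delimiters are rejected.
    return "error"
```

Under these assumptions, classify_braces("{8ax b c}") returns "comment",
while a damaged brace such as "{*ax b c" returns "error".  With the
bracket convention, damaged delimiters such as "a:x b c]" or "{a:x b c]"
return "error".  In neither convention is a typo in the pragma body
noticed, since the body is not examined here.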

The same is true of argument (b): it's true as far as it goes, but there
are plenty of cases where it will make no difference, and even where it
does make a difference, the difference is relatively minor and subtle.
I think it's relevant and points clearly in one direction, but it is
also far from conclusive.

(4) Unless someone brings forward a new argument that a particular
    choice of delimiters will have some substantial effect on complexity
    or efficiency or some other important property, we are left with
    what I think can fairly be called an aesthetic judgement, to be made
    on the basis of technical taste.

Aesthetic judgements are important and taste is real, but both are
notoriously resistant to decision by argument.

It may be that discussing what pragmas are for will lead us closer to
consensus, though I suppose there is a serious chance that what we will
learn is that we have the same understandings of pragmas but quite
different understandings of what it means to say that pragmas are
essentially comments, or that pragmas are essentially different from
comments.

If I understand correctly, pragmas share with comments the property that
the standard interpretation of a grammar with pragmas is the same as the
standard interpretation of that grammar without pragmas.  They differ
from comments in that they will normally be machine-processable in ways
that comments normally are not.  What is riding on the choice of
delimiters is the relative emphasis placed on these two facts:  make
them visually similar to stress the first, make them distinct to stress
the second.  Or give the grammar writer a choice of delimiters, just as
we do with the separators for left- and right-hand sides and the
separators for alternatives.
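
The shared property can be illustrated with a small Python sketch,
under loud assumptions: it uses the brace-and-asterisk form from the
strawman example above, assumes pragmas contain no braces and do not
nest, and ignores the possibility of "{*" appearing inside a string
literal.  The name strip_pragmas is hypothetical and not part of any
ixml processor.

```python
import re

# Assumed pragma shape: "{*" ... "}" with no braces inside the body.
PRAGMA = re.compile(r"\{\*[^{}]*\}")


def strip_pragmas(grammar: str) -> str:
    """Return the grammar text with every assumed-form pragma removed."""
    return PRAGMA.sub("", grammar)
```

A processor that ignores pragmas would then give 'a: {*x y} b. {note}'
the same standard interpretation as strip_pragmas applied to it; the
ordinary comment {note} is left untouched by this function.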

The conclusion that the choice of delimiters raises no important
technical issues, only an important but possibly ineffable aesthetic
issue, makes me feel mildly optimistic.



-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Wednesday, 26 January 2022 04:27:09 UTC