Re: two objections to the pragmas proposal from Tomos Hillman on 2022-01-19 (public-ixml@w3.org from January 2022)

From: Tomos Hillman <yamahito@gmail.com>
Date: Wed, 19 Jan 2022 11:10:40 +0000
To: public-ixml@w3.org, "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Message-ID: <0a4a7575-f2d4-40ed-b692-ddfa6957612e@Spark>
I think simplicity is a laudable goal for Invisible XML.  Simple, beautiful ideas have fewer moving parts, and often the most elegant solutions are the best.  However, if a beautiful idea is not actually a solution, then the point is moot.

Perhaps namespaces feel inelegant.  But the solution feels incomplete without them.

After Sydney J. Harris, Roger Sessions, and Albert Einstein:


In every field of inquiry, it is true that all things should be made as simple as possible – but no simpler. (And for every problem that is muddled by over-complexity, a dozen are muddled by over-simplifying.)*

*https://quoteinvestigator.com/2011/05/13/einstein-simple/

Thanks,
Tom
On 18 Jan 2022, 18:54 +0000, C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>, wrote:
> During the call earlier today, Steven Pemberton summarized his
> objections to the pragmas proposal put forward by Tom Hillman and me in
> three points, of which I unfortunately remember only two: it's too
> complicated and it goes beyond its remit. I'd like to address these two
> issues, if I can do so usefully.
>
> (1) On the issue of our remit, I think Tom and I have already answered
> the objection. We did not see a way to make pragmas work well without
> some mechanism for distributed naming, so we faced the choice between
> making a proposal for distributed naming part of the pragmas proposal or
> not making a pragmas proposal.
>
> Those who believe distributed naming is not necessary in order to
> satisfy the requirements Tom and I identified are welcome to make their
> case that it's not necessary, or that our requirements are too
> stringent. But so far I have not seen anyone making either of those
> cases.
>
> The proposal we made for distributed naming was to reuse the QName
> mechanism now familiar to pretty much anyone who uses XML seriously,
> including people who avoid namespaces wherever possible. In principle,
> as we said last week, any other mechanism would do as well. Having
> thought about it in the last week, I think SP's strawman proposal
> persuaded me that QNames are really the only plausible solution in an
> XML context, because they are familiar and well understood. Any other
> mechanism will elicit the question from users "why didn't you just use
> QNames?"
>
> I understand that not everyone in the group thought, when Tom and I took
> an action to develop a pragmas proposal, that it would also entail a
> proposal for namespaces or something like them. But our remit was to
> produce a workable proposal for pragmas; I think that any workable
> pragmas proposal requires a workable proposal for QNames. As I say,
> anyone in or outside the group is welcome to explain why it doesn't.
>
>
> (2) On the issue of complication, I would most of all like a bit more
> specificity. It's hard to answer so vague and sweeping an objection,
> and I am reduced to guessing which parts of the proposal people think
> are too complicated.
>
> Taking SP's strawman proposal as a baseline level of complexity (and
> using the names Tinman (TM) and Strawman (SM) for brevity), I think I see
> some areas in which SM is simpler than TM, some in which it's more
> complex, and a number of areas where the changes don't seem to make any
> significant difference.
>
> - Several ways in which SM differs from TM appear to me irrelevant to
> questions of complexity -- that is, they neither make things simpler
> nor make them more complicated. Among these I would list:
>
> . The change in delimiters.
> . The prohibition on empty comments.
> . Allowing empty processor specifications.
> . Forbidding blanks but not newline, tab, or other whitespace
> characters within the processor specification.
> . Requiring blank and not allowing newline, tab, or any other
> whitespace to separate the processor specification from the
> pragma body.
> . Defining the XML form of pragmas as having mixed content rather
> than element content.
>
> (These are all things I think of as weaknesses in SM, but none seems
> to be intended as a simplification.)
>
> - In SM, pragmas have a slightly more complicated internal structure
> than in TM, since processors are required to recognize comments and
> pragmas embedded in pragmas.
>
> (I think, by the way, that this is a design error and contradicts the
> principle that "The structure of body text of any pragma is defined by
> the processor it is addressed to." A better design allows the pragma
> delimiters to occur within a pragma, but without requiring that when
> they are encountered they define a syntactically legal pragma. And
> ditto for comments. But that is not directly relevant to the question
> of complexity.)
>
> - In SM, pragmas are allowed wherever whitespace and comments are
> allowed, which reduces complexity as measured by the number of changes
> to the grammar.
>
> I wonder if the other changes to the grammar in TM are what SP has in
> mind when he says it's "too complicated"; I suspect they are.
>
> On the other hand, as far as I can tell SM is more complex to use for
> the grammar writer, especially but not exclusively the grammar writer
> who cares about the XML form of the grammar.
>
> The reason for TM's design in this area is that in every use case
> anyone has reported for pragmas, the pragma can be understood as an
> annotation on a symbol in the right-hand side of a rule, on a rule, or
> on the grammar itself. There may be other use cases which have
> different requirements, but so far no one has mentioned any. So TM
> reflects an attempt to make the syntax of pragmas suitable in those
> three cases.
>
> The examples of XQuery and ixml itself illustrate that quite often an
> intuitive syntax for annotating any thing puts the annotation before
> the thing. In ixml, to annotate a nonterminal with a mark, we write
> the mark before the nonterminal; the syntax of annotations and pragmas
> in XQuery similarly puts the annotation or the pragma first. Any
> discussion of attribute grammars will tend to illustrate an opposite
> tendency: the attribute value assignment rules for any grammar
> production can be viewed as annotations on the rule, but invariably
> follow the rule rather than preceding it.
>
> TM allows annotations on a symbol to occur before it, before or after
> any mark. So for a rule of the form
>
> a : ¿my:red? @b, ¿my:orange? ^c, ¿my:yellow? -d.
>
> or equivalently
>
> a : @ ¿my:red? b, ^ ¿my:orange? c, - ¿my:yellow? d.
>
> the XML form places the pragmas named my:* as children of the
> nonterminal elements:
>
> <rule name="a">
> <alt>
> <nonterminal mark="@" name="b">
> <pragma pname="my:red"/>
> </nonterminal>
> <nonterminal mark="^" name="c">
> <pragma pname="my:orange"/>
> </nonterminal>
> <nonterminal mark="-" name="d">
> <pragma pname="my:yellow"/>
> </nonterminal>
> </alt>
> </rule>
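
The mapping in the quoted example can be sketched in a few lines of Python. This is only an illustration of the idea, not part of the TM proposal and not a real ixml parser: the function name rule_to_xml, the TERM regular expression, and the restriction to nonterminals with body-less pragmas are all assumptions made for the sketch. It shows that a pragma written before the mark and a pragma written after the mark both end up as a child of the same nonterminal element.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified term syntax (an assumption, not the ixml
# grammar): an optional pragma and an optional mark in either order, followed
# by a nonterminal name. Pragma bodies are not handled.
TERM = re.compile(
    r"^(?:¿(?P<p1>[^?\s]+)\?\s*)?"   # pragma written before the mark
    r"(?P<mark>[@^-])?\s*"           # optional mark
    r"(?:¿(?P<p2>[^?\s]+)\?\s*)?"    # pragma written after the mark
    r"(?P<name>\w+)$"                # nonterminal name
)

def rule_to_xml(rule: str) -> ET.Element:
    """Build a <rule> element for a one-alternative rule of nonterminals."""
    name, rhs = rule.rstrip(". \n").split(":", 1)
    rule_el = ET.Element("rule", name=name.strip())
    alt = ET.SubElement(rule_el, "alt")
    for term in rhs.split(","):
        m = TERM.match(term.strip())
        if m is None:
            raise ValueError(f"unsupported term: {term!r}")
        attrs = {}
        if m["mark"]:
            attrs["mark"] = m["mark"]
        attrs["name"] = m["name"]
        nonterminal = ET.SubElement(alt, "nonterminal", attrs)
        # Wherever the pragma was written, it becomes a child of the
        # nonterminal it annotates.
        for pname in (m["p1"], m["p2"]):
            if pname:
                ET.SubElement(nonterminal, "pragma", pname=pname)
    return rule_el

before_mark = rule_to_xml("a : ¿my:red? @b, ¿my:orange? ^c, ¿my:yellow? -d.")
after_mark = rule_to_xml("a : @ ¿my:red? b, ^ ¿my:orange? c, - ¿my:yellow? d.")
print(ET.tostring(before_mark, encoding="unicode"))
```

In this sketch both spellings of the rule serialize to the same XML, which is the property the example above relies on.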
>
> Other parts of the TM proposal allow pragmas in locations where the
> XML form of the grammar will place the pragma as a child of the
> element representing the thing it annotates (rule or grammar).
>
> If there is a use case that requires that pragmas be able to occur
> as children of other elements, we need to capture it. Otherwise,
> any proposal that allows pragmas in other locations risks the charge
> of ... going outside its remit to allow things that are not part of
> the requirements and go well beyond any known use cases.
>
> In SM, by contrast to TM, pragmas can be located pretty much
> anywhere, which means the grammar writer will need a much better
> grasp of where 's' is used in the ixml grammar for ixml than I
> suspect most people even in the CG will have. Given a rule like
>
> a: @b, ^c, -d.
>
> it is not hard to see (or at least imagine) that comments and
> whitespace can occur in the locations where comments occur below:
>
> {1}a{2}: {3}@{4}b{5}, {6}^{7}c{8}, {9}-{10}d{11}.{12}
>
> I suspect that I am not the only member of the CG who would have to
> consult the grammar for ixml to know which element in the XML form
> of this rule will be the parent of each comment.
>
> If I want a comment or SM pragma placed as a child of the
> nonterminal c, what are my options? 6, 7, and 8, right? Wrong.
> If I want a comment or SM pragma to appear as a child of the 'rule'
> element, what are my options?
>
> From where I sit, the ixml grammar currently does a remarkably good
> job of keeping rules visually simple by keeping the 's' nonterminal
> out of the way; it does this in part by pushing the 's' as far down
> in the parse trees as possible. But as we have seen with the rules
> for class, for @from, and for @to, that sometimes ends up allowing
> comments in places where we don't want them. As we saw some months
> ago with the rule for ixml, it also sometimes ends up not allowing
> comment elements in the XML form of the grammar in places where we
> do want them.
>
> If we allow 's' to determine not just where whitespace and comments
> can go but also where pragmas can go, I think the treatment of 's'
> needs re-thinking from the ground up: we will be obligating
> ourselves either to a long and very tedious process of examining
> every occurrence of 's' in the grammar and thinking about where it
> should attach in the parse tree, or to waving our hands and saying a
> bit crossly "it doesn't matter!". But it does matter.
>
> I hope that explains why I am not yet persuaded that the TM proposal is
> too complicated.
>
> --
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> http://blackmesatech.com
>
Received on Wednesday, 19 January 2022 11:11:01 UTC