- From: Tomos Hillman <yamahito@gmail.com>
- Date: Wed, 19 Jan 2022 11:10:40 +0000
- To: public-ixml@w3.org, "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Message-ID: <0a4a7575-f2d4-40ed-b692-ddfa6957612e@Spark>
I think there is a laudable goal for simplicity in Invisible XML. Simple, beautiful ideas have fewer moving parts, and often the most elegant solutions are the best.

However, if a beautiful idea is not actually a solution, then the point is moot.

Perhaps namespaces feel inelegant. But the solution feels incomplete without them.

After Sydney J. Harris, Roger Sessions, and Albert Einstein:

In every field of inquiry, it is true that all things should be made as simple as possible – but no simpler. (And for every problem that is muddled by over-complexity, a dozen are muddled by over-simplifying.)*

*https://quoteinvestigator.com/2011/05/13/einstein-simple/

Thanks,

Tom

On 18 Jan 2022, 18:54 +0000, C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>, wrote:
> During the call earlier today, Steven Pemberton summarized his
> objections to the pragmas proposal put forward by Tom Hillman and me in
> three points, of which I unfortunately remember only two: it's too
> complicated and it goes beyond its remit. I'd like to address these two
> issues, if I can do so usefully.
>
> (1) On the issue of our remit, I think Tom and I have already answered
> the objection. We did not see a way to make pragmas work well without
> some mechanism for distributed naming, so we faced the choice between
> making a proposal for distributed naming part of the pragmas proposal or
> not making a pragmas proposal.
>
> Those who believe distributed naming is not necessary in order to
> satisfy the requirements Tom and I identified are welcome to make their
> case that it's not necessary, or that our requirements are too
> stringent. But so far I have not seen anyone making either of those
> cases.
>
> The proposal we made for distributed naming was to reuse the QName
> mechanism now familiar to pretty much anyone who uses XML seriously,
> including people who avoid namespaces wherever possible. In principle,
> as we said last week, any other mechanism would do as well.
> Having thought about it in the last week, I think SP's strawman proposal
> persuaded me that QNames are really the only plausible solution in an
> XML context, because they are familiar and well understood. Any other
> mechanism will elicit the question from users "why didn't you just use
> QNames?"
>
> I understand that not everyone in the group thought, when Tom and I took
> an action to develop a pragmas proposal, that it would also entail a
> proposal for namespaces or something like them. But our remit was to
> produce a workable proposal for pragmas; I think that any workable
> pragmas proposal requires a workable proposal for QNames. As I say,
> anyone in or outside the group is welcome to explain why it doesn't.
>
>
> (2) On the issue of complication, I would most of all like a bit more
> specificity. It's hard to answer so vague and sweeping an objection,
> and I am reduced to guessing which parts of the proposal people think
> are too complicated.
>
> Judging SP's strawman proposal as a baseline level of complexity (and
> using the names Tinman (TM) and Strawman (SM) for brevity), I think I see
> some areas in which SM is simpler than TM, some in which it's more
> complex, and a number of areas where the changes don't seem to make any
> significant difference.
>
> - Several ways in which SM differs from TM appear to me irrelevant to
>   questions of complexity -- that is, they neither make things simpler
>   nor make them more complicated. Among these I would list:
>
>   . The change in delimiters.
>   . The prohibition on empty comments.
>   . Allowing empty processor specifications.
>   . Forbidding blanks but not newline, tab, or other whitespace
>     characters within the processor specification.
>   . Requiring blank and not allowing newline, tab, or any other
>     whitespace to separate the processor specification from the
>     pragma body.
>   . Defining the XML form of pragmas as having mixed content rather
>     than element content.
>
>   (These are all things I think of as weaknesses in SM, but none seems
>   to be intended as a simplification.)
>
> - In SM, pragmas have a slightly more complicated internal structure
>   than in TM, since processors are required to recognize comments and
>   pragmas embedded in pragmas.
>
>   (I think, by the way, that this is a design error and contradicts the
>   principle that "The structure of body text of any pragma is defined by
>   the processor it is addressed to." A better design allows the pragma
>   delimiters to occur within a pragma, but without requiring that when
>   they are encountered they define a syntactically legal pragma. And
>   ditto for comments. But that is not directly relevant to the question
>   of complexity.)
>
> - In SM, pragmas are allowed wherever whitespace and comments are
>   allowed, which reduces complexity as measured by the number of changes
>   to the grammar.
>
>   I wonder if the other changes to the grammar in TM are what SP has in
>   mind when he says it's "too complicated"; I suspect it is.
>
>   On the other hand, as far as I can tell SM is more complex to use for
>   the grammar writer, especially but not exclusively the grammar writer
>   who cares about the XML form of the grammar.
>
>   The reason for TM's design in this area is that in every use case
>   anyone has reported for pragmas, the pragma can be understood as an
>   annotation on a symbol in the right-hand side of a rule, on a rule, or
>   on the grammar itself. There may be other use cases which have
>   different requirements, but so far no one has mentioned any. So TM
>   reflects an attempt to make the syntax of pragmas suitable in those
>   three cases.
>
>   The examples of XQuery and ixml itself illustrate that quite often an
>   intuitive syntax for annotating any thing puts the annotation before
>   the thing.
>   In ixml, to annotate a nonterminal with a mark, we write
>   the mark before the nonterminal; the syntax of annotations and pragmas
>   in XQuery similarly puts the annotation or the pragma first. Any
>   discussion of attribute grammars will tend to illustrate an opposite
>   tendency: the attribute value assignment rules for any grammar
>   production can be viewed as annotations on the rule, but invariably
>   follow the rule rather than preceding it.
>
>   TM allows annotations on a symbol to occur before it, before or after
>   any mark. So for a rule of the form
>
>     a : ¿my:red? @b, ¿my:orange? ^c, ¿my:yellow? -d.
>
>   or equivalently
>
>     a : @ ¿my:red? b, ^ ¿my:orange? c, - ¿my:yellow? d.
>
>   the XML form places the pragmas named my:* as children of the
>   nonterminal elements:
>
>     <rule name="a">
>       <alt>
>         <nonterminal mark="@" name="b">
>           <pragma pname="my:red"/>
>         </nonterminal>
>         <nonterminal mark="^" name="c">
>           <pragma pname="my:orange"/>
>         </nonterminal>
>         <nonterminal mark="-" name="d">
>           <pragma pname="my:yellow"/>
>         </nonterminal>
>       </alt>
>     </rule>
>
>   Other parts of the TM proposal allow pragmas in locations where the
>   XML form of the grammar will place the pragma as a child of the
>   element representing the thing it annotates (rule or grammar).
>
>   If there is a use case that requires that pragmas be able to occur
>   as children of other elements, we need to capture it. Otherwise,
>   any proposal that allows pragmas in other locations risks the charge
>   of ... going outside its remit to allow things that are not part of
>   the requirements and go well beyond any known use cases.
>
>   In SM, by contrast to TM, pragmas can be located pretty much
>   anywhere, which means the grammar writer will need a much better
>   grasp of where 's' is used in the ixml grammar for ixml than I
>   suspect most people even in the CG will have. Given a rule like
>
>     a: @b, ^c, -d.
>
>   it is not hard to see (or at least imagine) that comments and
>   whitespace can occur in the locations where comments occur below:
>
>     {1}a{2}: {3}@{4}b{5}, {6}^{7}c{8}, {9}-{10}d{11}.{12}
>
>   I suspect that I am not the only member of the CG who would have to
>   consult the grammar for ixml to know which element in the XML form
>   of this rule will be the parent of each comment.
>
>   If I want a comment or SM pragma placed as a child of the
>   nonterminal c, which are my options? 6, 7, and 8, right? Wrong.
>   If I want a comment or SM pragma to appear as a child of the 'rule'
>   element, what are my options?
>
>   From where I sit, the ixml grammar currently does a remarkably good
>   job of keeping rules visually simple by keeping the 's' nonterminal
>   out of the way; it does this in part by pushing the 's' as far down
>   in the parse trees as possible. But as we have seen with the rules
>   for class, for @from, and for @to, that sometimes ends up allowing
>   comments in places where we don't want them. As we saw some months
>   ago with the rule for ixml, it also sometimes ends up not allowing
>   comment elements in the XML form of the grammar in places where we
>   do want them.
>
>   If we allow 's' to determine not just where whitespace and comments
>   can go but also where pragmas can go, I think the treatment of 's'
>   needs re-thinking from the ground up: we will be obligating
>   ourselves either to a long and very tedious process of examining
>   every occurrence of 's' in the grammar and thinking about where it
>   should attach in the parse tree, or to waving our hands and saying a
>   bit crossly "it doesn't matter!". But it does matter.
>
> I hope that explains why I am not yet persuaded that the TM proposal is
> too complicated.
>
> --
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> http://blackmesatech.com
>
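
As an illustration of the point made in the quoted message about the Tinman (TM) XML form, here is a minimal Python sketch of what a consumer of that form might do: because each pragma is a child of the nonterminal it annotates, finding the annotations on a symbol is a direct child lookup. The element and attribute names (rule, alt, nonterminal, pragma, pname) are copied from the example in the quoted message; the function name pragmas_by_nonterminal and the use of Python's standard xml.etree.ElementTree are assumptions made for illustration only and are not part of either proposal.

```python
import xml.etree.ElementTree as ET

# The XML form of rule "a" as shown in the quoted message (TM proposal).
TM_RULE_XML = """
<rule name="a">
  <alt>
    <nonterminal mark="@" name="b">
      <pragma pname="my:red"/>
    </nonterminal>
    <nonterminal mark="^" name="c">
      <pragma pname="my:orange"/>
    </nonterminal>
    <nonterminal mark="-" name="d">
      <pragma pname="my:yellow"/>
    </nonterminal>
  </alt>
</rule>
"""


def pragmas_by_nonterminal(rule_xml):
    """Map each nonterminal name to the pnames of its child pragmas.

    Hypothetical helper: it relies only on the structure visible in the
    quoted example, where a pragma is a direct child of the nonterminal
    it annotates.
    """
    rule = ET.fromstring(rule_xml)
    result = {}
    for nonterminal in rule.iter("nonterminal"):
        # findall("pragma") matches direct children only, which is the
        # attachment point the TM example shows.
        result[nonterminal.get("name")] = [
            pragma.get("pname") for pragma in nonterminal.findall("pragma")
        ]
    return result


if __name__ == "__main__":
    for name, pnames in pragmas_by_nonterminal(TM_RULE_XML).items():
        print(name, pnames)
```

Under a design in which a pragma may appear wherever whitespace is allowed, a lookup like this would first need to know which element ends up as the parent of each pragma, which is the difficulty the numbered positions {1} to {12} in the quoted message are meant to show.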
Received on Wednesday, 19 January 2022 11:11:01 UTC