- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Tue, 18 Jan 2022 11:54:12 -0700
- To: public-ixml@w3.org
During the call earlier today, Steven Pemberton summarized his objections to the pragmas proposal put forward by Tom Hillman and me in three points, of which I unfortunately remember only two: it's too complicated and it goes beyond its remit. I'd like to address these two issues, if I can do so usefully.

(1) On the issue of our remit, I think Tom and I have already answered the objection. We did not see a way to make pragmas work well without some mechanism for distributed naming, so we faced the choice between making a proposal for distributed naming part of the pragmas proposal or not making a pragmas proposal.

Those who believe distributed naming is not necessary in order to satisfy the requirements Tom and I identified are welcome to make their case that it's not necessary, or that our requirements are too stringent. But so far I have not seen anyone making either of those cases.

The proposal we made for distributed naming was to reuse the QName mechanism now familiar to pretty much anyone who uses XML seriously, including people who avoid namespaces wherever possible. In principle, as we said last week, any other mechanism would do as well. Having thought about it in the last week, I think SP's strawman proposal persuaded me that QNames are really the only plausible solution in an XML context, because they are familiar and well understood. Any other mechanism will elicit the question from users "why didn't you just use QNames?"

I understand that not everyone in the group thought, when Tom and I took an action to develop a pragmas proposal, that it would also entail a proposal for namespaces or something like them. But our remit was to produce a workable proposal for pragmas; I think that any workable pragmas proposal requires a workable proposal for QNames. As I say, anyone in or outside the group is welcome to explain why it doesn't.

(2) On the issue of complication, I would most of all like a bit more specificity.
It's hard to answer so vague and sweeping an objection, and I am reduced to guessing which parts of the proposal people think are too complicated.

Taking SP's strawman proposal as a baseline level of complexity (and using the names Tinman (TM) and Strawman (SM) for brevity), I think I see some areas in which SM is simpler than TM, some in which it's more complex, and a number of areas where the changes don't seem to make any significant difference.

- Several ways in which SM differs from TM appear to me irrelevant to questions of complexity -- that is, they neither make things simpler nor make them more complicated. Among these I would list:

  . The change in delimiters.
  . The prohibition on empty comments.
  . Allowing empty processor specifications.
  . Forbidding blanks but not newline, tab, or other whitespace characters within the processor specification.
  . Requiring blank and not allowing newline, tab, or any other whitespace to separate the processor specification from the pragma body.
  . Defining the XML form of pragmas as having mixed content rather than element content.

  (These are all things I think of as weaknesses in SM, but none seems to be intended as a simplification.)

- In SM, pragmas have a slightly more complicated internal structure than in TM, since processors are required to recognize comments and pragmas embedded in pragmas. (I think, by the way, that this is a design error and contradicts the principle that "The structure of body text of any pragma is defined by the processor it is addressed to." A better design allows the pragma delimiters to occur within a pragma, but without requiring that when they are encountered they define a syntactically legal pragma. And ditto for comments. But that is not directly relevant to the question of complexity.)

- In SM, pragmas are allowed wherever whitespace and comments are allowed, which reduces complexity as measured by the number of changes to the grammar.
I wonder if the other changes to the grammar in TM are what SP has in mind when he says it's "too complicated"; I suspect they are.

On the other hand, as far as I can tell SM is more complex to use for the grammar writer, especially but not exclusively the grammar writer who cares about the XML form of the grammar.

The reason for TM's design in this area is that in every use case anyone has reported for pragmas, the pragma can be understood as an annotation on a symbol in the right-hand side of a rule, on a rule, or on the grammar itself. There may be other use cases which have different requirements, but so far no one has mentioned any. So TM reflects an attempt to make the syntax of pragmas suitable in those three cases.

The examples of XQuery and ixml itself illustrate that quite often an intuitive syntax for annotating any thing puts the annotation before the thing. In ixml, to annotate a nonterminal with a mark, we write the mark before the nonterminal; the syntax of annotations and pragmas in XQuery similarly puts the annotation or the pragma first. Any discussion of attribute grammars will tend to illustrate an opposite tendency: the attribute value assignment rules for any grammar production can be viewed as annotations on the rule, but invariably follow the rule rather than preceding it.

TM allows annotations on a symbol to occur before it, before or after any mark. So for a rule of the form

    a : ¿my:red? @b, ¿my:orange? ^c, ¿my:yellow? -d.

or equivalently

    a : @ ¿my:red? b, ^ ¿my:orange? c, - ¿my:yellow? d.

the XML form places the pragmas named my:* as children of the nonterminal elements:

    <rule name="a">
      <alt>
        <nonterminal mark="@" name="b">
          <pragma pname="my:red"/>
        </nonterminal>
        <nonterminal mark="^" name="c">
          <pragma pname="my:orange"/>
        </nonterminal>
        <nonterminal mark="-" name="d">
          <pragma pname="my:yellow"/>
        </nonterminal>
      </alt>
    </rule>

Other parts of the TM proposal allow pragmas in locations where the XML form of the grammar will place the pragma as a child of the element representing the thing it annotates (rule or grammar).

If there is a use case that requires that pragmas be able to occur as children of other elements, we need to capture it. Otherwise, any proposal that allows pragmas in other locations risks the charge of ... going outside its remit to allow things that are not part of the requirements and go well beyond any known use cases.

In SM, by contrast to TM, pragmas can be located pretty much anywhere, which means the grammar writer will need a much better grasp of where 's' is used in the ixml grammar for ixml than I suspect most people even in the CG will have. Given a rule like

    a: @b, ^c, -d.

it is not hard to see (or at least imagine) that comments and whitespace can occur in the locations where comments occur below:

    {1}a{2}: {3}@{4}b{5}, {6}^{7}c{8}, {9}-{10}d{11}.{12}

I suspect that I am not the only member of the CG who would have to consult the grammar for ixml to know which element in the XML form of this rule will be the parent of each comment. If I want a comment or SM pragma placed as a child of the nonterminal c, which are my options? 6, 7, and 8, right? Wrong. If I want a comment or SM pragma to appear as a child of the 'rule' element, what are my options?

From where I sit, the ixml grammar currently does a remarkably good job of keeping rules visually simple by keeping the 's' nonterminal out of the way; it does this in part by pushing the 's' as far down in the parse trees as possible.
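As a rough illustration of how the twelve numbered positions in the rule above arise, here is a small Python sketch. It is only a sketch under stated assumptions: the tokenizer is a stand-in made up for this one rule, not the ixml grammar, and the names TOKEN, comment_slots, and show_slots are hypothetical. It counts the gaps around the tokens and nothing more; in particular it cannot say which element in the XML form would be the parent of a comment in a given slot, since that depends on where 's' occurs in the grammar for ixml.

```python
import re

# Toy tokenizer (an assumption for this sketch, NOT the ixml grammar):
# a token is either a simple name or one of the punctuation marks
# that appear in the example rule.
TOKEN = re.compile(r"[A-Za-z_]\w*|[@^:,.-]")


def comment_slots(rule: str) -> list[tuple[int, str, str]]:
    """Return (slot number, token before, token after) for every gap.

    A rule with n tokens has n + 1 gaps: one before the first token,
    one between each adjacent pair, and one after the last token.
    """
    tokens = TOKEN.findall(rule)
    padded = [""] + tokens + [""]
    return [(i + 1, padded[i], padded[i + 1]) for i in range(len(padded) - 1)]


def show_slots(rule: str) -> str:
    """Rewrite the rule with a numbered {n} marker in every gap."""
    tokens = TOKEN.findall(rule)
    parts = [f"{{{n}}}{tok}" for n, tok in enumerate(tokens, start=1)]
    parts.append(f"{{{len(tokens) + 1}}}")
    return "".join(parts)


if __name__ == "__main__":
    rule = "a: @b, ^c, -d."
    print(show_slots(rule))
    for number, before, after in comment_slots(rule):
        print(number, repr(before), repr(after))
```

For the rule a: @b, ^c, -d. the toy tokenizer finds eleven tokens and therefore twelve slots; slot 7, for instance, lies between the mark ^ and the name c. Knowing that much is the easy part; knowing where a comment in slot 7 ends up in the XML is the part that requires consulting the grammar.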
But as we have seen with the rules for class, for @from, and for @to, that sometimes ends up allowing comments in places where we don't want them. As we saw some months ago with the rule for ixml, it also sometimes ends up not allowing comment elements in the XML form of the grammar in places where we do want them.

If we allow 's' to determine not just where whitespace and comments can go but also where pragmas can go, I think the treatment of 's' needs re-thinking from the ground up: we will be obligating ourselves either to a long and very tedious process of examining every occurrence of 's' in the grammar and thinking about where it should attach in the parse tree, or to waving our hands and saying a bit crossly "it doesn't matter!". But it does matter.

I hope that explains why I am not yet persuaded that the TM proposal is too complicated.

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Tuesday, 18 January 2022 18:54:33 UTC