- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Wed, 02 Feb 2022 12:18:20 -0700
- To: public-ixml@w3.org
During the group's discussions of pragmas, I confess that I have had some trouble understanding the arguments people are trying to bring to bear on the issues. Sometimes, I have been told, the best way to understand the arguments on the other side of an issue better is to appoint oneself an advocate for the other side and formulate those arguments oneself. So I am going to try that approach.

Of course, any provision for pragmas involves a number of different questions, on each of which different arguments may bear. What follows is an attempt to identify a number of distinct questions relating to the inclusion of pragmas in our spec and for each question to identify and summarize in a neutral way the arguments on each side of the question.

Since an important goal of this exercise is to understand the arguments on each side of each question, I hope that members of the group will identify places where the lists of arguments are incomplete or misleading or badly phrased.

I should define some terminology:

- By 'out-of-band information' I mean information that does not affect the standard interpretation of the relevant ixml grammar.

  In the nature of things, many plausible examples will involve out-of-band information that cannot be expressed using our grammatical formalism, which is at least approximately the same as information for which the ixml spec provides no standardized representation. Examples already brought forward include requests not to mark sentences as ambiguous or to optimize the parsing process in some way that does not affect the parsing results but may affect resource requirements.

  But also in the nature of things, any mechanism we provide for out-of-band information can be used by a user to include information that could have been expressed directly in the ixml grammar but was not. And going even further, such a mechanism can easily be used to include information that *is* expressed in the grammar; that would be redundant, but very hard to prohibit.
  So any characterization, later on, of information as 'extra-grammatical' or 'non-standardized' should be taken with a grain of salt as a coarse approximation.

  Also in the nature of things, the goals implementors and users might seek to achieve by the use of out-of-band information are difficult to bound. Obvious possibilities include varying this or that aspect of a processor's operation, overriding defaults, or specifying behavior that is not the usual behavior for the processor.

- By 'inline' information I mean information conveyed within the sequence of characters submitted to an ixml processor as the ixml input grammar. Other information available to a processor includes information conveyed in the input string and information conveyed by invocation-time parameters or options.

- By information 'directed to a processor' I mean information that a processor can successfully parse and process. The term is not intended to suggest and should not be taken to mean that the information cannot also be read, understood, and acted on by a human being.

- The 'SM' proposal is the 'straw man' proposal on pragmas put forward by Steven Pemberton in response to the TM proposal.

- The 'TM' proposal is that put forward by Tomos Hillman and me.

- A 'pragma' is inline out-of-band information directed to a processor.

Question 1: should ixml provide for, or allow for, out-of-band information?
Note: I do not believe there is currently disagreement on this point.

Arguments con: If all out-of-band information were forbidden it would be easier to guarantee interoperability.

Arguments pro: It would be dauntingly difficult to forbid it. Unless we prohibit the definition of invocation options for a processor, the processor will have access to out-of-band information. And even then, it would be hard to guarantee that processors have no access to out-of-band information.

Question 2: should ixml provide for, or allow for, inline out-of-band information?
Note: I do not believe there is currently disagreement on this point.

Arguments con: If all out-of-band information were forbidden it would be easier to guarantee interoperability.

Arguments pro: It would be dauntingly difficult to forbid it. Unless we rewrite the grammar of ixml to prohibit comments, the processor will have access to inline out-of-band information. And if we did rewrite the grammar of ixml to prohibit comments, steganographic techniques could be used to embed inline out-of-band information in an ixml input grammar (using the number and identity of whitespace characters to encode information, for example).

Question 3: should ixml provide for, or allow for, inline out-of-band information directed to processors?
Note: I do not believe there is currently disagreement on this point.

Arguments con: If all out-of-band information were forbidden it would be easier to guarantee interoperability.

Arguments pro: It would be dauntingly difficult to forbid it. Unless we rewrite the grammar of ixml to prohibit comments, users can use comments to embed out-of-band information directed to a processor in an ixml input grammar. If we do forbid comments, users can use other means to do so.

Question 4: should ixml provide distinct constructs for inline out-of-band information directed to processors and other inline out-of-band information?
Note: It is not clear whether there is currently disagreement on this point. Both the TM proposal and the SM proposal provide distinct constructs, with nonterminals named 'pragma' and 'comment'. But it is not clear that those proposals represent the full range of current opinion in the group.

The natural interpretation of each proposal is that the 'pragma' construct is intended (as the name suggests) for inline out-of-band information directed to a processor, and the 'comment' construct is intended for other inline out-of-band information.
In the nature of things, there is nothing that can prevent a user from using either construct in the 'other' way. The implicit assumption appears to be that the interoperability advantages of writing pragmas using the 'pragma' nonterminal and the relative pointlessness of writing other inline out-of-band information that way will suffice to ensure that at a first approximation pragmas are written using the 'pragma' construct and other inline out-of-band information using the 'comment' construct. As far as I know, no relevant arguments on any side of any question rely on the correlation being perfect.

Arguments pro:

. Providing distinct constructs allows those who believe pragmas and comments are usefully distinguished to do so and thus makes ixml grammars clearer and easier to understand.

. Providing distinct constructs reduces the probability that a comment intended only as a human-readable observation (e.g. the comment {!} to mark a grammatical rule that the author thinks the reader might not have expected to see) is misinterpreted by a processor as a pragma (e.g. as a request to behave in a particular way); it thus improves the likelihood that a grammar that works successfully with one processor will also work successfully with others.

Arguments contra:

. Providing a defined construct for implementation-defined behavior, or even for behavior to be defined by some future version of a specification, serves as a signal to implementors that it is acceptable and perhaps even expected that they may or should provide non-standard behaviors that can be invoked that way. It thus reduces the likelihood that a grammar that works successfully with one processor will produce the same results with others.

.
Providing a defined construct for implementation-defined behavior allows (and may encourage) implementations to use that construct to specify new or extended behavior, even in cases where the behavior should (on the grounds of technical soundness and quality of design) be provided by different syntax and be built into the base specification.

  If (for example) a programming language provides no type system but does provide a pragma construct, it would be a design error for compilers to provide type checking by means of pragmas: the type system should be built into the language, not added by means of pragmas.

  If (for example) a styling language provides no styling property to specify a particular kind of text rendering, but does provide for implementation-defined styling properties, it would be a design error for renderers to provide control over that property by using an implementation-defined property: the property should be built into the language, not added by means of implementation-defined properties.

Question 5: If ixml provides distinct nonterminals for 'pragma' and 'comment' (for pragmas and other inline out-of-band information, respectively), how should the two nonterminals be related, grammatically?

(a) The set of strings generated by 'pragma' should be a subset of those generated by 'comment'. This makes the statement 'pragmas are comments' true by grammatical construction.

(b) The set of strings generated by 'pragma' and the set generated by 'comment' should be disjoint, but the delimiters should be chosen so as to be visually similar and convey an underlying affinity between the two constructs. This makes the statement 'pragmas are comments' not true as regards the grammatical constructs, but apposite as a metaphor.

(c) The set of strings generated by 'pragma' and the set generated by 'comment' should be disjoint, and the delimiters should be distinct, so as to convey that the two constructs are distinct.
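
To make options (a), (b), and (c) concrete, here is a minimal sketch in Python that models each one with simplified regular expressions. The delimiters are illustrative assumptions only ('{#' for (a), '{[' ... ']}' for (b), '<?' ... '?>' for (c)); they are not the syntax of the SM or TM proposal, the name `classify` is hypothetical, and the patterns do not handle nested braces.

```python
import re

# Illustrative patterns only; none of these delimiters is taken from the
# SM or TM proposal, and nested braces are not handled.
COMMENT = re.compile(r"\{[^{}]*\}")          # plain '{ ... }' comment

# (a) Subset: a pragma is a comment whose body starts with '#'.
PRAGMA_A = re.compile(r"\{#[^{}]*\}")

# (b) Disjoint but visually similar: comments may not start with '[',
#     and pragmas are written '{[ ... ]}'.
COMMENT_B = re.compile(r"\{(?!\[)[^{}]*\}")
PRAGMA_B = re.compile(r"\{\[[^{}\[\]]*\]\}")

# (c) Disjoint with unrelated delimiters: pragmas are written '<? ... ?>'.
PRAGMA_C = re.compile(r"<\?[^<>]*\?>")


def classify(text, comment_re, pragma_re):
    """Report which of the two constructs generates the whole string."""
    return {
        "comment": comment_re.fullmatch(text) is not None,
        "pragma": pragma_re.fullmatch(text) is not None,
    }
```

Under these assumptions, `classify("{#opt fast}", COMMENT, PRAGMA_A)` reports the string as both a comment and a pragma, while `classify("{[opt fast]}", COMMENT_B, PRAGMA_B)` and `classify("<?opt fast?>", COMMENT, PRAGMA_C)` report it as a pragma only. The negative lookahead in `COMMENT_B` is one way of making the "appropriate grammatical adjustments" that keep the two sets disjoint when the delimiters overlap.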
Note: the choice among (a), (b), and (c) appears to depend in part on whether 'comment' is taken as a name for 'inline out-of-band information' or for 'inline out-of-band information other than pragmas'.

Note: It is not clear whether there is currently disagreement on this point. The SM proposal follows path (b). The TM proposal uses distinct delimiters for comments and pragmas, but the authors of the proposal have indicated that they would be willing to accept the delimiter pairs '{[' ... ']}' or '{|' ... '|}' and '⦃' ... '⦄' for pragmas, while retaining '{' ... '}' for comments (assuming appropriate grammatical adjustments to prevent ambiguity).

Arguments for (a)

- If by nature pragmas are comments, then the grammar should reflect that fact.

Arguments for (c)

- If by nature pragmas and comments are distinct objects, then the grammar should reflect that fact.

Arguments for (b)

- If some members of the group feel strongly that by nature pragmas are comments, while others feel that pragmas and comments are distinct objects and neither is a subset of the other, then the grammar cannot fully satisfy both views. But if the similarity of the two constructs can be captured by a similarity of delimiters, and the distinctness of the two constructs can be captured by making them distinct grammatically, then holders of each view may find the spec workable.

A number of other questions arise, relating to the internal syntax and semantics of pragmas and relating to their place in the larger syntax of ixml, but this mail is already long enough that I expect some readers will be prepared to accuse the author of over-thinking things. So I will stop here.

If we can understand where members of the group are on the questions identified above, and the arguments that lead them to their positions, I think it might conduce to progress.
So I repeat my request that members of the group help other members of the group understand their positions by correcting the formulation of what they recognize as their arguments, or by providing formulations for arguments they believe are relevant but missing.

If I have distorted or omitted any argument any member of the group has brought forward, you may reliably take it as an indication that you did not make it clearly enough for me to understand and remember it; please make it again!

(Note, however, that I have attempted to phrase arguments in a neutral tone, so if the only acceptable formulation of your views begins with "it is obvious that ...", you may be disappointed by my paraphrase. But the point of the exercise is to formulate the arguments in a way that lets people understand them even if they disagree with them; phrases that rhetorically demand assent are counter-productive. Meta-arguments of the form "X outweighs Y" are also unhelpful; if X and Y are the relevant arguments, and X weighs for your position, then it is already evident which argument you find weightier.)

I hope this helps.

Michael

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Wednesday, 2 February 2022 19:18:44 UTC