Re: some questions about pragmas, and the arguments as I understand them from Bethan Tovey-Walsh on 2022-02-03 (public-ixml@w3.org from February 2022)

From: Bethan Tovey-Walsh <accounts@bethan.wales>
Date: Thu, 3 Feb 2022 09:24:29 +0000
To: Tom Hillman <tom@expertml.com>
Cc: public-ixml@w3.org, "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
Message-Id: <6284A171-0CA3-42EE-A684-0D143C4FDB39@bethan.wales>
Is it reasonable to posit a 5d. - that comments and pragmata are both subtypes of inline out-of-band information? This could lead to a representation in the grammar along these lines:

oob : "{", S*, (comment; pragma), S*, "}".
comment : whatever.
pragma : "[", S*, pragma-name, pragma-data?, S*, "]".

(Apologies for any syntactic errors - I’m writing on my phone, which is not ideal!)

I think I’d prefer this, conceptually, because it leaves aside the question of the relationship between pragmata and comments. But I could absolutely live with 5b. 
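
To make the shape of 5d concrete, here is a minimal Python sketch of a recogniser for the oob rule above. The lexical details are assumptions of mine and are not in any draft: pragma-name is taken to be a simple name, pragma-data to be whatever follows the name up to the closing bracket, and "whatever" for comments to be any text without braces (so nested comments are not handled). The names classify_oob, OOB and PRAGMA are purely illustrative.

```python
import re

# Hypothetical lexical details, not taken from any ixml draft:
# a pragma name is a run of name characters, pragma data is anything up to
# the closing "]", and a comment is any brace-free text that is not a pragma.
PRAGMA = re.compile(
    r"\[\s*(?P<name>[A-Za-z_][\w.:-]*)(?:\s+(?P<data>[^\]]*?))?\s*\]"
)
OOB = re.compile(r"\{\s*(?P<body>[^{}]*?)\s*\}")


def classify_oob(text):
    """Classify one out-of-band construct.

    Returns ('pragma', name, data) when the braces wrap a bracketed pragma,
    and ('comment', body) for any other brace-delimited text.
    """
    outer = OOB.fullmatch(text)
    if outer is None:
        raise ValueError("not an out-of-band construct: %r" % text)
    body = outer.group("body")
    inner = PRAGMA.fullmatch(body)
    if inner is not None:
        # Missing pragma data is reported as an empty string.
        return ("pragma", inner.group("name"), inner.group("data") or "")
    return ("comment", body)
```

Under these assumptions, classify_oob("{ [opt fast] }") gives ('pragma', 'opt', 'fast') while classify_oob("{!}") gives ('comment', '!'), so both kinds share the outer braces and a processor can still tell them apart by the inner brackets.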

> On 3 Feb 2022, at 09:09, Tom Hillman <tom@expertml.com> wrote:
> 
> 
> Thanks so much for this, Michael.  I found it a taxing read, but an extremely rewarding one.
> 
> In particular, I can now see how, if a comment is “inline out of band information”, and a pragma is “inline out of band information for processors”, then pragmas are, indeed, a subtype of comment, at least conceptually, and I can see how that is an argument for 5.a. or 5.b.
> 
> I think that there is a missing argument for 5.b. and 5.c. which goes something like:
> Although pragmas may by nature be a sub-type of comment, for pragmatic reasons they need to be unambiguously grammatically distinguishable from other types of comments, as by definition the processor needs to be able to recognise them to receive the information.
> 
> (or words to that effect).
> 
> Again, I think this is a really helpful exercise for me; I hope we can continue with it both here and for other areas of disagreement.  I’d encourage other members of the committee to take the time to give it a careful reading.
> 
> Tom
> 
> _________________
> Tomos Hillman
> eXpertML Ltd
> +44 7793 242058
>> On 2 Feb 2022, 19:18 +0000, C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>, wrote:
>> During the group's discussions of pragmas, I confess that I have had
>> some trouble understanding the arguments people are trying to bring to
>> bear on the issues. Sometimes, I have been told, the best way to
>> understand the arguments on the other side of an issue better is to
>> appoint oneself an advocate for the other side and formulate those
>> arguments oneself. So I am going to try that approach.
>> 
>> Of course, any provision for pragmas involves a number of different
>> questions, on each of which different arguments may bear.
>> 
>> What follows is an attempt to identify a number of distinct questions
>> relating to the inclusion of pragmas in our spec and for each question
>> to identify and summarize in a neutral way the arguments on each side
>> of the question. Since an important goal of this exercise is to
>> understand the arguments on each side of each question, I hope that
>> members of the group will identify places where the lists of arguments
>> are incomplete or misleading or badly phrased.
>> 
>> I should define some terminology:
>> 
>> - By 'out-of-band information' I mean information that does not
>> affect the standard interpretation of the relevant ixml grammar.
>> 
>> In the nature of things, many plausible examples will involve
>> out-of-band information that cannot be expressed using our
>> grammatical formalism, which is at least approximately the same as
>> information for which the ixml spec provides no standardized
>> representation.
>> 
>> Examples already brought forward include requests not to mark
>> sentences as ambiguous or to optimize the parsing process in some
>> way that does not affect the parsing results but may affect
>> resource requirements. But also in the nature of things, any
>> mechanism we provide for out-of-band information can be used by a
>> user to include information that could have been expressed
>> directly in the ixml grammar but was not. And going even further,
>> such a mechanism can easily be used to include information that
>> *is* expressed in the grammar; that would be redundant, but very
>> hard to prohibit.
>> 
>> So any characterization, later on, of information as
>> 'extra-grammatical' or 'non-standardized' should be taken with a
>> grain of salt as a coarse approximation.
>> 
>> Also in the nature of things, the goals implementors and users
>> might seek to achieve by the use of out-of-band information are
>> difficult to bound. Obvious possibilities include varying this or
>> that aspect of a processor's operation, overriding defaults, or
>> specifying behavior that is not the usual behavior for the
>> processor.
>> 
>> - By 'inline' information I mean information conveyed within the
>> sequence of characters submitted to an ixml processor as the ixml
>> input grammar.
>> 
>> Other information available to a processor includes information
>> conveyed in the input string and information conveyed by
>> invocation-time parameters or options.
>> 
>> - By information 'directed to a processor' I mean information that a
>> processor can successfully parse and process. The term is not
>> intended to suggest and should not be taken to mean that the
>> information cannot also be read, understood, and acted on by a
>> human being.
>> 
>> - The 'SM' proposal is the 'straw man' proposal on pragmas put
>> forward by Steven Pemberton in response to the TM proposal.
>> 
>> - The 'TM' proposal is that put forward by Tomos Hillman and me.
>> 
>> - A 'pragma' is inline out-of-band information directed to a
>> processor.
>> 
>> Question 1: should ixml provide for, or allow for, out-of-band
>> information?
>> 
>> Note: I do not believe there is currently disagreement on this
>> point.
>> 
>> Arguments con: If all out-of-band information were forbidden it
>> would be easier to guarantee interoperability.
>> 
>> Arguments pro: It would be dauntingly difficult to forbid it.
>> 
>> Unless we prohibit the definition of invocation options for a
>> processor, the processor will have access to out-of-band
>> information. And even then, it would be hard to guarantee that
>> processors have no access to out-of-band information.
>> 
>> Question 2: should ixml provide for, or allow for, inline out-of-band
>> information?
>> 
>> Note: I do not believe there is currently disagreement on this
>> point.
>> 
>> Arguments con: If all out-of-band information were forbidden it
>> would be easier to guarantee interoperability.
>> 
>> Arguments pro: It would be dauntingly difficult to forbid it.
>> 
>> Unless we rewrite the grammar of ixml to prohibit comments, the
>> processor will have access to inline out-of-band information. And
>> if we did rewrite the grammar of ixml to prohibit comments,
>> steganographic techniques could be used to embed inline
>> out-of-band information in an ixml input grammar (using the number
>> and identity of whitespace characters to encode information, for
>> example).
>> 
>> Question 3: should ixml provide for, or allow for, inline out-of-band
>> information directed to processors?
>> 
>> Note: I do not believe there is currently disagreement on this
>> point.
>> 
>> Arguments con: If all out-of-band information were forbidden it
>> would be easier to guarantee interoperability.
>> 
>> Arguments pro: It would be dauntingly difficult to forbid it.
>> 
>> Unless we rewrite the grammar of ixml to prohibit comments, users
>> can use comments to embed out-of-band information directed to a
>> processor in an ixml input grammar. If we do forbid comments, users
>> can use other means to do so.
>> 
>> Question 4: should ixml provide distinct constructs for inline
>> out-of-band information directed to processors and other inline
>> out-of-band information?
>> 
>> Note: It is not clear whether there is currently disagreement on
>> this point.
>> 
>> Both the TM proposal and the SM proposal provide distinct
>> constructs, with nonterminals named 'pragma' and 'comment'. But it
>> is not clear that those proposals represent the full range of current
>> opinion in the group.
>> 
>> The natural interpretation of each proposal is that the 'pragma'
>> construct is intended (as the name suggests) for inline
>> out-of-band information directed to a processor, and the 'comment'
>> construct is intended for other inline out-of-band information.
>> 
>> In the nature of things, there is nothing that can prevent a user
>> from using either construct in the 'other' way. The implicit
>> assumption appears to be that the interoperability advantages of
>> writing pragmas using the 'pragma' nonterminal and the relative
>> pointlessness of writing other inline out-of-band information that
>> way will suffice to ensure that at a first approximation pragmas are
>> written using the 'pragma' construct and other inline out-of-band
>> information using the 'comment' construct. As far as I know, no
>> relevant arguments on any side of any question rely on the
>> correlation being perfect.
>> 
>> Arguments pro:
>> 
>> . Providing distinct constructs allows those who believe pragmas
>> and comments are usefully distinguished to do so and thus makes
>> ixml grammars clearer and easier to understand.
>> 
>> . Providing distinct constructs reduces the probability that a
>> comment intended only as a human-readable observation (e.g. the
>> comment {!} to mark a grammatical rule that the author thinks the
>> reader might not have expected to see) is misinterpreted by a
>> processor as a pragma (e.g. as a request to behave in a particular
>> way); it thus improves the likelihood that a grammar that works
>> successfully with one processor will also work successfully with
>> others.
>> 
>> Arguments contra:
>> 
>> . Providing a defined construct for implementation-defined
>> behavior, or even for behavior to be defined by some future
>> version of a specification, serves as a signal to implementors
>> that it is acceptable and perhaps even expected that they may or
>> should provide non-standard behaviors that can be invoked that
>> way. It thus reduces the likelihood that a grammar that works
>> successfully with one processor will produce the same results with
>> others.
>> 
>> . Providing a defined construct for implementation-defined
>> behavior allows (and may encourage) implementations to use that
>> construct to specify new or extended behavior, even in cases where
>> the behavior should (on the grounds of technical soundness and
>> quality of design) be provided by different syntax and be built
>> into the base specification.
>> 
>> If (for example) a programming language provides no type system
>> but does provide a pragma construct, it would be a design error
>> for compilers to provide type checking by means of pragmas: the
>> type system should be built into the language, not added by means
>> of pragmas.
>> 
>> If (for example) a styling language provides no styling property to specify
>> a particular kind of text rendering, but does provide for
>> implementation-defined styling properties, it would be a design
>> error for renderers to provide control over that property by using
>> an implementation-defined property: the property should be built
>> into the language, not added by means of implementation-defined
>> properties.
>> 
>> Question 5: If ixml provides distinct nonterminals for 'pragma' and
>> 'comment' (for pragmas and other inline out-of-band information,
>> respectively), how should the two nonterminals be related,
>> grammatically?
>> 
>> (a) The set of strings generated by 'pragma' should be a subset of
>> those generated by 'comment'.
>> 
>> This makes the statement 'pragmas are comments' true by
>> grammatical construction.
>> 
>> (b) The set of strings generated by 'pragma' and the set generated
>> by 'comment' should be disjoint, but the delimiters should be
>> chosen so as to be visually similar and convey an underlying
>> affinity between the two constructs.
>> 
>> This makes the statement 'pragmas are comments' not true as
>> regards the grammatical constructs, but apposite as a metaphor.
>> 
>> (c) The set of strings generated by 'pragma' and the set generated
>> by 'comment' should be disjoint, and the delimiters should be
>> distinct, so as to convey that the two constructs are distinct.
>> 
>> 
>> Note: the choice among (a), (b), and (c) appears to depend in part
>> on whether 'comment' is taken as a name for 'inline out-of-band
>> information' or for 'inline out-of-band information other than
>> pragmas'.
>> 
>> 
>> Note: It is not clear whether there is currently disagreement on
>> this point.
>> 
>> The SM proposal follows path (b).
>> 
>> The TM proposal uses distinct delimiters for comments and pragmas,
>> but the authors of the proposal have indicated that they would be
>> willing to accept the delimiter pairs '{[' ... ']}' or '{|'
>> ... '|}' and '⦃' ... '⦄' for pragmas, while retaining '{' ... '}'
>> for comments (assuming appropriate grammatical adjustments to
>> prevent ambiguity).
>> 
>> 
>> Arguments for (a)
>> 
>> - If by nature pragmas are comments, then the grammar should
>> reflect that fact.
>> 
>> Arguments for (c)
>> 
>> - If by nature pragmas and comments are distinct objects, then the
>> grammar should reflect that fact.
>> 
>> Arguments for (b)
>> 
>> - If some members of the group feel strongly that by nature
>> pragmas are comments, while others feel that pragmas and
>> comments are distinct objects and neither is a subset of the
>> other, then the grammar cannot fully satisfy both views. But if
>> the similarity of the two constructs can be captured by a
>> similarity of delimiters, and the distinctness of the two
>> constructs can be captured by making them distinct
>> grammatically, then holders of each view may find the spec
>> workable.
>> 
>> 
>> A number of other questions arise, relating to the internal syntax and
>> semantics of pragmas and relating to their place in the larger syntax
>> of ixml, but this mail is already long enough that I expect some
>> readers will be prepared to accuse the author of over-thinking things.
>> So I will stop here. If we can understand where members of the group
>> are on the questions identified above, and the arguments that lead
>> them to their positions, I think it might conduce to progress.
>> 
>> So I repeat my request that members of the group help other members of
>> the group understand their positions by correcting the formulation of
>> what they recognize as their arguments, or by providing formulations
>> for arguments they believe are relevant but missing.
>> 
>> If I have distorted or omitted any argument any member of the group has
>> brought forward, you may reliably take it as an indication that you did
>> not make it clearly enough for me to understand and remember it; please
>> make it again!
>> 
>> (Note, however, that I have attempted to phrase arguments in a neutral
>> tone, so if the only acceptable formulation of your views begins with
>> "it is obvious that ...", you may be disappointed by my paraphrase. But
>> the point of the exercise is to formulate the arguments in a way that
>> lets people understand them even if they disagree with them; phrases
>> that rhetorically demand assent are counter-productive. Meta-arguments
>> of the form "X outweighs Y" are also unhelpful; if X and Y are the
>> relevant arguments, and X weighs for your position, then it is already
>> evident which argument you find weightier.)
>> 
>> I hope this helps.
>> 
>> Michael
>> 
>> --
>> C. M. Sperberg-McQueen
>> Black Mesa Technologies LLC
>> http://blackmesatech.com
>>
Received on Thursday, 3 February 2022 09:24:49 UTC