- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Wed, 02 Feb 2022 12:18:20 -0700
- To: public-ixml@w3.org
During the group's discussions of pragmas, I confess that I have had
some trouble understanding the arguments people are trying to bring to
bear on the issues. Sometimes, I have been told, the best way to
understand the arguments on the other side of an issue better is to
appoint oneself an advocate for the other side and formulate those
arguments oneself. So I am going to try that approach.
Of course, any provision for pragmas involves a number of different
questions, on each of which different arguments may bear.
What follows is an attempt to identify a number of distinct questions
relating to the inclusion of pragmas in our spec and for each question
to identify and summarize in a neutral way the arguments on each side
of the question. Since an important goal of this exercise is to
understand the arguments on each side of each question, I hope that
members of the group will identify places where the lists of arguments
are incomplete or misleading or badly phrased.
I should define some terminology:
- By 'out-of-band information' I mean information that does not
affect the standard interpretation of the relevant ixml grammar.
In the nature of things, many plausible examples will involve
out-of-band information that cannot be expressed using our
grammatical formalism, which is at least approximately the same as
information for which the ixml spec provides no standardized
representation.
Examples already brought forward include requests not to mark
sentences as ambiguous or to optimize the parsing process in some
way that does not affect the parsing results but may affect
resource requirements. But also in the nature of things, any
mechanism we provide for out-of-band information can be used by a
user to include information that could have been expressed
directly in the ixml grammar but was not. And going even further,
such a mechanism can easily be used to include information that
*is* expressed in the grammar; that would be redundant, but very
hard to prohibit.
So any characterization, later on, of information as
'extra-grammatical' or 'non-standardized' should be taken with a
grain of salt as a coarse approximation.
Also in the nature of things, the goals implementors and users
might seek to achieve by the use of out-of-band information are
difficult to bound. Obvious possibilities include varying this or
that aspect of a processor's operation, overriding defaults, or
specifying behavior that is not the usual behavior for the
processor.
- By 'inline' information I mean information conveyed within the
sequence of characters submitted to an ixml processor as the ixml
input grammar.
Other information available to a processor includes information
conveyed in the input string and information conveyed by
invocation-time parameters or options.
- By information 'directed to a processor' I mean information that a
processor can successfully parse and process. The term is not
intended to suggest and should not be taken to mean that the
information cannot also be read, understood, and acted on by a
human being.
- The 'SM' proposal is the 'straw man' proposal on pragmas put
forward by Steven Pemberton in response to the TM proposal.
- The 'TM' proposal is that put forward by Tomos Hillman and me.
- A 'pragma' is inline out-of-band information directed to a
processor.
Question 1: should ixml provide for, or allow for, out-of-band
information?
Note: I do not believe there is currently disagreement on this
point.
Arguments con: If all out-of-band information were forbidden it
would be easier to guarantee interoperability.
Arguments pro: It would be dauntingly difficult to forbid it.
Unless we prohibit the definition of invocation options for a
processor, the processor will have access to out-of-band
information. And even then, it would be hard to guarantee that
processors have no access to out-of-band information.
Question 2: should ixml provide for, or allow for, inline out-of-band
information?
Note: I do not believe there is currently disagreement on this
point.
Arguments con: If all out-of-band information were forbidden it
would be easier to guarantee interoperability.
Arguments pro: It would be dauntingly difficult to forbid it.
Unless we rewrite the grammar of ixml to prohibit comments, the
processor will have access to inline out-of-band information. And
if we did rewrite the grammar of ixml to prohibit comments,
steganographic techniques could be used to embed inline
out-of-band information in an ixml input grammar (using the number
and identity of whitespace characters to encode information, for
example).
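As a rough illustration of the whitespace technique just mentioned, here is a minimal Python sketch. It assumes that trailing spaces and tabs at the end of a line are insignificant to the processor reading the grammar. The function names and the space = 0 / tab = 1 convention are invented for this example and come from no proposal.

```python
def embed_bits(grammar: str, bits: str) -> str:
    """Hide one bit per line as trailing whitespace: space = 0, tab = 1.

    `bits` is a string of '0' and '1' characters (hypothetical convention).
    """
    lines = grammar.split("\n")
    if len(bits) > len(lines):
        raise ValueError("not enough lines to carry the payload")
    marks = {"0": " ", "1": "\t"}
    out = []
    for i, line in enumerate(lines):
        # Lines beyond the payload are left with no trailing whitespace.
        suffix = marks[bits[i]] if i < len(bits) else ""
        out.append(line.rstrip(" \t") + suffix)
    return "\n".join(out)


def extract_bits(grammar: str) -> str:
    """Recover the bits written by embed_bits from trailing whitespace."""
    bits = []
    for line in grammar.split("\n"):
        if line.endswith(" "):
            bits.append("0")
        elif line.endswith("\t"):
            bits.append("1")
    return "".join(bits)


grammar = 'date: year, "-", month.\nyear: digit, digit.\nmonth: digit, digit.'
hidden = embed_bits(grammar, "101")
print(repr(extract_bits(hidden)))
```

The visible text of the grammar is unchanged, so a rule against comments would not stop a determined user from passing information this way.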
Question 3: should ixml provide for, or allow for, inline out-of-band
information directed to processors?
Note: I do not believe there is currently disagreement on this
point.
Arguments con: If all out-of-band information were forbidden it
would be easier to guarantee interoperability.
Arguments pro: It would be dauntingly difficult to forbid it.
Unless we rewrite the grammar of ixml to prohibit comments, users
can use comments to embed out-of-band information directed to a
processor in an ixml input grammar. If we do forbid comments, users
can use other means to do so.
Question 4: should ixml provide distinct constructs for inline
out-of-band information directed to processors and other inline
out-of-band information?
Note: It is not clear whether there is currently disagreement on
this point.
Both the TM proposal and the SM proposal provide distinct
constructs, with nonterminals named 'pragma' and 'comment'. But it
is not clear that those proposals represent the full range of current
opinion in the group.
The natural interpretation of each proposal is that the 'pragma'
construct is intended (as the name suggests) for inline
out-of-band information directed to a processor, and the 'comment'
construct is intended for other inline out-of-band information.
In the nature of things, there is nothing that can prevent a user
from using either construct in the 'other' way. The implicit
assumption appears to be that the interoperability advantages of
writing pragmas using the 'pragma' nonterminal and the relative
pointlessness of writing other inline out-of-band information that
way will suffice to ensure that at a first approximation pragmas are
written using the 'pragma' construct and other inline out-of-band
information using the 'comment' construct. As far as I know, no
relevant arguments on any side of any question rely on the
correlation being perfect.
Arguments pro:
. Providing distinct constructs allows those who believe pragmas
and comments are usefully distinguished to do so and thus makes
ixml grammars clearer and easier to understand.
. Providing distinct constructs reduces the probability that a
comment intended only as a human-readable observation (e.g. the
comment {!} to mark a grammatical rule that the author thinks the
reader might not have expected to see) is misinterpreted by a
processor as a pragma (e.g. as a request to behave in a particular
way); it thus improves the likelihood that a grammar that works
successfully with one processor will also work successfully with
others.
Arguments con:
. Providing a defined construct for implementation-defined
behavior, or even for behavior to be defined by some future
version of a specification, serves as a signal to implementors
that it is acceptable and perhaps even expected that they may or
should provide non-standard behaviors that can be invoked that
way. It thus reduces the likelihood that a grammar that works
successfully with one processor will produce the same results with
others.
. Providing a defined construct for implementation-defined
behavior allows (and may encourage) implementations to use that
construct to specify new or extended behavior, even in cases where
the behavior should (on the grounds of technical soundness and
quality of design) be provided by different syntax and be built
into the base specification.
If (for example) a programming language provides no type system
but does provide a pragma construct, it would be a design error
for compilers to provide type checking by means of pragmas: the
type system should be built into the language, not added by means
of pragmas.
If (for example) a styling language provides no styling property to specify
a particular kind of text rendering, but does provide for
implementation-defined styling properties, it would be a design
error for renderers to provide control over that property by using
an implementation-defined property: the property should be built
into the language, not added by means of implementation-defined
properties.
Question 5: If ixml provides distinct nonterminals for 'pragma' and
'comment' (for pragmas and other inline out-of-band information,
respectively), how should the two nonterminals be related,
grammatically?
(a) The set of strings generated by 'pragma' should be a subset of
those generated by 'comment'.
This makes the statement 'pragmas are comments' true by
grammatical construction.
(b) The set of strings generated by 'pragma' and the set generated
by 'comment' should be disjoint, but the delimiters should be
chosen so as to be visually similar and convey an underlying
affinity between the two constructs.
This makes the statement 'pragmas are comments' not true as
regards the grammatical constructs, but apposite as a metaphor.
(c) The set of strings generated by 'pragma' and the set generated
by 'comment' should be disjoint, and the delimiters should be
distinct, so as to convey that the two constructs are distinct.
Note: the choice among (a), (b), and (c) appears to depend in part
on whether 'comment' is taken as a name for 'inline out-of-band
information' or for 'inline out-of-band information other than
pragmas'.
Note: It is not clear whether there is currently disagreement on
this point.
The SM proposal follows path (b).
The TM proposal uses distinct delimiters for comments and pragmas,
but the authors of the proposal have indicated that they would be
willing to accept the delimiter pairs '{[' ... ']}' or '{|'
... '|}' and '⦃' ... '⦄' for pragmas, while retaining '{' ... '}'
for comments (assuming appropriate grammatical adjustments to
prevent ambiguity).
Arguments for (a)
- If by nature pragmas are comments, then the grammar should
reflect that fact.
Arguments for (c)
- If by nature pragmas and comments are distinct objects, then the
grammar should reflect that fact.
Arguments for (b)
- If some members of the group feel strongly that by nature
pragmas are comments, while others feel that pragmas and
comments are distinct objects and neither is a subset of the
other, then the grammar cannot fully satisfy both views. But if
the similarity of the two constructs can be captured by a
similarity of delimiters, and the distinctness of the two
constructs can be captured by making them distinct
grammatically, then holders of each view may find the spec
workable.
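The difference between option (a) and options (b) and (c) can be made concrete with a small Python sketch. The regular expressions below are simplified stand-ins, not the grammar of either proposal. They ignore nested comments, they assume the '{[' ... ']}' delimiters for pragmas, and the pragma text 'my:option' is a made-up example.

```python
import re

# Simplified, non-nesting stand-ins for the nonterminals (assumptions only).
PRAGMA = re.compile(r"\{\[[^{}]*\]\}")             # '{[' ... ']}'
COMMENT_SUBSET = re.compile(r"\{[^{}]*\}")         # option (a): any braced text
COMMENT_DISJOINT = re.compile(r"\{(?!\[)[^{}]*\}") # (b)/(c): must not open with '{['


def classify(text: str, comment: re.Pattern) -> set:
    """Return which of the two constructs generate the whole string."""
    kinds = set()
    if PRAGMA.fullmatch(text):
        kinds.add("pragma")
    if comment.fullmatch(text):
        kinds.add("comment")
    return kinds


# Under (a) every pragma is also a comment; under (b)/(c) it is only a pragma.
print(classify("{[my:option]}", COMMENT_SUBSET))
print(classify("{[my:option]}", COMMENT_DISJOINT))
print(classify("{a remark}", COMMENT_DISJOINT))
```

In this sketch the only change between (a) and (b)/(c) is the lookahead that keeps 'comment' from generating strings that open with '{['. Whether the delimiters then read as similar, as in (b), or as distinct, as in (c), depends on which delimiter characters are chosen, not on the shape of the grammar.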
A number of other questions arise, relating to the internal syntax and
semantics of pragmas and relating to their place in the larger syntax
of ixml, but this mail is already long enough that I expect some
readers will be prepared to accuse the author of over-thinking things.
So I will stop here. If we can understand where members of the group
are on the questions identified above, and the arguments that lead
them to their positions, I think it might conduce to progress.
So I repeat my request that members of the group help other members of
the group understand their positions by correcting the formulation of
what they recognize as their arguments, or by providing formulations
for arguments they believe are relevant but missing.
If I have distorted or omitted any argument any member of the group has
brought forward, you may reliably take it as an indication that you did
not make it clearly enough for me to understand and remember it; please
make it again!
(Note, however, that I have attempted to phrase arguments in a neutral
tone, so if the only acceptable formulation of your views begins with
"it is obvious that ...", you may be disappointed by my paraphrase. But
the point of the exercise is to formulate the arguments in a way that
lets people understand them even if they disagree with them; phrases
that rhetorically demand assent are counter-productive. Meta-arguments
of the form "X outweighs Y" are also unhelpful; if X and Y are the
relevant arguments, and X weighs for your position, then it is already
evident which argument you find weightier.)
I hope this helps.
Michael
--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Wednesday, 2 February 2022 19:18:44 UTC