- From: Norm Tovey-Walsh <ndw@nwalsh.com>
- Date: Fri, 28 Jan 2022 19:18:00 +0000
- To: ixml <public-ixml@w3.org>
- Message-ID: <m2fsp7zn1g.fsf@nwalsh.com>
Hello,

In the last meeting, it seemed like we were lacking a common understanding of “what is a pragma?” and that having such an understanding might help move the discussion forward.

Bethan and I have been tossing the question back and forth, cribbing heavily from Tom and Michael’s excellent work, trying to formulate something that might give us a place to start. We offer what follows in the hope that it will kick-start discussion to help us move forward before next week’s meeting.

# What are pragmas?

Broadly speaking, pragmas are instructions embedded in source code, which extend the structures available in the language of the source code. They are generally used to modify the processor's behaviour in some way, but do not normally change the semantics of the source code. Pragmas are often implementation-specific, although a set of standard pragmas is specified by some languages.

Since pragmas are an extension mechanism, they will be used for anything that users and implementors can agree would be useful behavior (whether it’s conformant or not, if we’re being perfectly honest). But if we don’t provide a mechanism, users and implementors will find some other, unspecified and non-interoperable way to achieve their ends.

Below is a selection of use cases, paraphrased from Tom and Michael's “Pragmas for ixml”[1]. (We apologize in advance for any misrepresentation of their use cases caused by clumsy paraphrasing.)

It isn’t necessary, or even likely, that we'll all agree that pragmas are the right way to handle every one of these cases. But if we can all see the utility of pragmas for *some* of these cases, even if we don’t agree on which ones, then it makes sense to specify some sort of structured pragma mechanism for ixml.

## Token annotation

Allows the author to indicate that a set of tokens, for example, define a regular language, or can be safely recognized by a greedy regular-expression match.

## Rule rewriting

Uses a pragma to specify that a rule as given is shorthand for another set of rules which might be obtained by rewriting the rule as given.

## Alternative formulations

Provides for different formulations of rules. This might be used, for example, to specify an alternative that is known to be optimized more successfully on a particular processor.

[Note: we use the term “final serialization” in our paraphrase of the next three Tom/Michael examples to mean a downstream output, rather than vxml. Otherwise, each of these cases has at least the potential to alter the semantics of the ixml grammar.]

## Indirection and/or renaming

Authors may wish to specify that different names should be used in the final serialization of a parse. Annotations might allow indirection (specifying that the name comes from some other nonterminal) or simple renaming.

## Text injection

Offers a mechanism for inserting text into the final serialization, in attribute values or in text content.

## Adding namespaces

Some users will find benefit in having the output of an ixml processor produce a final serialization that uses a default namespace, or includes a specific set of namespace bindings. Annotations offer a convenient mechanism for providing this information to the processor.

A few additional use cases for pragmas that occur to us:

## Manipulating processor warnings

Requests that warnings are output if certain conditions arise, even if those conditions are not normally cause for a warning.

## Choosing an output parse

If more than one valid parse is found, uses a set of requirements to select a preferred parse for the vxml output.

## Declaring character encoding

Declares the encoding of an ixml input string, and raises an error if the string does not match the declaration.

## Requesting a downstream serialization

Tells the processor to serialize its vxml output in the requested format (for example, JSON, or a CSV file).

## Input string rejection

Rejects an input string, before attempting to parse it, if it fails to meet stated requirements (e.g. rejects strings that are too long, or strings composed of characters other than [a-zA-Z]). A check like this is easy and cheap to do before wasting resources on a full parse of a string that the grammar author knows cannot possibly be recognized.

All these use cases, ours and Tom/Michael’s, share the features of being extension directives, whose functionality is not a part of the core ixml syntax, and which modify the processor’s behaviour in some way.

We take the position that conformant interpretation of pragmas should not alter the semantics of an ixml grammar. (You can’t have a conformant pragma that changes the rules!) We think Tom and Michael might disagree and look forward to their spirited input :-)

And, of course, if there are comments, critiques, or counter-proposals, please send them to the group!

[1] https://github.com/invisibleXML/ixml/blob/proposal-pragmas/misc/pragmas.md

Be seeing you,
norm (& bethan)

--
Norman Tovey-Walsh <ndw@nwalsh.com>
https://nwalsh.com/

> No matter how cynical I get, I find I just can't keep up.--Lily Tomlin

Received on Friday, 28 January 2022 19:19:09 UTC