What is a pragma?

Hello,

In the last meeting, it seemed like we were lacking a common
understanding of “what is a pragma?” and that having such an
understanding might help move the discussion forward.

Bethan and I have been tossing the question back and forth, cribbing
heavily from Tom and Michael’s excellent work, trying to formulate
something that might give us a place to start.

We offer what follows in the hope that it will kick-start discussion
to help us move forward before next week’s meeting.

# What are pragmas?

Broadly speaking, pragmas are instructions embedded in source code,
which extend the structures available in the language of the source
code. They are generally used to modify the processor's behaviour in
some way, but do not normally change the semantics of the source code.
Pragmas are often implementation-specific, although a set of standard
pragmas is specified by some languages.

Since pragmas are an extension mechanism, they will be used for
anything that users and implementors can agree would be useful
behaviour (whether it’s conformant or not, if we’re being perfectly
honest). But if we don’t provide a mechanism, users and implementors
will find some other, unspecified and non-interoperable way to achieve
their ends.

Below is a selection of use cases, paraphrased from Tom and Michael's
“Pragmas for ixml”[1]. (We apologize in advance for any
misrepresentation of their use cases caused by clumsy paraphrasing.)

It isn’t necessary, or even likely, that we'll all agree that pragmas
are the right way to handle every one of these cases. But if we can
all see the utility of pragmas for *some* of these cases, even if we
don’t agree on which ones, then it makes sense to specify some sort of
structured pragma mechanism for ixml.

## Token annotation

Allows the author to indicate, for example, that a set of tokens
defines a regular language, or can be safely recognized by a greedy
regular-expression match.

## Rule rewriting

Uses a pragma to specify that a rule as given is shorthand for another
set of rules which might be obtained by rewriting the rule as given.

## Alternative formulations

Provides for different formulations of rules. This might be used, for
example, to specify an alternative that is known to be optimized more
successfully on a particular processor.

[Note: we use the term “final serialization” in our paraphrase of the
next three Tom/Michael examples to mean a downstream output, rather
than vxml. Otherwise, each of these cases has at least the potential
to alter the semantics of the ixml grammar.]

## Indirection and/or renaming

Authors may wish to specify that different names should be used in the
final serialization of a parse. Annotations might allow indirection
(specifying that the name comes from some other nonterminal) or simple
renaming.

## Text injection

Offers a mechanism for inserting text into the final serialization, in
attribute values or in text content.

## Adding namespaces

Some users will find benefit in having the output of an ixml processor
produce a final serialization that uses a default namespace, or
includes a specific set of namespace bindings. Annotations offer a
convenient mechanism for providing this information to the processor.

A few additional use cases for pragmas that occur to us:

## Manipulating processor warnings

Requests that warnings are output if certain conditions arise, even if
those conditions are not normally cause for a warning.

## Choosing an output parse

If more than one valid parse is found, uses a set of requirements to
select a preferred parse for the vxml output.

## Declaring character encoding

Declares the encoding of an ixml input string, and raises an error if
the string does not match the declaration.
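
To make the encoding case a little more concrete, here is a minimal
Python sketch of the kind of check a processor might run. The function
name `check_declared_encoding` is hypothetical, and nothing here is
defined by ixml or by the pragmas proposal; it only illustrates
"decode with the declared encoding, raise an error on a mismatch".

```python
def check_declared_encoding(raw: bytes, declared: str) -> str:
    """Decode raw input with the declared encoding, or raise ValueError.

    Hypothetical helper: how a pragma would actually supply `declared`
    is an assumption, not part of any specification.
    """
    try:
        return raw.decode(declared)
    except UnicodeDecodeError as exc:
        # The bytes are not a valid sequence in the declared encoding.
        raise ValueError(f"input is not valid {declared}") from exc
```

A check like this can only catch byte sequences that are invalid in
the declared encoding; an encoding such as Latin-1 accepts every byte
sequence, so a mismatch there would go unnoticed.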

## Requesting a downstream serialization

Tells the processor to serialize its vxml output in the requested
format (for example, JSON, or a CSV file).

## Input string rejection

Rejects an input string, before attempting to parse it, if it fails to
meet stated requirements (e.g. rejects strings that are too long, or
strings that contain characters other than [a-zA-Z]). It is easy and
cheap to do this before wasting resources on a full parse of a string
that the grammar author knows cannot possibly be recognized.
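
As a rough illustration of the pre-parse check described above, here
is a minimal Python sketch. The function name `prefilter`, the
`max_length` parameter and its default, and the `ALLOWED` pattern are
all assumptions made up for this example; neither ixml nor the
pragmas proposal defines them.

```python
import re

# Hypothetical constraint a grammar author might state via a pragma:
# the input may contain only ASCII letters.
ALLOWED = re.compile(r"[a-zA-Z]*")


def prefilter(text: str, max_length: int = 1000) -> bool:
    """Return True if the input is worth handing to the parser."""
    if len(text) > max_length:
        return False  # too long: reject without parsing
    # fullmatch succeeds only if every character is in [a-zA-Z]
    return ALLOWED.fullmatch(text) is not None
```

A processor that honoured such a pragma could call something like
`prefilter(input_string)` first and skip the full parse whenever it
returns False.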

All these use cases—ours and Tom/Michael’s—share the features of
being extension directives, whose functionality is not a part of the
core ixml syntax, and which modify the processor’s behaviour in some
way.

We take the position that conformant interpretation of pragmas should
not alter the semantics of an ixml grammar. (You can’t have a
conformant pragma that changes the rules!) We think Tom and Michael
might disagree and look forward to their spirited input :-) And, of
course, if there are comments, critiques, or counter-proposals, please
send them to the group!

[1] https://github.com/invisibleXML/ixml/blob/proposal-pragmas/misc/pragmas.md

                                        Be seeing you,
                                          norm (& bethan)

--
Norman Tovey-Walsh <ndw@nwalsh.com>
https://nwalsh.com/

> No matter how cynical I get, I find I just can't keep up.--Lily Tomlin

Received on Friday, 28 January 2022 19:19:09 UTC