- From: Graydon <graydonish@gmail.com>
- Date: Wed, 5 Feb 2025 11:35:42 -0500
- To: Bethan Tovey-Walsh <bytheway@linguacelta.com>
- Cc: ixml <public-ixml@w3.org>
It is rash of me to wade in to this discussion, but let's see how far I get before the waters close over me.

On Wed, Feb 05, 2025 at 02:03:31PM +0000, Bethan Tovey-Walsh scripsit:
> 1. The "foundational rules", which were unanimously accepted as being
> of primary importance by the CG - i.e. the first two requirements:
> pragmas must not affect the syntactic validity, or change the
> semantics, of the base grammar.

I have formed a vague impression that one of the functional constraints on pragmas is a preference that whatever pragma syntax is chosen should not make it difficult to parse an ixml grammar with ixml.

From a practical perspective, pragmas as an idea are a way to provide information outside the scope of the language. In the ixml case, something that comes to mind immediately is serialization of the output. That isn't something that fits in the grammar, but it is something a user would care about.

So we have to be able to identify which pragma, and bind that pragma to a scope (I think, effectively "result", "non-terminal", "terminal"). It also needs to be possible for an implementer to do what they like; one might offer "you can have indented" for serialization and another might offer the full set of options for `xsl:result-document`.

I'm inclined to think that the general trend of passing information around as maps is a good one, and that a pragma ought to be expressed as an XPath map which must have a `name` entry with a QName value, must have a `scope` entry with a value from a set list (presumably 'result', 'non-terminal', 'terminal', or words to that effect), and may have any other XPath map entry it wants.

I'd further want to say that pragma definitions go at the top of the ixml grammar file and 'result' scope pragmas don't appear anywhere else. A non-terminal pragma reference appears to the left of the equals sign. A terminal pragma reference appears to the left of the full stop.
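
As a rough illustration of the map requirements just described, here is a minimal Python sketch. It uses a Python dict as a stand-in for an XPath map. The function name `check_pragma`, the exact spelling of the scope values, and the simplified ASCII-only QName pattern are all assumptions made for illustration; none of them comes from the ixml specification or from any agreed proposal.

```python
import re

# Assumed scope vocabulary, taken from the tentative list in the text
# ('result', 'non-terminal', 'terminal' "or words to that effect").
ALLOWED_SCOPES = {"result", "non-terminal", "terminal"}

# Simplified ASCII-only approximation of a QName: an optional prefix and
# colon, then a local name. Real QNames allow many more characters.
QNAME = re.compile(r"(?:[A-Za-z_][\w.\-]*:)?[A-Za-z_][\w.\-]*", re.ASCII)


def check_pragma(pragma):
    """Return a list of problems with one pragma map; empty means acceptable."""
    problems = []

    # Required: a 'name' entry whose value looks like a QName.
    name = pragma.get("name")
    if not isinstance(name, str) or not QNAME.fullmatch(name):
        problems.append("'name' entry is missing or is not a QName")

    # Required: a 'scope' entry drawn from the fixed list.
    scope = pragma.get("scope")
    if not isinstance(scope, str) or scope not in ALLOWED_SCOPES:
        problems.append("'scope' entry is missing or not an allowed value")

    # Any other entries are left for the implementer to interpret.
    return problems
```

Extra entries such as an indentation setting pass through unchecked, which matches the idea that what a given pragma means is up to the implementer.
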
(Should there be a space and nothing else between the pragma reference and the equals sign or stop? Probably.)

Pragma references are by name, so pragma definitions are required to have unique names within a single grammar. What names have meaning is up to the implementer.

Something vaguely like:

    #%# { 'name': 'serialize', 'scope': 'result', 'indent': 'yes' }
    #%# { 'name': 'normal-form', 'scope': 'non-terminal', 'form': 'NFKD' }
    #%# { 'name': 'trouble', 'scope': 'terminal' }
    ...
    fullQuote %#%normal-form = openQuote, quotedText, closeQuote.
    NL = #A %#%trouble .

I don't know if a "done with pragmas" marker would be desirable; I am quite sure the #%# and %#% will not be anyone else's first thought. But I think the general pattern of "a pragma is defined by a map, the map has requirements, the pragmas are defined at the top of the file, sub-whole-result pragmas are used by reference" is both manageable for an implementer and retains a syntax where it is relatively straightforward to process an ixml grammar with ixml. (And the pragmas can clearly be deleted from the grammar document without affecting its utility as an ixml grammar.)

How you get parameterized pragmas (sometimes I want to normalize the non-terminal as NFC, sometimes as NFKD) would, I think, fit into the map construction, and could plausibly be left up to the implementer, except perhaps for a decision between a lookup-operator style, %#%normal-form?NFC, and parentheses, %#%normal-form(NFC), the latter of which would perhaps do a better job of supporting multiple params if that's a place people want to go.

-- Graydon

--
Graydon Saunders | graydonish@fastmail.com
Þæs oferéode, ðisses swá mæg.
-- Deor ("That passed, so may this.")
Received on Wednesday, 5 February 2025 16:35:49 UTC