An ixml grammar with all the bugs fixed

Hi folks,

I found myself getting a little impatient for a grammar that fixes the
open bugs and accepted enhancements, so I took it upon myself to craft
one. I thought it might be interesting to others, so here it is. The
full grammar is attached, after an annotated diff from the version in
the 2022-03-29 spec.

1c1,7
<          ixml: s, rule+S, s.
---
>          ixml: prolog, s, rule++WS, s.

I’ve renamed “S” to “WS” (issue #62) and implemented “**” for repeat0 and
“++” for repeat1 (issue #59).
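
As an illustration of the notation change, here is a small sketch with
made-up rule names (list, opt-list, item, s). It assumes only the
repeat0 and repeat1 rules shown in this diff and has not been run
through a processor.

```ixml
{ 2022-03-29 notation:  list: item+(",", s).  opt-list: item*(",", s). }
list: item++(",", s).
opt-list: item**(",", s).
item: ["a"-"z"]+.
s: " "*.
```

A single “+” or “*” still means plain repetition, as in the item and s
rules; only the doubled forms take a separator.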

>        prolog: s, (version, s)?, nsdecl**s .
>       version: -"ixml", WS, -"version", WS, -string, s, -'.' .
>        nsdecl: -"namespace", WS, prefix, s, -["=:"], s, uri, -"." .
>       @prefix: -ncname.
>          @uri: -string.

I’ve speculatively included both the version (issue #63) and namespace
(issue #66) proposals, partly to see if they add a lot of complexity.
They don’t really.

In practice, I’ve implemented something slightly different from the
current namespaces proposal. I’ve used “uri” instead of “nshref” for the
namespace URI. I think the string “href” tends to suggest that you can
or should dereference it, and that’s not usually the impression one
wants to give for namespace names.
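
For illustration, a grammar that starts with a prolog of this shape
might look like the sketch below. The version string "1.0", the prefix
“ex”, the URIs and the rule names are all hypothetical; the sketch only
follows the prolog, version and nsdecl rules quoted above.

```ixml
ixml version "1.0".
namespace ex = "http://example.com/ns".
namespace other: "http://example.com/other".

ex:doc = ex:item++",".
ex:item = ["a"-"z"]+.
```

Note that nsdecl accepts either “=” or “:” between the prefix and the
URI, which is why both spellings appear.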

4c10
<            -S: (whitespace; comment)+.
---
>           -WS: (whitespace; comment)+.

Issue #62.

14,15c20,21
<          alts: alt+(-[";|"], s).
<           alt: term*(-",", s).
---
>          alts: alt++(-[";|"], s).
>           alt: term**(-",", s).
23,24c29,32
<       repeat0: factor, -"*", s, sep?.
<       repeat1: factor, -"+", s, sep?.
---
>       repeat0: factor, -"*", s;
>                factor, -"**", sep, s.
>       repeat1: factor, -"+", s;
>                factor, -"++", sep.

Issue #59

29c37,38
<         @name: namestart, namefollower*.
---
>         @name: (ncname, ':')?, ncname.
>       -ncname: namestart, namefollower*.

This change is for the namespaces proposal. I’ve gone down a slightly
different route, but I think the result is the same, so it’s just a
matter of taste, really. I like the similarity with other specs that
comes from the use of an “ncname” nonterminal.

The namespaces proposal also suggests making a space after the “:” in a
rule mandatory. I don’t believe that’s necessary. There’s nothing
ambiguous about

  prefix:local-name:prefix:rhs-nonterminal.

I think it’d be visually improved by a space after the second colon or by
using an “=” instead of a colon, but it’s fine as it is.
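
The three spellings can be put side by side in a short sketch. The
prefix “ex”, its URI and the rule names are invented for the example,
and the namespace declaration follows the nsdecl rule from earlier in
this diff.

```ixml
namespace ex = "http://example.com/ns".

{ no space after the rule's colon }
ex:compact:ex:item++",".
{ a space after the second colon }
ex:spaced: ex:item++",".
{ "=" as the rule separator }
ex:equals = ex:item++",".
ex:item: ["a"-"z"]+.
```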

42c51
<         dchar: ~['"'; #a; #d];
---
>        -dchar: ~['"'; #a; #d];
44c53
<         schar: ~["'"; #a; #d];
---
>        -schar: ~["'"; #a; #d];

Issues #63 and #66 need an explicit “-” on the dchar and schar rules.

53c62
<          -set: -"[", s,  member*(-[";|"], s), -"]", s.
---
>          -set: -"[", s,  member**(s, -[";|"], s), s, -"]", s.

Fix #57.
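
A sketch of what the revised rule is meant to accept, with made-up rule
names. It assumes that a set member carries no trailing whitespace of
its own, which is how the 2022-03-29 grammar reads.

```ixml
{ spaces before each separator and before the closing bracket:
  accepted by the revised rule, rejected by the 2022-03-29 one }
vowel: [ "a" ; "e" ; "i" ; "o" ; "u" ].
{ the compact form is accepted by both }
sign: ["+"|"-"].
```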

                                        Be seeing you,
                                          norm
--
Norm Tovey-Walsh
Saxonica

Received on Sunday, 10 April 2022 08:22:25 UTC