An interesting attribute bug^H^H^Hfeature

Hello,

None (or very few) of the current syntax tests are “positive tests”.
Consequently, the recent grammar changes don’t show up in the test
suite, which is kind of bad.

I thought I’d write a few, and immediately encountered an interesting
bug. What makes it interesting, really, is that Steven’s implementation
and mine both produce the same result.

Here’s the grammar (simplified from a slightly larger one, hence “and
another”):

S: 'a' { and another } .

And here’s what both Steven and I produce:

<ixml>
   <rule name='S'>
      <alt>
         <literal string='a and another '/>
      </alt>
   </rule>
</ixml>

(This is a parse against the 2022-02-22 grammar; I’m not planning to
switch to the new grammar until the namefollower change is implemented
and bug #57 is fixed.)

I think what’s happening is the comment is getting caught up in the
“attributeness” of string:

     @string: -"'", schar+, -"'", s.

So the questions are: is it a bug and, if it is, where is the bug?

Given:

test: @foo .
foo: 'foo', bar .
bar: 'bar' .

If we parse foobar, we expect

<test foo="foobar"/>

even though “bar” is marked as an element. What the spec says is:

  A nonterminal attribute is serialised by outputting the name of the
  node as an attribute, and serialising all non-hidden terminal
  descendants of the node (regardless of marking of intermediate
  nonterminals), in order, as the value of the attribute.
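
To make that rule concrete, here is a minimal Python sketch of it. This
is not code from either implementation; the Terminal and Nonterminal
classes and the attribute_value function are hypothetical names invented
for illustration, and the shape of the tree for the string example
(whitespace hidden, comment text not hidden) is my assumption about how
the 2022-02-22 grammar parses that input.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Terminal:
    text: str
    hidden: bool = False  # True models a "-" mark: matched, never serialised


@dataclass
class Nonterminal:
    name: str
    mark: str = "^"  # "^" element, "@" attribute, "-" hidden
    children: List[Union["Nonterminal", Terminal]] = field(default_factory=list)


def attribute_value(node: Union[Nonterminal, Terminal]) -> str:
    """Join all non-hidden terminal descendants, in order.

    The marks on intermediate nonterminals are deliberately ignored,
    as the quoted rule requires.
    """
    if isinstance(node, Terminal):
        return "" if node.hidden else node.text
    return "".join(attribute_value(child) for child in node.children)


# test: @foo .   foo: 'foo', bar .   bar: 'bar' .
foo = Nonterminal("foo", "@", [
    Terminal("foo"),
    Nonterminal("bar", "^", [Terminal("bar")]),  # element mark is ignored
])
foobar_value = attribute_value(foo)  # "foobar"

# Assumed parse of  'a' { and another }  by
#   @string: -"'", schar+, -"'", s.
string = Nonterminal("string", "@", [
    Terminal("'", hidden=True),
    Nonterminal("schar", "-", [Terminal("a")]),
    Terminal("'", hidden=True),
    Nonterminal("s", "-", [
        Terminal(" ", hidden=True),            # whitespace is hidden
        Nonterminal("comment", "^", [
            Terminal("{", hidden=True),
            Terminal(" and another "),         # comment text is not hidden
            Terminal("}", hidden=True),
        ]),
        Terminal(" ", hidden=True),
    ]),
])
string_value = attribute_value(string)  # "a and another "
```

Under those assumptions, the second tree yields exactly the value that
appears in the literal above: the comment’s text is a non-hidden
terminal below an @-marked nonterminal, so it ends up in the attribute.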

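I started this message thinking this was an implementation bug, but I’ve
persuaded myself it’s a grammar bug. And an interesting one, too!
Because string is marked as an attribute and ends with s, the text of a
trailing comment is a non-hidden terminal descendant of string, so both
implementations are right to put it in the attribute value.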

On casual inspection, this bug isn’t present in the 2022-03-17 grammar.
None of the @-marked nonterminals are allowed to accidentally slurp up
trailing whitespace or comments.

                                        Be seeing you,
                                          norm

--
Norm Tovey-Walsh
Saxonica

Received on Saturday, 26 March 2022 16:28:52 UTC