Re: Attribute markings - a question

Thank you; I had neglected, overlooked, or forgotten the words "regardless of marking of intermediate nonterminals".
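
The two serialisation rules quoted below can be sketched in Python as follows. This is an illustration only, not Steven's implementation: the names Terminal, Nonterminal, attribute_value, exposed_attributes and serialise are hypothetical, marks are modelled as the strings "^", "@" and "-", and the output is not indented and uses double-quoted attribute values.

```python
from dataclasses import dataclass, field
from typing import List, Union
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Terminal:
    text: str
    hidden: bool = False  # True for a terminal marked "-"


@dataclass
class Nonterminal:
    name: str
    mark: str = "^"  # "^" = element, "@" = attribute, "-" = hidden
    children: List["Node"] = field(default_factory=list)


Node = Union[Terminal, Nonterminal]


def attribute_value(node: Node) -> str:
    """Concatenate all non-hidden terminal descendants, in order.

    Marks on intermediate nonterminals are deliberately ignored.
    """
    if isinstance(node, Terminal):
        return "" if node.hidden else node.text
    return "".join(attribute_value(child) for child in node.children)


def exposed_attributes(node: Nonterminal) -> List[Nonterminal]:
    """Attribute children, plus exposed attributes of hidden element children."""
    found: List[Nonterminal] = []
    for child in node.children:
        if isinstance(child, Nonterminal):
            if child.mark == "@":
                found.append(child)
            elif child.mark == "-":
                found.extend(exposed_attributes(child))  # recursive case
    return found


def serialise(node: Node) -> str:
    if isinstance(node, Terminal):
        return "" if node.hidden else escape(node.text)
    if node.mark == "@":
        return ""  # already emitted as an attribute of the enclosing element
    content = "".join(serialise(child) for child in node.children)
    if node.mark == "-":
        return content  # hidden nonterminal: only its content appears
    attrs = "".join(
        f" {a.name}={quoteattr(attribute_value(a))}"
        for a in exposed_attributes(node)
    )
    return f"<{node.name}{attrs}>{content}</{node.name}>"


def wrap(name: str, mark: str, text: str) -> Nonterminal:
    """Build name -> string -> text, with 'string' implicitly marked ^."""
    return Nonterminal(name, mark, [Nonterminal("string", "^", [Terminal(text)])])


tree = Nonterminal("S", "^", [
    wrap("able", "@", "aaa."),
    wrap("baker", "^", "bbb."),
    wrap("charlie", "@", "ccc."),
])
result = serialise(tree)
print(result)
```

For the tree corresponding to Steven's unambiguous grammar and the input aaa.bbb.ccc., this gives able and charlie as attributes of S and baker as a child element containing string, which matches the result he reports apart from whitespace and quote style.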

Michael

> On 13 Oct 2021, at 3:16 AM, Steven Pemberton <steven.pemberton@cwi.nl> wrote:
> 
> It may be just early in the morning and the coffee hasn't yet kicked in, but I don't see the problem.
> 
> I checked in my implementation, making the grammar unambiguous in the process:
> 
>  S : @able, baker, @charlie.
>  able: string.
>  baker: string.
>  charlie: string.
>  string: ["abc"]*, ".".
> 
> Input:
>  aaa.bbb.ccc.
> 
> Result:
>  <S able='aaa.' charlie='ccc.'>
>     <baker>
>        <string>bbb.</string>
>     </baker>
>  </S>
> 
> Which was what I was expecting.
> 
> So assuming I'm not missing something obvious, I suspect that you need to reread the serialisation section of the spec:
> 
> "
>  • A nonterminal attribute is serialised by outputting the name of the node as an attribute, and serialising all non-hidden terminal descendants of the node (regardless of marking of intermediate nonterminals), in order, as the value of the attribute.
> "
> which I think covers what you are asking for.
> 
> The other side of this coin is:
> 
> "
>  • A nonterminal element is serialised by outputting the name of the node as an XML tag, serialising all exposed attribute descendants, and then serialising all non-attribute children in order. An attribute is exposed if it is an attribute child, or an exposed attribute of a hidden element child (note this is recursive).
> "
> 
> Steven
> 
> On Wednesday 13 October 2021 04:19:52 (+02:00), C. M. Sperberg-McQueen wrote:
> 
> > Consider the grammar 
> > 
> > S : @able, baker, @charlie.
> > able: string.
> > baker: string.
> > charlie: string.
> > string: ~[]*.
> > 
> > Is this grammar OK? (Yes, it’s hopelessly ambiguous, but that’s beside the point.)
> > 
> > If we ignored the annotations, a raw parse tree for this grammar might look like this:
> > 
> > <S>
> > <able mark="@"><string>aaa</string></able>
> > <baker><string>aaa</string></baker>
> > <charlie mark="@"><string>ccc</string></charlie>
> > </S>
> > 
> > Note that ‘string’ is implicitly marked serializable (^). 
> > 
> > When a nonterminal marked to be serialized as an element appears as a child of a nonterminal marked to be serialized as an attribute (as ‘string’ here appears as a child of @able and @charlie), is the rule 
> > 
> > - Raise an error because the grammar cannot be serialized that way?
> > 
> > - Omit the content of ‘string’ from the value of @able and @charlie by analogy with what happens when we calculate the text node children of an element?
> > 
> > - Ignore the marking on ‘string’ on the grounds that we have already been told that @able is an attribute? Since elements cannot appear within attributes, the implicit ^ marking on ‘string’ is ignored.
> > 
> > The grammar for ixml offers two examples that seem relevant: in a raw parse tree, @name will dominate nodes labeled namestart and namefollower, which are explicitly marked non-serializable (-). @dstring and @sstring similarly dominate nodes labeled dchar and schar, which are implicitly marked ^. The attributes @from and @to directly dominate nodes labeled ‘character’ (marked -) and indirectly dominate nodes labeled ‘dchar’ and ‘schar’ (implicitly ^).
> > 
> > In the spirit of making things as simple as possible for the grammar authors, I suppose the right rule is “when constructing the value of an attribute, treat nonterminals marked ^ and - the same: recur through them” (the last possibility mentioned above).
> > 
> > I apologize if this has been discussed before - I have the guilty sensation that it has been, and that I did not retain the answer.
> > 
> > Michael
> > 

Received on Wednesday, 13 October 2021 15:10:46 UTC