Re: ACTION: Steven to draft a concrete proposal for [prefix handling spec text]

Steven Pemberton writes:

> In section "Parsing"
> https://invisiblexml.org/ixml-specification.html#parsing
>
> Processors must accept and parse any conforming grammar, and produce
> at least one parse of supplied input that matches the grammar starting
> at the root symbol.
>
> =========
> In section Conformance of processors
> https://invisiblexml.org/ixml-specification.html#L7076
>
> For any conforming grammar and any input:
>
> * Processors must parse by default the entire input using the grammar,
>   determining in the process whether or not the input is described by
>   the grammar. Processors may provide user options for other behaviors,
>   such as parsing the largest, or smallest, prefix of the input that is
>   described by the grammar, or supporting invocation with input streams
>   of indeterminate length.
>
> * If the input is unambiguously described by the grammar, the
>   resulting parse tree must be serialized to an XML document.
>
> * If the root node in the grammar is marked as an attribute,
>   processors must ignore that marking whenever serializing the rule as
>   the root.
>
> * If more than one parse tree describes the input, the processor must
>   serialize one of them. It is not defined how this choice is made,
>   but the resulting parse should be marked as ambiguous by including on
>   the document element of the serialisation the attribute
>   ixml:state="ambiguous". Processors may provide a user option to
>   suppress that attribute; they may also provide a user option to
>   produce more than one parse tree.
>
> * If the parse fails, the processor must produce some XML document
>   with ixml:state="failed" on the document element, with helpful
>   information about where and why it failed; it may be a partial parse
>   tree that includes parts of the parse that succeeded.
>
> * If the parse succeeded, but without consuming the entirety of the
>   input, processors may choose either to produce a failure document as
>   described above, or to serialize the resulting parse tree with the
>   attribute ixml:state="prefix", or if the parse is ambiguous
>   ixml:state="ambiguous prefix".
>
> * The form in which XML documents are produced is not constrained by
>   this specification; processors should be capable of producing
>   serialized XML as a character stream, but other forms (e.g. DOM
>   instances or XDM instances) may also be used.
>
> Note: the requirements require that grammars be processed by an
> algorithm that accepts and parses any context-free grammar; known
> parsing algorithms of this class include [Earley], [Unger], [CYK],
> [GLR], and [GLL]; see also [Grune].
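
As a reading aid, the ixml:state rules in the quoted bullets can be
sketched as a small decision function. This is only an illustration
under stated assumptions, not part of the proposed spec text: the names
ixml_state, parse_count, consumed_all, and report_prefix are
hypothetical, and the sketch ignores the optional user setting that
suppresses the ambiguity attribute.

```python
from typing import Optional


def ixml_state(parse_count: int, consumed_all: bool,
               report_prefix: bool = False) -> Optional[str]:
    """Return the ixml:state value for the document element, or None.

    parse_count   -- number of parse trees found (hypothetical input)
    consumed_all  -- True if the parse covered the entire input
    report_prefix -- models the processor's choice, for a parse that
                     stops short of the end of input, between a failure
                     document (False) and a prefix result (True)
    """
    if parse_count == 0:
        # No parse at all: a failure document is required.
        return "failed"
    ambiguous = parse_count > 1
    if not consumed_all:
        if not report_prefix:
            # The processor opted to treat a partial parse as a failure.
            return "failed"
        return "ambiguous prefix" if ambiguous else "prefix"
    # Entire input consumed: mark only if more than one tree exists.
    return "ambiguous" if ambiguous else None
```

For example, ixml_state(1, True) yields no attribute at all, while
ixml_state(2, False, report_prefix=True) yields "ambiguous prefix".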

In case other people, like me, have a little trouble tracking the
changes made here, I attach an HTML document showing the changes I
understand Steven to be proposing.

My quick reactions:

Mostly good; thank you.

I don't understand the new ordering of topics.  In the old text, I see
an attempt to describe the core cases and then special cases and
optional features; in the new text I don't see the principle underlying
the sequence of items.

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Tuesday, 15 February 2022 01:08:20 UTC