ACTION: Steven to draft a concrete proposal for [prefix handling spec text]

In section "Parsing"
https://invisiblexml.org/ixml-specification.html#parsing

Processors must accept and parse any conforming grammar, and produce at 
least one parse of supplied input that matches the grammar starting at the 
root symbol.

=========
In section "Conformance of processors"
https://invisiblexml.org/ixml-specification.html#L7076

For any conforming grammar and any input:

* Processors must by default parse the entire input using the grammar, 
determining in the process whether or not the input is described by the 
grammar. Processors may provide user options for other behaviors, such as 
parsing the largest, or smallest, prefix of the input that is described by 
the grammar, or supporting invocation with input streams of indeterminate 
length.

* If the input is unambiguously described by the grammar, the resulting 
parse tree must be serialized to an XML document.

* If the root node in the grammar is marked as an attribute, processors 
must ignore that marking whenever serializing the rule as the root.

* If more than one parse tree describes the input, the processor must 
serialize one of them. It is not defined how this choice is made, but the 
resulting parse should be marked as ambiguous by including on the document 
element of the serialization the attribute ixml:state="ambiguous". 
Processors may provide a user option to suppress that attribute; they may 
also provide a user option to produce more than one parse tree.

* If the parse fails, the processor must produce some XML document with 
ixml:state="failed" on the document element, with helpful information about 
where and why it failed; it may be a partial parse tree that includes parts 
of the parse that succeeded.

* If the parse succeeds, but without consuming the entirety of the input, 
processors may choose either to produce a failure document as described 
above, or to serialize the resulting parse tree with the attribute 
ixml:state="prefix" or, if the parse is ambiguous, 
ixml:state="ambiguous prefix".

* The form in which XML documents are produced is not constrained by this 
specification; processors should be capable of producing serialized XML as 
a character stream, but other forms (e.g. DOM instances or XDM instances) 
may also be used.
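
As an illustration only, the rules above for choosing the ixml:state value 
can be sketched in Python. The names ParseOutcome, ixml_state and 
allow_prefix are hypothetical and are not part of the specification or of 
any processor; the sketch also assumes a processor that, by default, 
reports an incomplete parse as a failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParseOutcome:
    """Hypothetical summary of a parse; not part of any real iXML API."""
    tree_count: int    # number of parse trees found (0 means the parse failed)
    consumed: int      # number of input characters covered by the parse
    input_length: int  # total number of input characters


def ixml_state(outcome: ParseOutcome, allow_prefix: bool = False) -> Optional[str]:
    """Return the ixml:state value for the document element, or None if absent."""
    if outcome.tree_count == 0:
        return "failed"
    is_prefix = outcome.consumed < outcome.input_length
    if is_prefix and not allow_prefix:
        # Assumed default: the entire input must be described by the grammar.
        return "failed"
    words = []
    if outcome.tree_count > 1:
        words.append("ambiguous")
    if is_prefix:
        words.append("prefix")
    # An unambiguous parse of the whole input carries no state attribute.
    return " ".join(words) or None
```

With allow_prefix=True, an ambiguous parse that stops short of the end of 
the input yields "ambiguous prefix"; with the default setting the same 
outcome yields "failed".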

Note: these requirements mean that grammars must be processed by an algorithm 
that accepts and parses any context-free grammar; known parsing algorithms 
of this class include [Earley], [Unger], [CYK], [GLR], and [GLL]; see also 
[Grune].
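
To make the prefix options concrete, here is a minimal Earley-style 
recognizer in Python that reports every prefix length of the input that is 
described by the grammar. It is a sketch under stated assumptions, not a 
reference implementation: the grammar representation (a dict from 
nonterminal names to tuples of symbols, with single-character terminals) 
and the function name earley_prefix_lengths are invented for this example, 
and it only recognizes, it does not build parse trees.

```python
def earley_prefix_lengths(grammar, start, text):
    """Return, in increasing order, the lengths of all prefixes of text
    that are derivable from start.

    grammar maps each nonterminal to a list of alternatives; each
    alternative is a tuple of symbols. A symbol that is not a key of
    grammar is treated as a single-character terminal (an assumption of
    this sketch).
    """
    # An Earley item is (head, body, dot, origin).
    chart = [set() for _ in range(len(text) + 1)]
    chart[0] = {(start, body, 0, 0) for body in grammar[start]}
    matched = []
    for pos in range(len(text) + 1):
        changed = True
        while changed:  # iterate to a fixpoint so that empty rules are handled
            changed = False
            for head, body, dot, origin in list(chart[pos]):
                if dot < len(body) and body[dot] in grammar:
                    # Predict: expand the nonterminal after the dot.
                    new = {(body[dot], alt, 0, pos) for alt in grammar[body[dot]]}
                elif dot == len(body):
                    # Complete: advance every item that was waiting for head.
                    new = {(h, b, d + 1, o) for h, b, d, o in chart[origin]
                           if d < len(b) and b[d] == head}
                else:
                    continue  # terminal after the dot: handled by the scan below
                if not new <= chart[pos]:
                    chart[pos] |= new
                    changed = True
        # The prefix text[:pos] is described if a start rule spans 0..pos.
        if any(h == start and d == len(b) and o == 0
               for h, b, d, o in chart[pos]):
            matched.append(pos)
        if pos < len(text):
            # Scan: advance items whose next symbol is the current character.
            chart[pos + 1] = {(h, b, d + 1, o) for h, b, d, o in chart[pos]
                              if d < len(b) and b[d] not in grammar
                              and b[d] == text[pos]}
    return matched


# Hypothetical example grammar: E -> E "+" E | "1" (ambiguous, left-recursive).
expr = {"E": [("E", "+", "E"), ("1",)]}
lengths = earley_prefix_lengths(expr, "E", "1+1+1x")
whole_input_matches = len("1+1+1x") in lengths
```

In this sketch the default behavior corresponds to checking whether the 
full input length is in the result, while the "largest" and "smallest" 
prefix options correspond to taking max(lengths) and min(lengths) when the 
list is non-empty.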

Received on Monday, 14 February 2022 22:06:21 UTC