- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Mon, 14 Feb 2022 18:07:58 -0700
- To: Steven Pemberton <steven.pemberton@cwi.nl>
- Cc: public-ixml@w3.org
Steven Pemberton writes:

> In section "Parsing"
> https://invisiblexml.org/ixml-specification.html#parsing
>
> Processors must accept and parse any conforming grammar, and produce
> at least one parse of supplied input that matches the grammar starting
> at the root symbol.
>
> =========
> In section Conformance of processors
> https://invisiblexml.org/ixml-specification.html#L7076
>
> For any conforming grammar and any input:
>
> * Processors must parse by default the entire input using the grammar,
> determining in the process whether or not the input is described by
> the grammar. Processors may provide user options for other behaviors,
> such as parsing the largest, or smallest, prefix of the input that is
> described by the grammar, or supporting invocation with input streams
> of indeterminate length.
>
> * If the input is unambiguously described by the grammar, the
> resulting parse tree must be serialized to an XML document.
>
> * If the root node in the grammar is marked as an attribute,
> processors must ignore that marking whenever serializing the rule as
> the root.
>
> * If more than one parse tree describes the input, the processor must
> serialize one of them. It is not defined how this choice is made,
> but the resulting parse should be marked as ambiguous by including on
> the document element of the serialisation the attribute
> ixml:state="ambiguous". Processors may provide a user option to
> suppress that attribute; they may also provide a user option to
> produce more than one parse tree.
>
> * If the parse fails, the processor must produce some XML document
> with ixml:state="failed" on the document element, with helpful
> information about where and why it failed; it may be a partial parse
> tree that includes parts of the parse that succeeded.
>
> * If the parse succeeded, but without consuming the entirety of the
> input, processors may choose either to produce a failure document as
> described above, or to serialize the resulting parse tree with the
> attribute ixml:state="prefix", or if the parse is ambiguous
> ixml:state="ambiguous prefix".
>
> * The form in which XML documents are produced is not constrained by
> this specification; processors should be capable of producing
> serialized XML as a character stream, but other forms (e.g. DOM
> instances or XDM instances) may also be used.
>
> Note: the requirements require that grammars be processed by an
> algorithm that accepts and parses any context-free grammar; known
> parsing algorithms of this class include [Earley], [Unger], [CYK],
> [GLR], and [GLL]; see also [Grune].

In case other people, like me, have a little trouble tracking the
changes made here, I attach an HTML document showing the changes I
understand Steven to be proposing.

My quick reactions:

Mostly good; thank you.

I don't understand the new ordering of topics. In the old text, I see
an attempt to describe the core cases and then special cases and
optional features; in the new text I don't see the principle underlying
the sequence of items.

--
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Tuesday, 15 February 2022 01:08:20 UTC