- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Mon, 12 Jun 2023 18:24:20 -0600
- To: ixml <public-ixml@w3.org>
Thank you, Norm, for this draft. I have some comments.

- The sentence

    A conformant Invisible XML processor is only required to produce well-formed XML output, it is not otherwise constrained.

  makes me nervous. An ixml processor is not constrained to use any particular set of serialization options, or even to conform to the serialization spec (with whatever framing assumptions are needed to make that idea make sense), but it *is* constrained to produce XML representing a successful parse of the input, if there is such a parse. Perhaps say something like

    A conformant Invisible XML processor is required to produce well-formed XML output, but its choices in serializing the XML are not otherwise constrained.

  ?

- In the section on hints for implementers, I think the reference to "the ability to round-trip XML data", without further elaboration, is unhelpful. In the context of the serialization spec, it's clear (at least on reflection) that the round trip starts with an XDM instance and goes through serialization to XML and through XML parsing back into an XDM instance. In an ixml context, the route back to the input format is undefined. Perhaps it would be better to talk about possible information loss? And mention that sometimes information loss is what is desired?

    An application has some latitude when serializing XML. Particular attention should be paid to serializing whitespace and other control characters. It should be noted, for example, that if characters #a and #d appear in a value to be serialized as an attribute and are serialized normally, the #a and #d characters in the value will be removed by the XML parser when it performs whitespace normalization on the attribute value. The sequence #a#d will similarly be translated to #a by standard XML parsing. If the user of the grammar expects to see the original characters in the XML output, it will be necessary to encode them using numeric character references when serializing the XML output. If on the other hand the user of the grammar does *not* expect to see the original characters in the output, then carefully preserving them using numeric character references is likely to be unhelpful. See [Serialization] for detailed discussions.

- Now that I have started to worry about the applicability of the concept of round-tripping, I'm also uneasy about the sentence

    Some aspects of the serialization will impact whether or not the document can be perfectly reconstruc[t]ed by the XML parser.

  So for

    Some aspects of the serialization will impact whether or not the document can be perfectly reconstruced by the XML parser.

  perhaps read

    Some aspects of the serialization will impact whether or not all characters of the input (e.g. #a#d as a line separator, or either of those characters within attribute values) are retained after the serialized XML is parsed with a conforming XML parser.

- For `reconstruced` read `reconstructed`.

- We should discuss John Cowan's suggestion, which I understand as an attempt to take the option of preserving details of line separators in the input off the table entirely, thus rendering moot almost everything in the current draft but the references to #a in attribute values. John's option would work fine for me under normal conditions. I am not quite sure what conditions might make me want something else, so I don't know what would make sense by way of an ability to turn off whitespace normalization.

-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com
Received on Tuesday, 13 June 2023 00:44:34 UTC
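
The information loss discussed in the message above, where #a and #d in an attribute value may not survive XML parsing unless they are written as numeric character references, can be checked with a small experiment. The following is only a sketch in Python using the standard library's `xml.etree.ElementTree`; the helper names `serialize_attribute` and `value_after_parse` and the `preserve_whitespace` flag are hypothetical and are not taken from any ixml implementation or from the draft under discussion.

```python
import xml.etree.ElementTree as ET


def serialize_attribute(value: str, preserve_whitespace: bool) -> str:
    """Escape `value` for use inside a double-quoted XML attribute.

    Hypothetical helper: when preserve_whitespace is true, tab, #a and #d
    are written as numeric character references so that a conforming XML
    parser does not normalize them away.
    """
    out = value.replace("&", "&amp;").replace("<", "&lt;").replace('"', "&quot;")
    if preserve_whitespace:
        out = out.replace("\t", "&#x9;").replace("\n", "&#xA;").replace("\r", "&#xD;")
    return out


def value_after_parse(value: str, preserve_whitespace: bool) -> str:
    """Serialize `value` as an attribute, parse the result, and return
    the attribute value as the XML parser reports it."""
    doc = '<e a="' + serialize_attribute(value, preserve_whitespace) + '"/>'
    return ET.fromstring(doc).get("a")


original = "x\ny\rz"  # contains #a and #d

plain = value_after_parse(original, preserve_whitespace=False)
kept = value_after_parse(original, preserve_whitespace=True)

print(repr(plain))
print(repr(kept))
```

With CPython's expat-based parser, the first call is expected to report a space in place of each literal #a and #d, while the second reports the original characters unchanged. Whether the second behaviour is the desirable one depends, as the message argues, on what the user of the grammar expects to see.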