comments on PR 179

Thank you, Norm, for this draft.  I have some comments.

- The sentence

      A conformant Invisible XML processor is only required to produce
      well-formed XML output, it is not otherwise constrained.

  makes me nervous.

  An ixml processor is not constrained to use any particular set of
  serialization options, or even to conform to the serialization spec
  (with whatever framing assumptions are needed to make that idea make
  sense), but it *is* constrained to produce XML representing a
  successful parse of the input, if there is such a parse.

  Perhaps say something like

      A conformant Invisible XML processor is required to produce
      well-formed XML output, but its choices in serializing the XML
      are not otherwise constrained.

  ? 

- In the section on hints for implementers, I think the reference to
  "the ability to round-trip XML data", without further elaboration, is
  unhelpful.  In the context of the serialization spec, it's clear (at
  least on reflection) that the round trip starts with an XDM instance
  and goes through serialization to XML and through XML parsing back
  into an XDM instance.  In an ixml context, the route back to the input
  format is undefined.

  Perhaps it would be better to talk about possible information loss?
  And mention that sometimes information loss is what is desired?

    An application has some latitude when serializing XML. Particular
    attention should be paid to serializing whitespace and other control
    characters.  It should be noted, for example, that if characters #a
    and #d appear in a value to be serialized as an attribute and are
    serialized normally, the #a and #d characters in the value will be
    replaced by spaces when the XML parser performs attribute-value
    normalization.  The sequence #d#a will similarly be
    translated to #a by standard XML parsing.  If the user of the
    grammar expects to see the original characters in the XML output, it
    will be necessary to encode them using numeric character references
    when serializing the XML output.  If on the other hand the user of
    the grammar does *not* expect to see the original characters in the
    output, then carefully preserving them using numeric character
    references is likely to be unhelpful.  See [Serialization] for
    detailed discussions.
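
To make the attribute case concrete, here is a minimal sketch in Python,
assuming the standard library's expat-based ElementTree parser stands in
for "the XML parser".  The helper `serialize_attr` and the element and
attribute names are hypothetical and exist only for illustration; this
is not a proposal for how an ixml processor should serialize.

```python
import xml.etree.ElementTree as ET


def serialize_attr(value, use_char_refs):
    """Hypothetical helper: build <e a="..."/> around one attribute value."""
    escaped = value.replace("&", "&amp;").replace("<", "&lt;").replace('"', "&quot;")
    if use_char_refs:
        # Encode whitespace control characters so a parser keeps them.
        for ch in "\t\n\r":
            escaped = escaped.replace(ch, "&#%d;" % ord(ch))
    return '<e a="%s"/>' % escaped


original = "one\ntwo\rthree"

# Serialized "normally": the parser normalizes #a and #d in the value.
plain = ET.fromstring(serialize_attr(original, False)).get("a")

# Serialized with numeric character references: the value survives.
safe = ET.fromstring(serialize_attr(original, True)).get("a")

print(repr(plain))
print(repr(safe))
```

In this sketch `plain` no longer contains the #a and #d characters,
while `safe` compares equal to `original`.  Which of the two is the
"right" output depends on what the grammar's user expects.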
  
- Now that I have started to worry about the applicability of the
  concept of round-tripping, I'm also uneasy about the sentence

      Some aspects of the serialization will impact whether or not the
      document can be perfectly reconstruc[t]ed by the XML parser.

  So for 

      Some aspects of the serialization will impact whether or not the
      document can be perfectly reconstruced by the XML parser.

  perhaps read 

      Some aspects of the serialization will impact whether or not all
      characters of the input (e.g. #d#a as a line separator, or either
      of those characters within attribute values) are retained after
      the serialized XML is parsed with a conforming XML parser.
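
The line-separator case in element content can be sketched the same
way.  This again assumes Python's standard ElementTree parser as the
conforming XML parser; the element name `e` and the sample text are
made up for the example.

```python
import xml.etree.ElementTree as ET

# Input text using carriage return + line feed, and a lone carriage return.
text = "line1\r\nline2\rline3"

# Written out literally, the line ends are normalized by the parser.
literal = ET.fromstring("<e>" + text + "</e>").text

# With each #d written as a numeric character reference, it is retained.
with_refs = ET.fromstring("<e>" + text.replace("\r", "&#13;") + "</e>").text

print(repr(literal))
print(repr(with_refs))
```

Here `literal` comes back with every line end reduced to a single #a,
whereas `with_refs` compares equal to the original `text`.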

- For `reconstruced` read `reconstructed`.

- We should discuss John Cowan's suggestion, which I understand as an
  attempt to take the option of preserving details of line separators in
  the input off the table entirely, thus rendering moot almost
  everything in the current draft but the references to #a in attribute
  values.

  John's option would work fine for me under normal conditions.  I am
  not quite sure what conditions might make me want something else, so I
  don't know what would make sense by way of an ability to turn off
  whitespace normalization.
  
  
-- 
C. M. Sperberg-McQueen
Black Mesa Technologies LLC
http://blackmesatech.com

Received on Tuesday, 13 June 2023 00:44:34 UTC