- From: <bugzilla@jessica.w3.org>
- Date: Mon, 13 Apr 2015 09:54:08 +0000
- To: public-qt-comments@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=28479

            Bug ID: 28479
           Summary: [Ser 3.1] Character Maps
           Product: XPath / XQuery / XSLT
           Version: Last Call drafts
          Hardware: PC
                OS: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Serialization 3.1
          Assignee: cmsmcq@blackmesatech.com
          Reporter: mike@saxonica.com
        QA Contact: public-qt-comments@w3.org

Some changes have occurred in the 3.0 and 3.1 specs regarding character maps, whose implications do not appear to have been fully thought through.

In 3.0, item-separator was added. There's no clear indication as to whether character-map substitution is applied before or after insertion of item separators. The closest we get is:

    Character mapping is applied to the characters that actually appear in a text or attribute node in the instance of the data model, before any other serialization operations such as escaping or Unicode Normalization are applied.

which I propose we change to:

    Character mapping is applied to the characters that actually appear in a text or attribute node in the instance of the data model, before any other serialization operations such as sequence normalization, escaping, or Unicode normalization are applied.

In 3.1, character mapping is applied to strings, as well as to text and attribute nodes. This change was presumably intended primarily for JSON, though it's not clear quite what the expected use case is.

We say "If a character is mapped, then it is not subjected to XML or HTML escaping, nor to Unicode Normalization." I would think that for character maps to be useful with JSON, this should say "XML or HTML or JSON escaping" (indeed, the rest of the paragraph could be interpreted as implying this).

(We actually say thrice that it is not subjected to XML or HTML escaping. Presumably this is on the theory that "what I say three times is true").

I'm slightly worried that the extension of character maps to apply to strings causes a backwards incompatibility for the XML and HTML output methods. In XSLT 2.0 it wasn't possible for the XSLT processor to send a string to the serializer: only XML result trees were sent, which means any string would be turned into a text node which would be subject to character mapping. But XQuery 3.0 could certainly send a string to an XML or HTML output method, and if we accept that character mapping was supposed to happen before sequence normalization, then character maps would not be applied to the string.

Finally, for the JSON case, I think it's not quite precise enough to say that character mapping applies to "strings". We treat miscellaneous data types such as dates, times, anyURIs and untypedAtomics by conversion to strings: does it apply to these? Does it apply to the keys in maps as well as the values?

I would also point out that the idea of applying character maps early in the serialization pipeline, and then treating mapped characters differently from unmapped characters in later stages of the pipeline, is very messy from an implementation viewpoint. We're saddled with this for XML and HTML serialization, but do we really want to do this for JSON?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
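
To make the ordering question about item-separator concrete, here is a minimal Python sketch. It is not the algorithm from the Serialization spec: the function names (apply_character_map, map_then_join, join_then_map) and the sample data are hypothetical, and sequence normalization is reduced to a plain string join. It only shows that the two orderings can give different output when the separator itself contains a mapped character.

```python
# Hypothetical illustration only: not the spec's serialization algorithm.
# Sequence normalization is reduced here to joining strings with a separator.

def apply_character_map(text: str, char_map: dict[str, str]) -> str:
    """Replace each mapped character with its replacement string."""
    return "".join(char_map.get(ch, ch) for ch in text)


def map_then_join(items: list[str], separator: str, char_map: dict[str, str]) -> str:
    """Character mapping applied to each item before the separator is inserted."""
    return separator.join(apply_character_map(item, char_map) for item in items)


def join_then_map(items: list[str], separator: str, char_map: dict[str, str]) -> str:
    """Separator inserted first, then character mapping over the whole result."""
    return apply_character_map(separator.join(items), char_map)


if __name__ == "__main__":
    items = ["a|b", "c"]          # assumed sample sequence of strings
    char_map = {"|": "[BAR]"}     # assumed character map
    separator = "|"               # assumed item-separator value

    print(map_then_join(items, separator, char_map))  # a[BAR]b|c
    print(join_then_map(items, separator, char_map))  # a[BAR]b[BAR]c
```

Under the wording proposed above (character mapping before sequence normalization), the first result is the one a processor would produce: the separator is never character-mapped.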
Received on Monday, 13 April 2015 09:54:10 UTC