- From: Richard Tobin <richard@inf.ed.ac.uk>
- Date: Tue, 13 May 2008 14:02:52 +0100 (BST)
- To: public-xml-processing-model-wg@w3.org
> [NEW] ACTION: Richard to attempt to clarify the prose of the
> unescape-markup with respect to the XML Declaration, document types, XML
> version, etc. [recorded in
> http://www.w3.org/2008/05/08-xproc-minutes.html#action03[14]]

I looked at the existing description of p:unescape-markup and was
surprised to see that it says:

  When the string value is parsed, the original document element is
  preserved so that the result will be well-formed XML even if the
  content consists of multiple, sibling elements.

That is, the text is parsed as an external entity rather than an XML
document. This implies that it can't have a DOCTYPE - we don't want
to invent a new kind of document that's effectively an XML document
with multiple top-level elements allowed.

Why do we allow this? Is it just because p:escape-markup can produce
such things (because it serializes the children of the document
element)? Do we really want it?

If p:escape-markup produces and p:unescape-markup consumes external
entities rather than XML documents, this raises various issues about
the xml declaration. For a non-document entity, it's actually a text
declaration, so "standalone" is not allowed and "encoding" is required
(even though the encoding is irrelevant, since we're dealing with
characters not bytes). Do we say it's up to the user to ensure that
the serialization options produce a legal serialised result?

-- Richard

--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
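
As an illustration of the distinction discussed in this message, here is a minimal Python sketch contrasting "parse as an XML document" with "parse as the content of a preserved element". It is not how any XProc implementation works: the helper names and the `wrapper` element are hypothetical, and wrapping the text in an element only approximates external-entity parsing (in particular, a leading text declaration is rejected here, whereas a real external parsed entity may begin with one).

```python
import xml.etree.ElementTree as ET


def parse_as_document(text):
    """Parse text as a complete XML document (exactly one top-level element)."""
    return ET.fromstring(text)


def parse_as_content(text, wrapper="wrapper"):
    """Parse text as element content by placing it inside a wrapper element.

    The wrapper stands in for the preserved document element; its name is
    an assumption made for this sketch.
    """
    return ET.fromstring(f"<{wrapper}>{text}</{wrapper}>")


def is_well_formed(parse, text):
    """Return True if the given parse function accepts the text."""
    try:
        parse(text)
        return True
    except ET.ParseError:
        return False


siblings = "<a/><b/>"
with_doctype = "<!DOCTYPE a><a/>"

# Multiple sibling elements: rejected as a document, accepted as content.
print(is_well_formed(parse_as_document, siblings))
print(is_well_formed(parse_as_content, siblings))

# A DOCTYPE: accepted at the start of a document, rejected inside content.
print(is_well_formed(parse_as_document, with_doctype))
print(is_well_formed(parse_as_content, with_doctype))
```

The two inputs behave in opposite ways under the two parsing modes, which is the tension raised above: text with several top-level elements is only acceptable as content, while text carrying a DOCTYPE is only acceptable as a document.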
Received on Tuesday, 13 May 2008 13:03:37 UTC