[Bug 11045] New: U+0020 does not need to be escaped in srcdoc in XML


           Summary: U+0020 does not need to be escaped in srcdoc in XML
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: simonp@opera.com
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,

The new text for the Working Group Decision on ISSUE-103 srcdoc-xml-escaping
says that U+0020 needs to be escaped.

<p class="note">Due to restrictions of <span>the XML syntax</span>,
- in XML a number of other characters need to be escaped also to
- ensure correctness.</p>
+ in XML the U+003C LESS-THAN SIGN character (&lt;) needs to be
+ escaped as well. In order to prevent <a
+ href="http://www.w3.org/TR/REC-xml/#AVNormalize">attribute-value
+ normalization</a>, XML's whitespace characters &mdash; U+0009
+ CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000D CARRIAGE
+ RETURN (CR) and U+0020 SPACE &mdash; also need to be escaped. <a
+ href="#refsXML">[XML]</a></p>

My reading of the XML spec suggests space does not need to be escaped.


"For a white space character (#x20, #xD, #xA, #x9), append a space
character (#x20) to the normalized value."

i.e. a literal space and an escaped space (&#x20;) both normalize to the
same value, a single #x20.
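
The point can be checked with any conforming XML parser. The sketch below is
only an illustration, assuming Python's standard xml.etree.ElementTree (which
uses expat); the helper name srcdoc_value and the sample attribute values are
made up for this example and are not taken from the spec.

```python
import xml.etree.ElementTree as ET


def srcdoc_value(attr_source: str) -> str:
    """Parse a one-element XML document and return the srcdoc value as seen
    by the application, i.e. after attribute-value normalization.
    (Hypothetical helper, for illustration only.)"""
    doc = '<iframe srcdoc="' + attr_source + '"/>'
    return ET.fromstring(doc).get("srcdoc")


# No DTD declares srcdoc here, so it is treated as CDATA.
literal_space = srcdoc_value("a  b")            # two literal spaces
escaped_space = srcdoc_value("a&#x20;&#x20;b")  # two escaped spaces

# Tab, by contrast, does need escaping to survive normalization.
literal_tab = srcdoc_value("a\tb")     # literal tab becomes a space
escaped_tab = srcdoc_value("a&#x9;b")  # character reference is kept as a tab

print(repr(literal_space), repr(escaped_space))
print(repr(literal_tab), repr(escaped_tab))
```

Under these assumptions the first two values compare equal, and the run of
two spaces is preserved because the attribute is CDATA, while only the
escaped tab survives as a tab.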

The paragraph "If the attribute type is not CDATA, then the XML
processor MUST further process the normalized attribute value by
discarding any leading and trailing space (#x20) characters, and by
replacing sequences of space (#x20) characters by a single space (#x20)
character." does not apply since srcdoc is a CDATA attribute (and none of the
allowed doctypes change that for srcdoc even if the UA uses a validating
parser).


Received on Thursday, 14 October 2010 10:33:01 UTC