
[Bug 11045] New: U+0020 does not need to be escaped in srcdoc in XML

From: <bugzilla@jessica.w3.org>
Date: Thu, 14 Oct 2010 10:32:59 +0000
To: public-html@w3.org
Message-ID: <bug-11045-2495@http.www.w3.org/Bugs/Public/>
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11045

           Summary: U+0020 does not need to be escaped in srcdoc in XML
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: simonp@opera.com
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org


The new text for the Working Group Decision on ISSUE-103 srcdoc-xml-escaping
says that U+0020 needs to be escaped.

<p class="note">Due to restrictions of <span>the XML syntax</span>,
- in XML a number of other characters need to be escaped also to
- ensure correctness.</p>
+ in XML the U+003C LESS-THAN SIGN character (&lt;) needs to be
+ escaped as well. In order to prevent <a
+ href="http://www.w3.org/TR/REC-xml/#AVNormalize">attribute-value
+ normalization</a>, XML's whitespace characters &mdash; U+0009
+ CHARACTER TABULATION (HT), U+000A LINE FEED (LF), U+000D CARRIAGE
+ RETURN (CR) and U+0020 SPACE &mdash; also need to be escaped. <a
+ href="#refsXML">[XML]</a></p>

My reading of the XML spec suggests space does not need to be escaped.

http://www.w3.org/TR/REC-xml/#AVNormalize

"For a white space character (#x20, #xD, #xA, #x9), append a space
character (#x20) to the normalized value."

i.e. a literal space and an escaped space (&#x20;) result in the same
normalized value.
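
This can be checked with any XML parser that implements attribute-value
normalization. The following is a minimal sketch, assuming CPython's
xml.etree.ElementTree (backed by the non-validating expat parser); the
helper name srcdoc_value and the sample strings are made up for
illustration and are not part of any spec.

```python
import xml.etree.ElementTree as ET


def srcdoc_value(attr_source: str) -> str:
    """Parse an XHTML iframe whose srcdoc attribute contains attr_source
    (raw XML source text) and return the normalized attribute value.
    Hypothetical helper, for illustration only."""
    doc = (
        '<iframe xmlns="http://www.w3.org/1999/xhtml" '
        f'srcdoc="{attr_source}"/>'
    )
    return ET.fromstring(doc).get("srcdoc")


# U+0020: the literal character and the character reference are equivalent.
literal_space = srcdoc_value("a b")
escaped_space = srcdoc_value("a&#x20;b")

# U+0009 and U+000A: the literal character is normalized to a space,
# while the character reference keeps the original character.
literal_tab = srcdoc_value("a\tb")
escaped_tab = srcdoc_value("a&#x9;b")
literal_lf = srcdoc_value("a\nb")
escaped_lf = srcdoc_value("a&#xA;b")

print(repr(literal_space), repr(escaped_space))
print(repr(literal_tab), repr(escaped_tab))
print(repr(literal_lf), repr(escaped_lf))
```

Under that assumption, only tab, line feed and carriage return differ
between their literal and escaped forms; space comes out the same either
way.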

The paragraph

"If the attribute type is not CDATA, then the XML
processor MUST further process the normalized attribute value by
discarding any leading and trailing space (#x20) characters, and by
replacing sequences of space (#x20) characters by a single space (#x20)
character."

does not apply, since srcdoc is a CDATA attribute (and none of the
allowed doctypes changes that for srcdoc, even if the UA uses a validating
processor).
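
The difference between CDATA and non-CDATA attributes can be seen with a
small experiment. This is only a sketch, again assuming CPython's
xml.etree.ElementTree on top of expat, which reads ATTLIST declarations
from the internal DTD subset; the element name r, the attribute name a and
the NMTOKENS declaration are invented for the example and have nothing to
do with the doctypes allowed for HTML.

```python
import xml.etree.ElementTree as ET

# No declaration for the attribute: it is treated as CDATA, so leading,
# trailing and repeated spaces are kept.
CDATA_DOC = '<r a="  x   y  "/>'

# Same attribute value, but the internal subset declares the attribute as
# NMTOKENS (a non-CDATA type), so the further normalization step applies.
TOKENIZED_DOC = (
    "<!DOCTYPE r [<!ATTLIST r a NMTOKENS #IMPLIED>]>"
    '<r a="  x   y  "/>'
)

cdata_value = ET.fromstring(CDATA_DOC).get("a")
tokenized_value = ET.fromstring(TOKENIZED_DOC).get("a")

print(repr(cdata_value))
print(repr(tokenized_value))
```

Since srcdoc is CDATA, it behaves like the first case: runs of literal
spaces survive unchanged without any escaping.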

Received on Thursday, 14 October 2010 10:33:01 UTC
