- From: Maciej Stachowiak <mjs@apple.com>
- Date: Tue, 23 Mar 2010 20:43:09 -0700
- To: Julian Reschke <julian.reschke@gmx.de>
- Cc: Philip Taylor <pjt47@cam.ac.uk>, Anne van Kesteren <annevk@opera.com>, public-html@w3.org
On Mar 18, 2010, at 5:09 AM, Julian Reschke wrote:

> On 18.03.2010 11:47, Philip Taylor wrote:
>> Anne van Kesteren wrote:
>>> On Thu, 18 Mar 2010 11:26:48 +0100, Julian Reschke
>>> <julian.reschke@gmx.de> wrote:
>>>> Replace the last sentence by:
>>>>
>>>> "Note: Due to restrictions of the XML syntax, in XML the U+003C
>>>> LESS-THAN SIGN (<) needs be escaped as well."
>>>
>>> That seems incomplete. The sequence ]]> comes to mind.
>>
>> That's not an issue in attribute values, as far as I'm aware.
>>
>> But in attribute values, U+000D and U+000A and U+0009 must be escaped
>> too. (Depending on DTD you might also need to escape any leading or
>> trailing U+0020 and at least one of any adjacent pair of U+0020s, I
>> think.)
>
> Ah, good catch. Updated proposal below.

Thanks for the Change Proposal. Recorded:

http://dev.w3.org/html5/status/issue-status.html#ISSUE-0103

Regards,
Maciej

>
> BR, Julian
>
> -- snip --
>
> SUMMARY
>
> Specification is needlessly vague about XML escaping requirements
> when discussing iframe/@srcdoc.
>
> RATIONALE
>
> Spec should properly balance considerations for text/html and
> application/xhtml+xml. If the requirements are spelled out for the
> former the same should be done for the latter.
>
> DETAILS
>
> Spec currently says:
>
> "Note: In the HTML syntax, authors need only remember to use U+0022
> QUOTATION MARK characters (") to wrap the attribute contents and
> then to escape all U+0022 QUOTATION MARK (") and U+0026 AMPERSAND
> (&) characters, and to specify the sandbox attribute, to ensure
> safe embedding of content.
>
> Note: Due to restrictions of the XML syntax, in XML a number of
> other characters need to be escaped also to ensure correctness."
>
> Replace the last sentence by:
>
> "Note: Due to restrictions of the XML syntax, in XML the U+003C
> LESS-THAN SIGN (<) needs be escaped as well. Also, XML's whitespace
> characters -- U+0009 CHARACTER TABULATION (HT), U+000A LINE FEED
> (LF), U+000D CARRIAGE RETURN (CR) and U+0020 SPACE -- need to be
> escaped in order to prevent attribute-value normalization ([XML],
> Section 3.3.3)."
>
> IMPACT
>
> 1. Positive Effects
>
> More clarity about the XML syntax; equal treatment of both formats.
>
> 2. Negative Effects
>
> Repeats information that already is defined somewhere else, but this
> applies to the paragraph about HTML as well.
>
> 3. Conformance Classes Changes
>
> None.
>
> 4. Risks
>
> The statement might not be totally accurate, in which case we can
> use the regular review and bug fixing process to get it right. That
> being said I believe it is accurate, as it's not about encoding
> characters in XML in general, but just about *additional*
> requirements for attribute values.
>
> REFERENCES
>
> None.
>
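
For illustration only, the escaping discussed in the proposal could be sketched roughly as below. This is a sketch under assumptions, not text from the spec or the Change Proposal: the function names `escape_srcdoc_for_xml` and `round_trips` are hypothetical, and U+0020 is escaped only on request, on the assumption (following Philip Taylor's remark above) that spaces only matter when a DTD declares the attribute as a non-CDATA type.

```python
import xml.etree.ElementTree as ET


def escape_srcdoc_for_xml(markup: str, escape_space: bool = False) -> str:
    """Escape text for use inside a double-quoted XML attribute value.

    Hypothetical helper, not defined by any specification.
    """
    replacements = {
        "&": "&amp;",    # also required in the HTML syntax
        '"': "&quot;",   # also required in the HTML syntax
        "<": "&lt;",     # additionally required in XML
        "\t": "&#x9;",   # a literal HT would be normalized to a space
        "\n": "&#xA;",   # a literal LF would be normalized to a space
        "\r": "&#xD;",   # a literal CR would be normalized to a space
    }
    if escape_space:
        # Assumption: only needed for attributes of non-CDATA type.
        replacements[" "] = "&#x20;"
    # Single pass per character, so "&" is never escaped twice.
    return "".join(replacements.get(ch, ch) for ch in markup)


def round_trips(markup: str) -> bool:
    """Check that an XML parser returns the original attribute value."""
    doc = '<iframe srcdoc="' + escape_srcdoc_for_xml(markup) + '"/>'
    return ET.fromstring(doc).get("srcdoc") == markup
```

Calling `round_trips('<p>a & "b"\tc\r\nd</p>')` is one way to check that the tab, carriage return and line feed survive parsing; written literally in the attribute, they would reach the application as spaces.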
Received on Wednesday, 24 March 2010 03:43:44 UTC