- From: poot <cvsmail@w3.org>
- Date: Mon, 5 Apr 2010 13:37:12 +0900 (JST)
- To: public-html-diffs@w3.org
hixie: Be more compatible with what browsers do with multibyte
characters in submissions. (whatwg r4970)
http://dev.w3.org/cvsweb/html5/spec/Overview.html?r1=1.3992&r2=1.3993&f=h
http://html5.org/tools/web-apps-tracker?from=4969&to=4970
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.3992
retrieving revision 1.3993
diff -u -d -r1.3992 -r1.3993
--- Overview.html 4 Apr 2010 22:43:21 -0000 1.3992
+++ Overview.html 5 Apr 2010 04:36:56 -0000 1.3993
@@ -285,7 +285,7 @@
<h1>HTML5</h1>
<h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
- <h2 class="no-num no-toc" id="editor-s-draft-4-april-2010">Editor's Draft 4 April 2010</h2>
+ <h2 class="no-num no-toc" id="editor-s-draft-5-april-2010">Editor's Draft 5 April 2010</h2>
<dl><dt>Latest Published Version:</dt>
<dd><a href="http://www.w3.org/TR/html5/">http://www.w3.org/TR/html5/</a></dd>
<dt>Latest Editor's Draft:</dt>
@@ -392,7 +392,7 @@
specification's progress along the W3C Recommendation
track.
- This specification is the 4 April 2010 Editor's Draft.
+ This specification is the 5 April 2010 Editor's Draft.
</p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>The contents of this specification are also part of <a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/">a
specification</a> published by the <a href="http://www.whatwg.org/">WHATWG</a>, which is available under a
license that permits reuse of the specification text.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- required patent boilerplate --><p>This document was produced by a group operating under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5
@@ -34362,24 +34362,56 @@
<li>
<p>For each character in the entry's name and value, apply the
- following subsubsteps:</p>
+ appropriate subsubsteps from the following list:</p>
- <ol><!-- * - . _ 0-9 a-z A-Z --><li><p>If the character isn't in the range U+0020, U+002A,
+ <dl class="switch"><dt>The character is a U+0020 SPACE character</dt>
+
+ <dd>Replace the character with a single U+002B PLUS SIGN
+ character (+).</dd>
+
+
+ <!-- * - . _ 0-9 a-z A-Z -->
+
+ <dt>If the character isn't in the range U+0020, U+002A,
U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F,
- U+0061 to U+007A then replace the character with a string
- formed as follows: Start with the empty string, and then,
- taking each byte of the character when expressed in the
- selected character encoding in turn, append to the string a
- U+0025 PERCENT SIGN character (%) followed by two characters in
- the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and
- U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F
- representing the hexadecimal value of the byte (zero-padded if
- necessary).</li>
+ U+0061 to U+007A</dt>
- <li><p>If the character is a U+0020 SPACE character, replace it
- with a single U+002B PLUS SIGN character (+).</li>
+ <dd>
- </ol></li>
+ <p>Replace the character with a string formed as follows:</p>
+
+ <ol><li><p>Let <var title="">s</var> be an empty string.</li>
+
+ <li>
+
+ <p>For each byte <var title="">b</var> of the character when
+ expressed in the selected character encoding in turn, run
+ the appropriate subsubsubstep from the list below:</p>
+
+ <dl class="switch"><dt>If the byte is in the range 0x20, 0x2A, 0x2D, 0x2E,
+ 0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A</dt>
+
+ <dd><p>Append to <var title="">s</var> the Unicode
+ character with the codepoint equal to the byte.</dd>
+
+ <dt>Otherwise</dt>
+
+ <dd><p>Append to the string a U+0025 PERCENT SIGN character
+ (%) followed by two characters in the ranges U+0030 DIGIT
+ ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
+ LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
+ hexadecimal value of the byte (zero-padded if
+ necessary).</dd>
+
+ </dl></li>
+
+ </ol></dd>
+
+ <dt>Otherwise</dt>
+
+ <dd><p>Leave the character as is.</dd>
+
+ </dl></li>
<li><p>If the entry's name is "<code title="">isindex</code>",
its type is "<code title="">text</code>", and this is the first
Received on Monday, 5 April 2010 04:37:40 UTC
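
The per-character encoding steps added in revision 1.3993 can be sketched roughly as below. This is only an illustration of the text in the hunk above, not code from the specification or from any browser. The names SAFE and encode_component are hypothetical, the default of UTF-8 for the "selected character encoding" is an assumption, and the handling of characters that the encoding cannot represent is not covered by this hunk (here Python simply raises UnicodeEncodeError).

```python
# Values listed in the diff: space, * - . _ 0-9 A-Z a-z.
SAFE = frozenset(
    [0x20, 0x2A, 0x2D, 0x2E, 0x5F]
    + list(range(0x30, 0x3A))   # 0-9
    + list(range(0x41, 0x5B))   # A-Z
    + list(range(0x61, 0x7B))   # a-z
)


def encode_component(text: str, encoding: str = "utf-8") -> str:
    """Encode one entry name or value following the steps in the hunk.

    `encoding` stands in for the "selected character encoding";
    defaulting it to UTF-8 is an assumption made for this sketch.
    """
    out = []
    for ch in text:
        if ch == " ":
            # U+0020 SPACE becomes U+002B PLUS SIGN.
            out.append("+")
        elif ord(ch) not in SAFE:
            # Build the replacement string s byte by byte.
            s = []
            for b in ch.encode(encoding):
                if b in SAFE:
                    # A byte in the listed range is appended as the
                    # character with that code point.
                    s.append(chr(b))
                else:
                    # Any other byte is percent-encoded with two
                    # uppercase, zero-padded hex digits.
                    s.append("%%%02X" % b)
            out.append("".join(s))
        else:
            # Leave the character as is.
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    print(encode_component("a b"))                           # a+b
    print(encode_component("\u00e9"))                        # %C3%A9
    print(encode_component("\u30a2", encoding="shift_jis"))  # %83A
```

The last call shows the multibyte case the commit message refers to: U+30A2 is the two bytes 0x83 0x41 in Shift_JIS, and under the new text the trailing 0x41 is emitted as "A", where the old text would have produced "%83%41".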