- From: Ian Hickson via cvs-syncmail <cvsmail@w3.org>
- Date: Mon, 05 Apr 2010 04:37:00 +0000
- To: public-html-commits@w3.org
Update of /sources/public/html5/spec
In directory hutz:/tmp/cvs-serv17531

Modified Files:
Overview.html
Log Message:
Be more compatible with what browsers do with multibyte characters in submissions. (whatwg r4970)

Index: Overview.html
===================================================================
RCS file: /sources/public/html5/spec/Overview.html,v
retrieving revision 1.3992
retrieving revision 1.3993
diff -u -d -r1.3992 -r1.3993
--- Overview.html 4 Apr 2010 22:43:21 -0000 1.3992
+++ Overview.html 5 Apr 2010 04:36:56 -0000 1.3993
@@ -285,7 +285,7 @@
 <h1>HTML5</h1>
 <h2 class="no-num no-toc" id="a-vocabulary-and-associated-apis-for-html-and-xhtml">A vocabulary and associated APIs for HTML and XHTML</h2>
- <h2 class="no-num no-toc" id="editor-s-draft-4-april-2010">Editor's Draft 4 April 2010</h2>
+ <h2 class="no-num no-toc" id="editor-s-draft-5-april-2010">Editor's Draft 5 April 2010</h2>
 <dl><dt>Latest Published Version:</dt>
 <dd><a href="http://www.w3.org/TR/html5/">http://www.w3.org/TR/html5/</a></dd>
 <dt>Latest Editor's Draft:</dt>
@@ -392,7 +392,7 @@
 specification's progress along the W3C Recommendation track.
- This specification is the 4 April 2010 Editor's Draft.
+ This specification is the 5 April 2010 Editor's Draft.
 </p><!-- UNDER NO CIRCUMSTANCES IS THE PRECEDING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- relationship to other work (required) --><p>The contents of this specification are also part of <a href="http://www.whatwg.org/specs/web-apps/current-work/multipage/">a specification</a> published by the <a href="http://www.whatwg.org/">WHATWG</a>, which is available under a license that permits reuse of the specification text.</p><!-- UNDER NO CIRCUMSTANCES IS THE FOLLOWING PARAGRAPH TO BE REMOVED OR EDITED WITHOUT TALKING TO IAN FIRST --><!-- required patent boilerplate --><p>This document was produced by a group operating under the <a href="http://www.w3.org/Consortium/Patent-Policy-20040205/">5
@@ -34362,24 +34362,56 @@
 <li>
 <p>For each character in the entry's name and value, apply the
- following subsubsteps:</p>
+ appropriate subsubsteps from the following list:</p>
- <ol><!-- * - . _ 0-9 a-z A-Z --><li><p>If the character isn't in the range U+0020, U+002A,
+ <dl class="switch"><dt>The character is a U+0020 SPACE character</dt>
+
+ <dd>Replace the character with a single U+002B PLUS SIGN
+ character (+).</dd>
+
+
+ <!-- * - . _ 0-9 a-z A-Z -->
+
+ <dt>If the character isn't in the range U+0020, U+002A,
 U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F,
- U+0061 to U+007A then replace the character with a string
- formed as follows: Start with the empty string, and then,
- taking each byte of the character when expressed in the
- selected character encoding in turn, append to the string a
- U+0025 PERCENT SIGN character (%) followed by two characters in
- the ranges U+0030 DIGIT ZERO (0) to U+0039 DIGIT NINE (9) and
- U+0041 LATIN CAPITAL LETTER A to U+0046 LATIN CAPITAL LETTER F
- representing the hexadecimal value of the byte (zero-padded if
- necessary).</li>
+ U+0061 to U+007A</dt>
- <li><p>If the character is a U+0020 SPACE character, replace it
- with a single U+002B PLUS SIGN character (+).</li>
+ <dd>
- </ol></li>
+ <p>Replace the character with a string formed as follows:</p>
+
+ <ol><li><p>Let <var title="">s</var> be an empty string.</li>
+
+ <li>
+
+ <p>For each byte <var title="">b</var> of the character when
+ expressed in the selected character encoding in turn, run
+ the appropriate subsubsubstep from the list below:</p>
+
+ <dl class="switch"><dt>If the byte is in the range 0x20, 0x2A, 0x2D, 0x2E,
+ 0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A</dt>
+
+ <dd><p>Append to <var title="">s</var> the Unicode
+ character with the codepoint equal to the byte.</dd>
+
+ <dt>Otherwise</dt>
+
+ <dd><p>Append to the string a U+0025 PERCENT SIGN character
+ (%) followed by two characters in the ranges U+0030 DIGIT
+ ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
+ LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
+ hexadecimal value of the byte (zero-padded if
+ necessary).</dd>
+
+ </dl></li>
+
+ </ol></dd>
+
+ <dt>Otherwise</dt>
+
+ <dd><p>Leave the character as is.</dd>
+
+ </dl></li>
 <li><p>If the entry's name is "<code title="">isindex</code>", its type is "<code title="">text</code>", and this is the first
Received on Monday, 5 April 2010 04:37:02 UTC
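
For readers following the last hunk, the revised per-character steps can be sketched roughly as below. This is an illustrative reading of the added text only, not the spec's or any browser's implementation. The names SAFE and encode_component are hypothetical, and the sketch assumes every character can be represented in the selected encoding (characters that cannot be encoded are outside this hunk and are not handled here).

```python
# Hypothetical sketch of the revised subsubsteps. SAFE and encode_component
# are illustrative names that do not appear in the spec.

# The set listed in the diff: U+0020, U+002A, U+002D, U+002E,
# U+0030 to U+0039, U+0041 to U+005A, U+005F, U+0061 to U+007A.
SAFE = frozenset(
    [0x20, 0x2A, 0x2D, 0x2E, 0x5F]
    + list(range(0x30, 0x3A))
    + list(range(0x41, 0x5B))
    + list(range(0x61, 0x7B))
)


def encode_component(text: str, encoding: str = "utf-8") -> str:
    """Apply the per-character steps to an entry's name or value."""
    out = []
    for ch in text:
        if ch == " ":
            # The character is a U+0020 SPACE: replace it with "+".
            out.append("+")
        elif ord(ch) not in SAFE:
            # Build the replacement string s byte by byte.
            s = []
            # Assumption: ch is encodable; otherwise this raises
            # UnicodeEncodeError, which this sketch does not handle.
            for b in ch.encode(encoding):
                if b in SAFE:
                    # New in this revision: a byte in the listed set is
                    # emitted as the character with that code point.
                    s.append(chr(b))
                else:
                    # Otherwise: "%" plus two uppercase hex digits.
                    s.append("%{:02X}".format(b))
            out.append("".join(s))
        else:
            # Otherwise: leave the character as is.
            out.append(ch)
    return "".join(out)


print(encode_component("a b&c"))           # a+b%26c
print(encode_component("ア", "shift_jis"))  # %83A
```

The second call shows the behavioural change in the log message: in Shift_JIS, U+30A2 encodes to the bytes 0x83 0x41, and the trailing 0x41 is now emitted as "A" rather than "%41", which is what the previous wording would have produced.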