[Bug 12605] section 4.10.22.5, step 4: first sub-step deals with U+0020 space, so U+0020 and 0x20 can be removed from the subsequent points from bugzilla@jessica.w3.org on 2011-05-05 (public-html-bugzilla@w3.org from May 2011)

From: <bugzilla@jessica.w3.org>
Date: Thu, 05 May 2011 12:18:35 +0000
To: public-html-bugzilla@w3.org
Message-Id: <E1QHxVr-0004Vu-SJ@jessica.w3.org>

http://www.w3.org/Bugs/Public/show_bug.cgi?id=12605

Hallvord R. M. Steen <hallvord@opera.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hallvord@opera.com

--- Comment #1 from Hallvord R. M. Steen <hallvord@opera.com> 2011-05-05 12:18:35 UTC ---
I think a part of the algorithm in step 4.4 is superfluous: there is one step
saying "if the character isn't in the range", and inside that if-block there is
another step saying "if the byte IS in the range", giving the exact same
range of character codes.

Wouldn't this:

       <!-- * - . _ 0-9 a-z A-Z -->

       <dt>If the character isn't in the range U+0020, U+002A,
       U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F,
       U+0061 to U+007A</dt>

       <dd>

        <p>Replace the character with a string formed as follows:</p>

        <ol><li><p>Let <var title="">s</var> be an empty string.</li>

         <li>

          <p>For each byte <var title="">b</var> of the character when
          expressed in the selected character encoding in turn, run
          the appropriate subsubsubstep from the list below:</p>

          <dl class="switch"><dt>If the byte is in the range 0x20, 0x2A, 0x2D,
           0x2E, 0x30 to 0x39, 0x41 to 0x5A, 0x5F, 0x61 to 0x7A</dt>

           <dd><p>Append to <var title="">s</var> the Unicode
           character with the code point equal to the byte.</dd>

           <dt>Otherwise</dt>

           <dd><p>Append to the string a U+0025 PERCENT SIGN character
           (%) followed by two characters in the ranges U+0030 DIGIT
           ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
           LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
           hexadecimal value of the byte (zero-padded if
           necessary).</dd>

          </dl></li>

        </ol></dd>

       <dt>Otherwise</dt>

       <dd><p>Leave the character as is.</dd>

      </dl></li>

be better written as:


       <!-- * - . _ 0-9 a-z A-Z -->

       <dt>If the character is in the range U+002A,
       U+002D, U+002E, U+0030 to U+0039, U+0041 to U+005A, U+005F,
       U+0061 to U+007A</dt>

       <dd><p>Leave the character as is.</dd>

       <dt>Otherwise</dt>

       <dd>

        <p>Replace the character with a string formed as follows:</p>

        <ol><li><p>Let <var title="">s</var> be an empty string.</li>

         <li>

          <p>For each byte <var title="">b</var> of the character when
          expressed in the selected character encoding in turn,  append 
          to the string a U+0025 PERCENT SIGN character
           (%) followed by two characters in the ranges U+0030 DIGIT
           ZERO (0) to U+0039 DIGIT NINE (9) and U+0041 LATIN CAPITAL
           LETTER A to U+0046 LATIN CAPITAL LETTER F representing the
           hexadecimal value of the byte (zero-padded if
           necessary).
          </li>

        </ol></dd>



      </dl></li>
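
For illustration only, here is a minimal Python sketch (not spec text) of the
two wordings side by side. It assumes the selected character encoding is UTF-8
and that U+0020 has already been handled by the first sub-step, so no spaces
reach this code. The names SAFE, percent_byte, encode_current and
encode_proposed are hypothetical, made up for this example.

```python
# Code points left untouched by the proposed wording: * - . _ 0-9 A-Z a-z
SAFE = (
    {0x2A, 0x2D, 0x2E, 0x5F}
    | set(range(0x30, 0x3A))
    | set(range(0x41, 0x5B))
    | set(range(0x61, 0x7B))
)


def percent_byte(b):
    """Return '%' followed by two uppercase hex digits, zero-padded."""
    return "%{:02X}".format(b)


def encode_current(text, encoding="utf-8"):
    """Current wording: outer 'not in range' test, inner per-byte 'in range' test."""
    safe = SAFE | {0x20}  # the current text still lists U+0020 / 0x20
    out = []
    for ch in text:
        if ord(ch) not in safe:
            s = ""
            for b in ch.encode(encoding):
                # Inner switch: copy "safe" bytes, percent-encode the rest.
                s += chr(b) if b in safe else percent_byte(b)
            out.append(s)
        else:
            out.append(ch)
    return "".join(out)


def encode_proposed(text, encoding="utf-8"):
    """Proposed wording: leave safe characters alone, percent-encode every other byte."""
    out = []
    for ch in text:
        if ord(ch) in SAFE:
            out.append(ch)
        else:
            out.append("".join(percent_byte(b) for b in ch.encode(encoding)))
    return "".join(out)


if __name__ == "__main__":
    sample = "a*b-c.d_\u00e9/\u20ac~"  # no U+0020, see the assumption above
    print(encode_current(sample))
    print(encode_proposed(sample))
```

Under the UTF-8 assumption both functions give the same result for input
without spaces. The sketch does not examine other encodings, and it says
nothing about inputs that still contain U+0020.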


Received on Thursday, 5 May 2011 12:18:37 UTC