[Bug 14022] New: Common microsyntax idiom "to strictly split a string on a particular delimiter character" is incorrectly described.

http://www.w3.org/Bugs/Public/show_bug.cgi?id=14022

           Summary: Common microsyntax idiom "to strictly split a string
                    on a particular delimiter character" is incorrectly
                    described.
           Product: HTML WG
           Version: unspecified
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 Reference (editor: Lachlan Hunt)
        AssignedTo: lachlan.hunt@lachy.id.au
        ReportedBy: rmizkur@yahoo.com
         QAContact: public-html-bugzilla@w3.org
                CC: lachlan.hunt@lachy.id.au, mike@w3.org,
                    public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org


In the editor's draft of 2011-08-29, under Common infrastructure / Common
microsyntaxes / Common parser idioms, the algorithm "to strictly split a string
on a particular delimiter character" is incorrectly described.  As written, it
loops forever once position reaches the first delimiter.

This assumes that the call in step 4.1 to the algorithm to "collect a sequence
of characters" (specifically step 3 of that algorithm) returns with position
just past the last character collected, i.e. pointing at the delimiter if one
was encountered.  On every later iteration step 4.1 then collects nothing, and
position never moves.
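
To make that interpretation concrete, here is a minimal Python sketch of
"collect a sequence of characters" as read above.  The function name, the
predicate argument, and returning position as a second value are assumptions
made for illustration; the draft describes position as shared state instead.

```python
def collect_sequence(input_str, position, accept):
    """Collect characters while accept(ch) holds.

    Returns (collected_string, new_position). Returning the position is an
    illustrative stand-in for the shared "position" variable in the draft.
    """
    result = []
    while position < len(input_str) and accept(input_str[position]):
        result.append(input_str[position])
        position += 1  # advance only past characters that were collected
    return "".join(result), position


def not_comma(ch):
    return ch != ","


# First call collects "a" and stops at the delimiter (index 1).
first, pos = collect_sequence("a,b", 0, not_comma)

# A second call from the same spot collects nothing and does not move.
second, pos_again = collect_sequence("a,b", pos, not_comma)
```

Under this reading the second call makes no progress, so a loop that only
repeats the call cannot terminate.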

Suggested resolution:

Add a step 4.3 to advance position, so that step 4 reads:

  4. While position is not past the end of input:
      1. Collect a sequence of characters that are not the delimiter character.
      2. Add the string collected in the previous step to tokens.
      3. Advance position to the next character in input.
           [italicize "position" and "input"]
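
The revised step 4 can be sketched in Python as follows.  This is only an
illustration under assumptions: the function name is made up, and the setup of
input, position, and tokens before the loop and the final return of tokens are
assumed rather than quoted from the draft.

```python
def strictly_split(input_str, delimiter):
    """Sketch of the strict split with the proposed step 4.3 added."""
    position = 0   # assumed initial state: start of input
    tokens = []    # assumed initial state: empty token list

    # Step 4: while position is not past the end of input
    while position < len(input_str):
        # Step 4.1: collect characters that are not the delimiter
        start = position
        while position < len(input_str) and input_str[position] != delimiter:
            position += 1
        # Step 4.2: add the collected string to tokens
        tokens.append(input_str[start:position])
        # Step 4.3 (proposed): advance past the delimiter
        position += 1

    return tokens


tokens = strictly_split("a,b,,c", ",")
```

Removing the line marked as step 4.3 reproduces the reported behaviour: once
position reaches the first delimiter, the loop keeps appending empty strings
and never ends.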


Received on Sunday, 4 September 2011 08:38:48 UTC