[Bug 11426] New: Meta prescan should run on the first 1024 bytes

http://www.w3.org/Bugs/Public/show_bug.cgi?id=11426

           Summary: Meta prescan should run on the first 1024 bytes
           Product: HTML WG
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec (editor: Ian Hickson)
        AssignedTo: ian@hixie.ch
        ReportedBy: hsivonen@iki.fi
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org


http://www.whatwg.org/specs/web-apps/current-work/#determining-the-character-encoding

The spec says:
"The user agent may wait for more bytes of the resource to be available, either
in this step or at any later step in this algorithm. For instance, a user agent
might wait 500ms or 512 bytes, whichever came first. In general preparsing the
source to find the encoding improves performance, as it reduces the need to
throw away the data structures used when parsing upon finding the encoding
information. However, if the user agent delays too long to obtain data to
determine the encoding, then the cost of the delay could outweigh any
performance improvements from the preparse."

First, the spec should suggest 1024 bytes instead of 512. Second, for
predictable results, the spec should probably require the prescan to inspect
the first 1024 bytes (stopping earlier if an internal encoding declaration is
found earlier).

(It follows that if the server sends 1023 unlabeled bytes and then lets the
connection stall, nothing is rendered while the connection stalls.)

Received on Monday, 29 November 2010 13:04:23 UTC