
[Bug 18397] New: Encoding Sniffing Algorithm: Clarify what "information on the likely encoding" covers

From: <bugzilla@jessica.w3.org>
Date: Wed, 25 Jul 2012 13:51:25 +0000
To: public-html@w3.org
Message-ID: <bug-18397-2495@http.www.w3.org/Bugs/Public/>

           Summary: Encoding Sniffing Algorithm: Clarify what
                    "information on the likely encoding" covers
           Product: HTML WG
           Version: unspecified
          Platform: PC
               URL: http://dev.w3.org/html5/spec/Overview#encoding-sniffin
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec
        AssignedTo: ian@hixie.ch
        ReportedBy: xn--mlform-iua@xn--mlform-iua.no
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,

Please clarify what the step "information on the likely encoding" covers.

For instance, does it cover the XML encoding declaration? Why? Why not?

In 2012, Chrome, Safari and Opera 12 still read the XML encoding declaration
when/if the HTML encoding declaration is lacking.

In October 2009, Ian Hickson wrote: "So in the absence of more compelling
reasons to add this, I'd rather get Opera and WebKit to remove the support for
this, than add more" [1]

However, it seems to me that the step "information on the likely encoding"
would cover their asses. After all, the presence of <?xml version="1.0"
encoding="UTF-8" ?> increases the chance that the encoding is UTF-8. Maybe the
algorithm could be specific about what is allowed and what is not allowed in
this step.
The spec should therefore offer more data on what this step of the sniffing
algorithm refers to. Also see my blog post for more data.[2]
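
To make the question concrete, here is a minimal Python sketch of how a user
agent could treat a leading XML encoding declaration as a hint about the likely
encoding. This is an illustration only: the function name, the regular
expression and the 1024-byte limit are assumptions made for this example. None
of them comes from the spec, and it does not describe how any particular
browser actually implements this.

```python
import re

# Hypothetical pattern: a leading XML declaration with version and encoding.
# Assumption for this sketch only; the spec does not define this check.
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._\-]*)["\']'
)


def xml_declaration_hint(head: bytes):
    """Return the lowercased encoding label from a leading XML declaration.

    Returns None when the byte stream does not start with such a declaration.
    Only the first 1024 bytes are examined (an arbitrary limit chosen here).
    """
    match = _XML_DECL.match(head[:1024])
    if match is None:
        return None
    return match.group(1).decode("ascii").lower()
```

Under these assumptions, a document that starts with <?xml version="1.0"
encoding="UTF-8" ?> yields the hint "utf-8", while a document that starts with
<!DOCTYPE html> yields no hint. Whether such a hint is permitted under
"information on the likely encoding" is exactly what this bug asks the spec to
state.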

[2] http://målform.no/blog/white-spots-in-html5-s-encoding-sniffing-algorithm

Received on Wednesday, 25 July 2012 13:51:31 UTC
