[Bug 18394] New: Encoding Sniffing Algorithm: parent browsing context defines encoding default

https://www.w3.org/Bugs/Public/show_bug.cgi?id=18394

           Summary: Encoding Sniffing Algorithm: parent browsing context
                    defines encoding default
           Product: HTML WG
           Version: unspecified
          Platform: PC
               URL: http://dev.w3.org/html5/spec/Overview#encoding-sniffing-algorithm
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec
        AssignedTo: ian@hixie.ch
        ReportedBy: xn--mlform-iua@xn--mlform-iua.no
         QAContact: public-html-bugzilla@w3.org
                CC: mike@w3.org, public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org


Proposal: Extend the encoding sniffing algorithm[1] with a new,
          second-to-last step, like so:

     #. If the document lives in a 'nested browsing context'[2],
        then return the encoding of the document in the 'parent
        browsing context', as a default dictated by the parent
        browsing context, and abort these steps.
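
The proposed step can be sketched as follows. This is only an illustration, not spec text: the `Document` class, its fields, and `LOCALE_DEFAULT` are hypothetical names, and all earlier steps of the algorithm are collapsed into a single `declared_encoding` field.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for the implementation-defined or user-specified default.
LOCALE_DEFAULT = "windows-1252"


@dataclass
class Document:
    # Result of the earlier steps (BOM, HTTP header, meta prescan ...), if any.
    declared_encoding: Optional[str] = None
    # Document in the parent browsing context; None for a top-level document.
    parent: Optional["Document"] = None
    # Encoding finally chosen for this document (set by sniff_encoding).
    encoding: Optional[str] = None


def sniff_encoding(doc: Document) -> str:
    if doc.declared_encoding is not None:
        # Earlier steps found encoding information.
        doc.encoding = doc.declared_encoding
    elif doc.parent is not None and doc.parent.encoding is not None:
        # Proposed step: the parent browsing context dictates the default.
        doc.encoding = doc.parent.encoding
    else:
        # Existing last step: implementation-defined or user-specified default.
        doc.encoding = LOCALE_DEFAULT
    return doc.encoding
```

With this sketch, an unlabelled document framed by a UTF-8 parent ends up as UTF-8, while the same document loaded at top level gets the locale default.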

Justification:

   (1) Currently, the HTML5 encoding sniffing algorithm fails to take
account of the fact that, when the document of a nested browsing
context has not been supplied with encoding information, Web
browsers[*] do *not* "return an implementation-defined or
user-specified default character encoding" (as HTML5 currently
requires). Web browsers instead return a 'parent browsing
context-defined' character encoding: the encoding of the document in
the parent browsing context.

     [*]I have not yet tested the relevant editions of IE (IE8/IE9/IE10).
        But I know that IE6 does not consider the encoding of the parent
        browsing context.

   (2) By explicitly including the 'parent browsing context encoding
default' in the algorithm, we make sure that browsers apply the
default at the same step.
       The problem, right now, is that the browsers that thus far have
implemented the encoding sniffing algorithm's current step 7 (encoding
pattern matching/detection) disagree about whether it should take place
*before* the parent browsing context default is applied or *after*
the encoding of the parent browsing context has been considered.
       The latter approach, which Chrome seems to take, means that step
7 is unlikely to take place at all if the document lives in a nested
browsing context. Firefox 12 (which by default only performs step 7 for
some locales or at user request) and Opera 12 (which, unlike at
least Opera 10, applies step 7 for all locales) take the approach that
encoding pattern matching/detection should occur before the locale
default is eventually applied.
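
To make the disagreement concrete, here is a rough sketch of the two orderings. The function names are hypothetical, and `detect_utf8` is only a toy stand-in for step 7; real browser detectors are far more elaborate and are not modelled here.

```python
from typing import Optional


def detect_utf8(data: bytes) -> Optional[str]:
    # Toy detector: report UTF-8 only for non-ASCII input that decodes cleanly.
    if data.isascii():
        return None
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "utf-8"


def detection_first(data: bytes, parent_encoding: Optional[str],
                    locale_default: str) -> str:
    # Ordering described above for Firefox 12 and Opera 12:
    # detection runs before any default is applied.
    return detect_utf8(data) or parent_encoding or locale_default


def parent_first(data: bytes, parent_encoding: Optional[str],
                 locale_default: str) -> str:
    # Ordering Chrome seems to use: the parent's encoding is considered
    # first, so detection is only reached when there is no parent encoding.
    return parent_encoding or detect_utf8(data) or locale_default
```

For an unlabelled UTF-8 document framed by a windows-1252 parent, the first ordering yields UTF-8 and the second yields windows-1252; for a top-level document (no parent encoding) the two orderings agree.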


For more, see the blog post I wrote in connection with this bug report.[3]

[1] http://dev.w3.org/html5/spec/Overview#encoding-sniffing-algorithm
[2] http://dev.w3.org/html5/spec/Overview#nested-browsing-context
[3] http://målform.no/blog/white-spots-in-html5-s-encoding-sniffing-algorithm


Received on Wednesday, 25 July 2012 12:26:35 UTC