[whatwg] Default encoding to UTF-8?

Leif Halvard Silli Sun Dec 11 03:21:40 PST 2011

> W.r.t. iframe, then the "big in Norway" newspaper Dagbladet.no is 
> declared ISO-8859-1 encoded and it includes at least one ads-iframe that 
  ...
> * Let's say that I *kept* ISO-8859-1 as default encoding, but instead 
> enabled the Universal detector. The frame then works.
> * But if I make the frame page very short, 10 * the letter "?" as 
> content, then the Universal detector fails - in a test on my own 
> computer, it guesses the page to be Cyrillic rather than Norwegian.
> * What's the problem? The Universal detector is too greedy - it tries 
> to fix more problems than I have. I only want it to guess on "UTF-8". 
> And if it doesn't detect UTF-8, then it should fall back to the locale 
> default (including falling back to the encoding of the parent frame).

The above illustrates that the current charset-detection solutions are 
starting to get old: They are not geared and optimized towards UTF-8 as 
the firmly recommended and - in principle - anticipated default.

The above may also point to a real problem with switching to UTF-8: 
one may need to embed pages which do not use UTF-8. If one could trust 
UAs to attempt UTF-8 detection (but not "Universal detection") before 
defaulting, then it would become virtually risk free to switch a page 
to UTF-8, even if it contains iframe pages. No?
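
To make the idea concrete, here is a minimal sketch of such a 
"UTF-8 only" detection step. It is not taken from any UA; the function 
name guess_encoding, its parameters and the windows-1252 locale default 
are all assumptions made for illustration. The only question it asks is 
"do these bytes form valid UTF-8?". If not, it falls back to the 
encoding of the parent frame, and then to the locale default.

```python
from typing import Optional


def guess_encoding(
    data: bytes,
    parent_encoding: Optional[str] = None,   # hypothetical: encoding of the embedding page
    locale_default: str = "windows-1252",    # hypothetical locale default
) -> str:
    """Guess 'utf-8' only when the bytes give positive evidence for it."""
    # Pure ASCII decodes the same way under UTF-8 and the usual legacy
    # encodings, so it is no evidence either way: use the fallback.
    if not data.isascii():
        try:
            # Strict decoding rejects any byte sequence that is not valid UTF-8.
            data.decode("utf-8", errors="strict")
            return "utf-8"
        except UnicodeDecodeError:
            pass
    # No UTF-8 detected: parent frame encoding first, then locale default.
    return parent_encoding or locale_default


# Ten copies of one non-ASCII ISO-8859-1 byte (0xF8, chosen as a stand-in
# for a short legacy-encoded frame page) are not valid UTF-8.
short_legacy_page = b"\xf8" * 10
print(guess_encoding(short_legacy_page, parent_encoding="iso-8859-1"))

# The same letter encoded as UTF-8 is detected as UTF-8.
short_utf8_page = "\u00f8".encode("utf-8") * 10
print(guess_encoding(short_utf8_page, parent_encoding="iso-8859-1"))
```

With a check like this, a very short legacy-encoded frame page cannot 
be mistaken for Cyrillic, because Cyrillic is never a candidate: the 
result is either UTF-8 or the fallback.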

Leif H Silli

Received on Sunday, 11 December 2011 03:44:37 UTC