Re: Auto-detect and encodings in HTML5

Philip Taylor On 09-05-29 02.13:
> Leif Halvard Silli wrote:
>> John Cowan On 09-05-28 23.08:
>>> Leif Halvard Silli scripsit:
>>>
>>>> <meta name="Title" charset="Beagle Kennel van der Liniehoeve">
>>>
>>> Well, this does say "charset" rather than "content".
>>
>> Yes, currently HTML doesn't have any @charset attribute. @charset is 
>> only a new invention of the HTML 5 draft.
> 
> (It's newly specified in HTML 5, but it's been supported by the major 
> web browsers for practically forever.)

Interesting how few pages used it, though. I really don't 
know whether speccing it makes anything clearer for anyone.

>> if I read the data correctly, then the HTML 5 draft algorithm that 
>> Philip used, was unable to decode the correct charset info in the 
>> _first_ meta element.
> 
> I looked for the first charset in a <meta content>, and independently 
> looked for the first <meta charset>, so that particular page was counted 
> in both of those columns of the table. The "sniffer" column is the one 
> that matched the algorithm in HTML 5, which stops after finding the 
> first thing that looks like a charset specification, and for this page 
> it reported windows-1252.

... Maybe I just don't understand the presentation. The caption 
of the table says: "Number of pages declaring encoding (% decoded 
without errors)". For the "beagle" page in particular, the 
different columns say:

HTTP: U0; meta content: 0; Sniffer: 0; meta charset: 1 (0%);

?
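
To make the question concrete, here is a rough sketch of how the three 
counts Philip describes could come apart for one page. It is only an 
approximation under stated assumptions, not the HTML 5 draft's prescan 
algorithm and not Philip's actual tool: the names MetaCollector, 
is_known_encoding and declared_charsets are made up for illustration, 
and "looks like a charset specification" is approximated by asking 
Python's codecs module whether it recognises the label.

```python
import codecs
import re
from html.parser import HTMLParser

# Finds "charset=..." inside the value of a content attribute.
CHARSET_IN_CONTENT = re.compile(r"charset\s*=\s*[\"']?([^\s\"';]+)", re.IGNORECASE)


class MetaCollector(HTMLParser):
    """Collects the attributes of every <meta> start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            self.metas.append(dict(attrs))


def is_known_encoding(label):
    # Assumption: codecs.lookup() stands in for "supported encoding".
    try:
        codecs.lookup(label)
        return True
    except LookupError:
        return False


def declared_charsets(html):
    """Return the first <meta content> charset, the first <meta charset>,
    and a simplified 'sniffer' result for the given markup."""
    collector = MetaCollector()
    collector.feed(html)

    first_content = first_attr = sniffed = None
    for attrs in collector.metas:
        from_attr = attrs.get("charset") or None
        match = CHARSET_IN_CONTENT.search(attrs.get("content") or "")
        from_content = match.group(1) if match else None

        # The two columns are counted independently of each other.
        if first_attr is None and from_attr:
            first_attr = from_attr
        if first_content is None and from_content:
            first_content = from_content

        # The simplified sniffer stops at the first label it recognises.
        if sniffed is None:
            for candidate in (from_attr, from_content):
                if candidate and is_known_encoding(candidate):
                    sniffed = candidate
                    break

    return {
        "meta content": first_content,
        "meta charset": first_attr,
        "sniffer": sniffed,
    }
```

Fed a hypothetical two-line head (not the real page, whose full markup 
is not quoted in this thread) consisting of the Title meta above 
followed by a meta with content="text/html; charset=windows-1252", 
this reports the kennel name under "meta charset" and windows-1252 
under both "meta content" and "sniffer".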
-- 
leif halvard silli

Received on Friday, 29 May 2009 03:15:41 UTC