- From: Kornel Lesinski <kornel@geekhood.net>
- Date: Sat, 03 Apr 2010 17:34:20 +0100
- To: "Julian Reschke" <julian.reschke@gmx.de>
- Cc: "public-html@w3.org" <public-html@w3.org>
On Sat, 03 Apr 2010 10:00:32 +0100, Julian Reschke <julian.reschke@gmx.de>
wrote:
>> I agree that difference between http-equiv and HTTP is constant source
>> of confusion. Authors mistakenly think it is equivalent of HTTP headers,
>> and that most/all HTTP headers would work that way (e.g. there's lots of
>> documents with HTTP cache directives in HTML).
>>
>> Obviously, it's not an HTTP header equivalent (unless HTTP will require
>> HTTP clients to parse HTML) – the name is very misleading.
>
> Why would HTTP want to make requirements on HTML processing?
Because unless HTML is parsed at some point (by an HTTP server, proxy or
client), <meta> can't affect HTTP. For example, with content negotiation
using Vary: Accept-Language, the wrong version may be cached if the
language is declared only in an HTML pragma. So it's not really an HTTP
header equivalent; it's something else that only superficially looks like
an HTTP header.
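To illustrate the caching problem, here is a rough Python sketch. It is
only a toy model under stated assumptions: ToyCache and origin are
hypothetical names, the server is assumed to negotiate on the
Accept-Language request header, and the cache (like real HTTP caches)
looks at headers only and never parses the HTML body.

```python
def origin(request_headers, send_vary):
    """Hypothetical origin server that picks a language from Accept-Language."""
    wants_de = request_headers.get("Accept-Language", "").startswith("de")
    lang = "de" if wants_de else "en"
    # The language is always declared inside the HTML via the pragma.
    body = '<meta http-equiv="Content-Language" content="%s">' % lang
    # Only when send_vary is true is it also declared in real HTTP headers.
    headers = {"Vary": "Accept-Language", "Content-Language": lang} if send_vary else {}
    return headers, body


class ToyCache:
    """Toy shared cache: keys on the URL plus request headers named in Vary."""

    def __init__(self):
        # url -> list of (vary_names, varied_values, body)
        self.entries = {}

    def get(self, url, request_headers, origin, send_vary):
        for vary_names, varied_values, body in self.entries.get(url, []):
            current = tuple(request_headers.get(n, "") for n in vary_names)
            if current == varied_values:
                return body  # cache hit; the body itself is never inspected
        headers, body = origin(request_headers, send_vary)
        vary_names = tuple(
            n.strip() for n in headers.get("Vary", "").split(",") if n.strip()
        )
        varied_values = tuple(request_headers.get(n, "") for n in vary_names)
        self.entries.setdefault(url, []).append((vary_names, varied_values, body))
        return body
```

With send_vary set to False, a German response cached for one client is
later served to a client asking for English, even though the <meta> in
the body says "de". With send_vary set to True, the two requests get
separate cache entries.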
HTML5 defines http-equiv as containing specific values, plus HTTP-like
pragmas registered in the WHATWG registry under certain conditions, and
not simply HTTP headers.
> I think we already heard about these use cases. Just because *browsers*
> do not support them doesn't mean that it's not used in other frameworks,
> and there's really no reason to make those documents non-compliant.
Could you point out such frameworks? How would they use such vague
information in a useful way?
I've grepped over 600000 documents from dotnetdotcom.org and found 52835
content-language pragmas, of which only 867 contained a comma (the same
method finds 361666 content-type pragmas).
These are the most popular values containing a comma (after normalization
of case and whitespace):
100 nl,en
77 de,at,ch
39 fr,en
34 en-us,english
33 de,deutsch
26 fr,fr-be,fr-ca,fr-lu,fr-ch
26 en,us
26 de,ch,at
20 de,en
18 fr,fr-be,fr-ch,fr-lu,fr-mc
18 de,at
17 it,it-ch
16 nl,nl-be
15 el,en
14 deutsch,de
12 it,en,fr,de,es
12 en,th
12 de,de-ch,de-at,de-lu,de-li
11 es,es-es
10 pt,pt-pt
9 en-us,en-ca,en-au,en-bz,en-jm,en-nz,en-ph,en-tt
8 german,deutsch,allemand,de
8 en,us,fr,de,es,ca,nl,dk,it,pt,pl
8 en,en-us
7 en-us,en
6 zh,zh-hk,zh-cn,zh-sg,zh-tw
6 el,en-us
6 de,at,ch,deutsch,german
5 fr,french
5 en-us,en-gb
Note that many of those are just trying to cover all possible spellings of
one language.
This is just a rough estimate: grepping would miss any <meta> spread
across multiple lines, and I looked at individual declarations rather than
documents, so even a few messy documents could distort such a small sample.
I also haven't checked whether the declarations match the document content.
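For reference, a count of this kind can be sketched in Python roughly as
follows. This is not the actual command used; META_RE, normalize and
count_values are hypothetical names, and the pattern assumes http-equiv
comes before content and that quoted attribute values are used. Like
grep, it works line by line, so it shares the multi-line limitation.

```python
import re
from collections import Counter

# Hypothetical pattern: a content-language pragma written on a single line,
# with the http-equiv attribute placed before the content attribute.
META_RE = re.compile(
    r'<meta\s+http-equiv\s*=\s*["\']?content-language["\']?'
    r'\s+content\s*=\s*["\']([^"\'>]*)["\']',
    re.IGNORECASE,
)


def normalize(value):
    """Lowercase the value and strip whitespace around each comma-separated item."""
    return ",".join(part.strip().lower() for part in value.split(","))


def count_values(lines):
    """Count normalized content-language values, one declaration at a time."""
    counts = Counter()
    for line in lines:
        for match in META_RE.finditer(line):
            counts[normalize(match.group(1))] += 1
    return counts
```

Filtering the resulting keys for a comma, then calling most_common() on
that subset, would give a ranking like the one above.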
Compared to the number of Content-Type pragmas in the same sample,
Content-Language seems popular enough to include in the spec.
However, declarations with more than one language are very rare and
usually contain invalid or redundant information. Based on this data I
agree with the spec.
--
regards, Kornel Lesiński
Received on Saturday, 3 April 2010 16:35:05 UTC