- From: Kornel Lesinski <kornel@geekhood.net>
- Date: Sat, 03 Apr 2010 17:34:20 +0100
- To: "Julian Reschke" <julian.reschke@gmx.de>
- Cc: "public-html@w3.org" <public-html@w3.org>
On Sat, 03 Apr 2010 10:00:32 +0100, Julian Reschke <julian.reschke@gmx.de> wrote:

>> I agree that difference between http-equiv and HTTP is constant source
>> of confusion. Authors mistakenly think it is equivalent of HTTP headers,
>> and that most/all HTTP headers would work that way (e.g. there's lots of
>> documents with HTTP cache directives in HTML).
>>
>> Obviously, it's not an HTTP header equivalent (unless HTTP will require
>> HTTP clients to parse HTML) – the name is very misleading.
>
> Why would HTTP want to make requirements on HTML processing?

Because unless something parses the HTML at some point (an HTTP server, a proxy or a client), <meta> cannot affect HTTP. For example, with content negotiation using Vary: Content-Language, the wrong version may be cached if the language is declared only by an HTML pragma. So it's not really an HTTP header equivalent; it's something else that only superficially looks like an HTTP header.

HTML5 defines http-equiv as accepting a specific set of values plus HTTP-like pragmas registered in the WHATWG registry under certain conditions, not simply any HTTP header.

> I think we already heard about these use cases. Just because *browsers*
> do not support them doesn't mean that it's not used in other frameworks,
> and there's really no reason to make those documents non-compliant.

Could you point out such frameworks? How would they use such vague information in a useful way?

I grepped over 600000 documents from dotnetdotcom.org and found 52835 content-language pragmas, of which only 867 contained a comma (the same method finds 361666 content-type pragmas).
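
A line-based count like the one described above could be sketched in Python roughly as follows. This is for illustration only: the regular expressions, the function names (pragma_values, summarize) and the normalization (lowercasing and stripping all whitespace) are assumptions, not the actual commands used for the numbers in this message. Like grep, it misses any <meta> spread across several lines.

```python
import re
from collections import Counter

# One <meta ...> tag that sits entirely on a single line (assumed pattern).
META_TAG = re.compile(r"<meta\b[^>]*>", re.IGNORECASE)
# name="value", name='value' or name=value attribute pairs inside the tag.
ATTR = re.compile(r"""([a-zA-Z-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""")


def pragma_values(lines, name):
    """Yield normalized content values of <meta http-equiv=name> declarations."""
    for line in lines:
        for tag in META_TAG.findall(line):
            attrs = {}
            for m in ATTR.finditer(tag):
                value = m.group(2) or m.group(3) or m.group(4) or ""
                attrs.setdefault(m.group(1).lower(), value)
            if attrs.get("http-equiv", "").strip().lower() == name:
                # Assumed normalization: lowercase, drop all whitespace.
                yield re.sub(r"\s+", "", attrs.get("content", "").lower())


def summarize(lines):
    """Return (all declarations, declarations with a comma, ranked comma values)."""
    values = list(pragma_values(lines, "content-language"))
    with_comma = Counter(v for v in values if "," in v)
    return len(values), sum(with_comma.values()), with_comma.most_common()


# Made-up sample lines, not taken from the crawl.
sample = [
    '<meta http-equiv="Content-Language" content="NL, en">',
    "<META HTTP-EQUIV='content-language' CONTENT='nl,en'>",
    '<meta http-equiv="Content-Language" content="de">',
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">',
]
total, multi, ranked = summarize(sample)
print(total, multi, ranked)  # 3 2 [('nl,en', 2)]
```

The sketch counts declarations, not documents, so a single page with several pragmas is counted several times.
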
These are the most popular values (after normalization of case and whitespace):

    100 nl,en
     77 de,at,ch
     39 fr,en
     34 en-us,english
     33 de,deutsch
     26 fr,fr-be,fr-ca,fr-lu,fr-ch
     26 en,us
     26 de,ch,at
     20 de,en
     18 fr,fr-be,fr-ch,fr-lu,fr-mc
     18 de,at
     17 it,it-ch
     16 nl,nl-be
     15 el,en
     14 deutsch,de
     12 it,en,fr,de,es
     12 en,th
     12 de,de-ch,de-at,de-lu,de-li
     11 es,es-es
     10 pt,pt-pt
      9 en-us,en-ca,en-au,en-bz,en-jm,en-nz,en-ph,en-tt
      8 german,deutsch,allemand,de
      8 en,us,fr,de,es,ca,nl,dk,it,pt,pl
      8 en,en-us
      7 en-us,en
      6 zh,zh-hk,zh-cn,zh-sg,zh-tw
      6 el,en-us
      6 de,at,ch,deutsch,german
      5 fr,french
      5 en-us,en-gb

Note that many of these are just trying to cover every possible spelling of a single language.

This is only a rough estimate: grepping misses any <meta> spread across multiple lines; I counted individual declarations rather than documents, so even a few messy documents could distort such a small sample; and I haven't checked whether the declarations match the document content.

Compared to the number of Content-Type pragmas in the same sample, Content-Language seems popular enough to include in the spec. However, declarations with more than one language are very rare and usually contain invalid or redundant information. Based on this data I agree with the spec.

--
regards, Kornel Lesiński
Received on Saturday, 3 April 2010 16:35:05 UTC