- From: आशीष शुक्ला <wahjava@gmail.com>
- Date: Fri, 2 Jun 2006 14:10:51 +0530
- To: "W3C HTML Mailing List" <www-html@w3.org>
Hi,

On 6/2/06, David Woolley <david@djwhome.demon.co.uk> wrote:
> You've failed to specify what you think the problem is, so I've
> had to try and analyze from the thread you referenced.

Thanks for reading the thread. Before I explain what I mean, let's take an example:

-- begin example --
You're browsing a book collection, and in the English section you come across a book that is not in English, i.e. it has been misplaced in the English section. How do you interpret the contents of the book? You have three choices:

1. Assume it is English, whether you understand it or not.
2. Check its cover page; maybe the author has mentioned the language of the book.
3. Use your intelligence to guess the language of the book.

I would go for the 2nd choice: if the author has stated the language of the book, I'll prefer that over assuming it is an English book just because it is placed in the English section. And this is not only for the special case (where a book is misplaced in the wrong section and I somehow notice it); I would check the cover page every time to see whether the author has explicitly specified the language of the book.
-- end example --

The problem I encountered is similar. I'm hosting a website on a web server where I have no way to influence the HTTP headers. The web server always sends my UTF-8 HTML document as an ISO-8859-1 document, i.e. that is the charset it states in the "Content-Type" HTTP header. As the author of the document, I've properly tagged it and followed the guidelines (given in the HTML specification) for specifying the character set it uses. But a web server that has no autodetection support, or is unable to detect the document's encoding (probably because the encoding has no special markers at the start of the document), sends the document as a default (ISO-8859-1) encoded document.
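To make the mismatch concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the sample page, the server default of ISO-8859-1, and the helper name meta_charset are all hypothetical, and the regular expression is a simplification, not how any real UA parses <meta> tags. It decodes the same UTF-8 bytes twice, once with the charset from the HTTP header and once with the charset the author declared in the document.

```python
import re

# Hypothetical page: the author declares UTF-8 in a <meta> tag.
html_text = (
    '<html><head>'
    '<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'
    '</head><body>आशीष</body></html>'
)
body_bytes = html_text.encode("utf-8")  # bytes as stored on the server

# Charset the server puts in the Content-Type header (assumed default).
http_charset = "iso-8859-1"


def meta_charset(raw: bytes):
    """Return the charset named in the document itself, or None.

    Simplified, hypothetical helper: it takes the first 'charset=' token.
    """
    match = re.search(rb'charset=["\']?([A-Za-z0-9_-]+)', raw)
    return match.group(1).decode("ascii").lower() if match else None


# Decoding with the header's charset: every byte maps to some Latin-1
# character, so nothing fails, but the Devanagari text is garbled.
as_header = body_bytes.decode(http_charset)

# Decoding with the author's declared charset recovers the original text.
as_meta = body_bytes.decode(meta_charset(body_bytes))

print("आशीष" in as_header)  # False
print("आशीष" in as_meta)    # True
```

The ASCII markup, including the <meta> tag, survives either decoding, which is why the author's declaration can still be read even when the header is wrong.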
The UA (user agent), instead of inspecting the document's "Content-Type" <meta> tag (if there is one), where the author might have placed the proper character set information, follows the web server's response (as specified in the HTML 4.01 specification, which is incorrect in this case) and displays the document improperly. So, as a document author, I've followed all the guidelines, but since I have no control over the web server, my document looks horrible when served from web servers that are unable to detect its character set properly. This means that a document author has to be a webmaster as well.

Specifying the character encoding (HTML 4.01 specification)
http://www.w3.org/TR/html401/charset.html#idx-character_encoding-7

At the above URL there is a priority list that conforming UAs follow in determining a document's character encoding. In my opinion, this priority list needs to be modified.

That's all I want to say. Sorry for my poor English grammar. Thanks for reading this mail.

Ashish Shukla

-- 
Ashish Shukla "Wah Java !!" आशीष शुक्ला
 ,= ,-_-. =.
((_/)o o(\_))
 `-'(. .)`-'
     \_/
My blah, blah, blah at http://wahjava.blogspot.com/
My webpages at http://www.geocities.com/wah_java_dotnet/
My GPG Fingerprint: BBA9 AD7D BA71 61EB BE46 8CF5 E44A C663 A03F 4261
My GPG keys at http://keyserv.nic-se.se:11371/pks/lookup?op=get&search=0xA03F4261
--
All that looks C00L is not necessarily validable. -- Ashish Shukla "Wah Java !!"
Received on Friday, 2 June 2006 08:40:59 UTC