- From: Sander Tekelenburg <st@isoc.nl>
- Date: Fri, 13 Jul 2007 18:22:20 +0200
- To: public-html@w3.org
At 08:19 +0300 UTC, on 2007-07-13, Dmitry Turin wrote:

> Good day, Robert.
>
> RB> I was wondering what character encoding you use to serve up this page:
> RB> <http://html60.chat.ru/site/html60/ru/index_ru.htm>
> RB> We're trying to conduct some tests on current UAs and this page might
> RB> be helpful. Do you know what charset it uses?
>
> All pages in russian language are coded in WIN-1251.
> These documents are displayed truely both in IE and Opera.

Only because they happen to guess what you intend. They are not presented as you intend in iCab 3.0.3, Firefox 2.0.0.4, or Safari 2.0.4, because neither the server nor the document itself says what character repertoire the document is in.

Is there any particular reason why you're relying on UAs to guess what character repertoire the document is in? (I believe HTML5 aims to define a perfect guessing algorithm, but AFAIK the idea is 'just' to unify UA behaviour. I don't believe the intention is that authors rely on that -- they're still expected to provide the proper Content-Type header, or a <meta charset="value">: <http://www.whatwg.org/specs/web-apps/current-work/multipage/section-document.html#charset0>)

Now I'm aware that apparently there is some practical problem with authoring Cyrillic, in that 4 or 5 different encodings are commonly used. Russian Apache deals with that through content negotiation: <http://apache.lexa.ru/english/>. But I see no reason for authors to rely on UAs to just magically guess the correct character repertoire. Or is there?

--
Sander Tekelenburg
The Web Repair Initiative: <http://webrepair.org/>
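A minimal sketch of the guessing problem described above (Python; the sample word and the wrongly guessed encoding are my own illustrative assumptions, not taken from the page in question): the same windows-1251 bytes decode without any error under a different Cyrillic encoding, yet yield mojibake. This is exactly why an explicit Content-Type charset parameter or a <meta charset="windows-1251"> declaration matters, rather than leaving the UA to guess.

```python
# Illustrative only: the sample word ("Привет") and the wrong guess
# (koi8-r) are assumptions, not from the page under discussion.
data = "Привет".encode("cp1251")   # bytes as a windows-1251 page would be served

correct = data.decode("cp1251")    # the encoding the author intended
guessed = data.decode("koi8-r")    # a plausible wrong guess: decodes cleanly...

print(correct)   # Привет
print(guessed)   # ...but yields mojibake, with no error raised

assert correct != guessed
```

Since every byte is valid in both encodings, a UA cannot reliably detect the mistake; only an explicit declaration removes the ambiguity.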
Received on Friday, 13 July 2007 16:46:46 UTC