- From: Sander Tekelenburg <st@isoc.nl>
- Date: Wed, 18 Jul 2007 18:56:28 +0200
- To: public-html@w3.org
At 08:20 +0300 UTC, on 2007-07-18, Dmitry Turin wrote:

[<http://html60.chat.ru/site/html60/ru/index_ru.htm>]

> ST> Is there any particular reason why you're relying on UAs to guess what
> ST> character repertoire the document is in?

[...]

> RB> Servers rarely include a charset
> RB> header and that might be a good thing, because those would likely be
> RB> often wrong too.

The default server config should indeed not claim a character repertoire. But the author should configure one.

> AF> It is an author's error to publish document without
> AF> providing information of what encoding is used in it.
>
> Guessing is not in deal. Purpose is to give possibility to user
> to change encoding manually in browser menu and follow along anchors.

AFAIK every browser allows the user to change the character repertoire anyway, always, even if one is claimed by the server/document. (I don't know if that's just wisdom, or spec-required.) But then still, if you serve your documents with the correct charset info, users don't *need* to change anything. The UA will apply the claimed character repertoire.

> Let's enter terms:
> 'falling of encoding', which means, that browser show document as
> writed in other encoding, than document is;
> 'anchor falling', which means, that 'falling in encoding' occurs in new document,
> after user has followed along <a href> in previous document.

If the page pointed to by the anchor is served with the proper charset info, I don't see why it should fail -- even if it uses a different character repertoire than the previous page.

> I met three case with anchor falling:
> (1) at serfing in documents on server
> (1.1) new document does not contain frames, i.e. is a single document
> (1.2) anchor falling occurs in frame

Documents that are loaded in an iframe have their own HTTP headers too, so I don't see why iframes would be a special case. I'll grant you I haven't experimented with serving mixed character repertoires though -- a main document with one charset, and one with a different charset embedded in an iframe. Do UAs get that wrong?

> (2) at serfing in documents on local file system
> after downloading of site -
> anchor falling occurs, because <meta content="text/html; charset="> and
> real encoding differ each other.

If you use meta http-equiv to provide the charset, you must (of course) ensure that it is exactly the same as the one in the HTTP header (or, in HTML5, that the HTTP header claims no charset). The only situation in which I can imagine you'd set a meta charset that is different from the HTTP charset is when [1] the server is misconfigured to serve some default charset value and [2] you cannot change that. But in that case you should simply change to a better server.

Btw, for shared hosts, people seem to simply assume that they cannot generate proper HTTP headers. But Apache, for instance, allows each user to configure their own area of the server. So unless the admin crippled that, you can generate a proper HTTP Content-Type header through .htaccess. If that's crippled, and you have something like PHP available, you can use that to generate proper HTTP headers.

[...]

> What's about guessing algorithm to improve today's browsers,

HTML5 already defines that algorithm:
<http://www.whatwg.org/specs/web-apps/current-work/multipage/section-parsing.html#determining0>

But that's error recovery -- no reason for authors to rely on that. No matter how well the algorithm is thought out, and even assuming all UAs implement it flawlessly (unlikely), you're still relying on the UA to understand what you mean and discard what you say. Even if an author understands the algorithm well enough to rely on it, doing so excludes pre-HTML5 UAs from presenting the document reliably.

> maybe there is reason to borrow it from russian text editors,
> which auto-detect encoding.

Obviously Opera and IE already do ;) (Or well, at least in this case they happen to do what you hoped for.) As for text editors, BBEdit does and I'd expect many others do too. I don't know if 'WYSIWYG' editors like GoLive, Dreamweaver, Freeway, etc. do. Nvu probably does?

-- 
Sander Tekelenburg
The Web Repair Initiative: <http://webrepair.org/>
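P.S. The rule about a meta http-equiv charset having to match the HTTP header (or the header claiming no charset at all) is easy to check mechanically. Below is a minimal Python sketch of such a check, not a definitive implementation. The function names are made up for illustration, the regular expressions only handle the common spellings of these declarations, and the markup is assumed to be already available as text, which is reasonable only because the declaration itself is plain ASCII in the encodings discussed here. Charset aliases (for instance "utf8" versus "utf-8") are not normalised.

```python
import re

# Matches the charset parameter in a Content-Type value,
# e.g. "text/html; charset=UTF-8". Simplified on purpose.
CHARSET_RE = re.compile(r'charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE)

# Matches a <meta http-equiv="Content-Type" ...> tag, with the
# attributes in either order. Also simplified.
META_RE = re.compile(
    r'<meta[^>]+http-equiv\s*=\s*["\']?content-type["\']?[^>]*>',
    re.IGNORECASE,
)


def charset_from_content_type(value):
    """Return the lowercased charset named in a Content-Type value, or None."""
    match = CHARSET_RE.search(value or "")
    return match.group(1).lower() if match else None


def charset_from_meta(html_head):
    """Return the charset of the first meta http-equiv Content-Type tag, or None."""
    tag = META_RE.search(html_head)
    return charset_from_content_type(tag.group(0)) if tag else None


def declarations_agree(http_content_type, html_head):
    """True if header and meta name the same charset, or if either names none."""
    header = charset_from_content_type(http_content_type)
    meta = charset_from_meta(html_head)
    return header is None or meta is None or header == meta
```

For example, `declarations_agree("text/html; charset=koi8-r", head)` is false when `head` contains a meta tag claiming windows-1251, and true when the header is just `text/html`. A check like this only tells you that the two claims are consistent with each other, not that either of them matches the bytes actually in the file.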
Received on Wednesday, 18 July 2007 17:06:47 UTC