On Jun 1, 2009, at 11:08, Jonathan Rosenne wrote:

> Not only CJK and Cyrillic, also Hebrew and

I had thought that existing Hebrew content largely didn't have the problem of lacking encoding labels. (Isn't even the most legacy Visual Hebrew content generally *encoding*-labeled even if not *direction*-labeled?)

I observe that existing heuristic detectors don't tend to support Hebrew encodings. This suggests that either content is generally labeled or there's one dominant encoding (which one? Windows-1255?), since developing heuristic detection wasn't necessary to break into the Hebrew browsing market.

How bad is breakage if a non-Hebrew encoding default is in effect and the user browses the Hebrew part of the Web?

> I suppose many other non-Latin languages.

There are also Latin non-Windows-1252 encodings, but it doesn't automatically follow that there's a serious legacy of unlabeled content in every legacy encoding. (Serious meaning: Users would reject a browser that didn't allow them to set a locale-specific last-resort encoding or that didn't tie a locale-specific last-resort encoding to the UI language.)

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Monday, 1 June 2009 08:54:35 GMT
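
To make the breakage question above concrete, here is a minimal Python sketch of what happens when unlabeled Windows-1255 bytes are decoded with a Windows-1252 last-resort default. The function name and the sample word are assumptions chosen for illustration, and this models only the byte-to-character mismatch, not any particular browser's behavior.

```python
# Illustrative sketch: decode Hebrew bytes with the wrong last-resort encoding.
# The helper name and sample text are hypothetical, not taken from any browser.

def decode_with_wrong_default(text: str,
                              actual: str = "cp1255",
                              assumed: str = "cp1252") -> str:
    """Encode text as the page really is, then decode as the default assumes."""
    raw = text.encode(actual)
    # errors="replace" keeps going on bytes the assumed encoding leaves undefined.
    return raw.decode(assumed, errors="replace")

# The Hebrew word "shalom" (shin, lamed, vav, final mem), written with escapes
# so the source file stays ASCII-only.
sample = "\u05e9\u05dc\u05d5\u05dd"

garbled = decode_with_wrong_default(sample)
# garbled is now four Latin characters: u-grave, i-grave, a-ring, i-acute.

# The bytes themselves are intact, so re-decoding with the right encoding
# recovers the original text.
recovered = garbled.encode("cp1252").decode("cp1255")
print(recovered == sample)
```

In this sketch the text is unreadable under the wrong default (the Hebrew letters come out as characters from the upper Latin range, mostly accented letters), but nothing is lost at the byte level, which is why a user-settable or locale-tied last-resort encoding is enough to fix it.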