- From: Frank Ellermann <nobody@xyzzy.claranet.de>
- Date: Sat, 26 Apr 2008 16:08:54 +0200
- To: www-international@w3.org
Erik van der Poel wrote:

> Do you know of any user agents that process the IRI
> differently, depending on the XHTML 1 claim?

No, but for some months now I have used only two browsers, both belonging to the "popular" class. After the IE7-XP confusion in 2007 one validator catches broken URIs (including raw IRIs); for validator.w3.org it is still only a reported bug, AFAIK. I never did more than participate in public beta tests of this validator, so maybe it's fixed, and then I'd be curious how (with a DTD-based validator).

>> And "raw" UTF-8 IRIs are boring, popular browsers
>> get this right - "raw" IRIs in legacy charsets are
>> more interesting.

> I agree that those are more interesting. The major
> browsers are slowly converging on a set of conventions
> in this area.

FF2 hated the simple <ipath> in this case (covered by Martin's test suite), but got a KOI8-R <ihost> right. I didn't test BiDi scripts: I can't read them and would miss obvious bugs. Besides, the IDNAbis folks are about to fix various issues wrt BiDi.

> Host name: Content developers still use Punycode
> because MSIE 6 does not support IDNA.

There is a plugin for IE6, with a link offered on ICANN's IDN Wiki. I have not tested it so far. Actually it would be strange to use "raw" IRIs when

(1) this is invalid for the relevant (X)HTML versions, and therefore by definition misses the entry condition for many accessibility tests,
(2) it really isn't accessible with older browsers, and IE6 is by far not the oldest browser,
(3) IRIs are designed to have an equivalent URI,
(4) IRI producers can handle raw IRIs, therefore they can just as well "URIfy" them for URI consumers not supporting "raw" IRIs, and
(5) nobody bothered to specify "XHTML I18n", although it should be trivial.
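To illustrate what "URIfy" on the producer side could look like, here is a minimal sketch in Python. The function name iri_to_uri and the example host are made up for illustration. It assumes the IRI has already been decoded from the document charset (UTF-8, KOI8-R, whatever) into Unicode, maps the <ihost> with the standard library's IDNA 2003 codec, and percent-encodes everything else as UTF-8. It ignores userinfo and is not a complete RFC 3987 implementation.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left alone in each component; "%" is kept so that
# already escaped octets are not escaped a second time.
_PATH_SAFE = "/%:@!$&'()*+,;="
_QUERY_SAFE = "/?%:@!$&'()*+,;="


def iri_to_uri(iri: str) -> str:
    """Map a Unicode IRI to an ASCII-only URI (hypothetical helper)."""
    parts = urlsplit(iri)

    # <ihost>: Punycode labels via the stdlib IDNA (2003) codec.
    host = parts.hostname or ""
    netloc = host.encode("idna").decode("ascii") if host else ""
    if parts.port is not None:
        netloc += f":{parts.port}"
    # Simplification (assumption): any userinfo part is dropped.

    # <ipath>, <iquery>, <ifragment>: percent-encoded UTF-8.
    path = quote(parts.path, safe=_PATH_SAFE, encoding="utf-8")
    query = quote(parts.query, safe=_QUERY_SAFE, encoding="utf-8")
    fragment = quote(parts.fragment, safe=_QUERY_SAFE, encoding="utf-8")

    return urlunsplit((parts.scheme, netloc, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://bücher.example/straße?q=ü"))
```

For a raw IRI in a legacy charset the producer would first decode the bytes, for example raw_bytes.decode("koi8_r"), and then call the same function. A plain ASCII URI passes through unchanged.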
For old software, some browsers, limited devices, curl, wget, whatever, it will take years until all of this has ended up in a museum. Today it is still odd to expect that URI consumers, the weaker part, support raw IRIs with their Unicode 3.2 IDN punycode obstacles for <ihost>. Anything else in "raw" IRIs is straightforward, even in legacy charsets, but the <ihost> part isn't trivial.

> Path: Firefox has agreed to convert raw paths to
> escaped UTF-8, starting with Firefox 3.

It should. Getting the <ihost> right (in FF2) but not the <ipath> (for legacy charsets) must be a bug. A bit like "can do integrals, but can't do sums"... :-)

Frank
Received on Saturday, 26 April 2008 14:07:05 UTC