IDN-support for the validate by URI (IRI) feature

From: Martin Janecke <w3.org@kaor.in>
Date: Tue, 24 Aug 2010 00:00:51 +0200
Message-ID: <4C72EF93.2000004@kaor.in>
To: www-validator@w3.org
Can you please make http://validator.w3.org/ accept IDNs in its 
validate by ~URI~ field? ~URI~ should be replaced by IRI then, I assume.

--- Previous discussions about this issue ---

It seems IDN-support has been proposed already, e.g. in 2007:

However, it hasn't been implemented since then. Can you provide 
information on why it hasn't been implemented in case it was seriously 
considered but failed for a good reason?

--- Why should the validator accept IDNs? ---

As a user of an IDN for my personal website I constantly experience 
problems because up-to-date software is not able to handle them 
correctly. E.g. the current Thunderbird version can't handle IDNs in 
mail addresses at all, and the current Firefox version handles IDNs or 
their corresponding ACE strings incorrectly in some cases (e.g. it 
surprisingly fails to understand ACE strings in feed URLs in link tags).

This is frustrating. It still makes IDNs almost unusable for anything 
serious. I guess that three of the reasons for this situation, years 
after IDNs became available, are

1. an opaque encoding mechanism (the use of the particular generalized 
variable-length integer representation isn't exactly intuitive),
2. the fact that IDNs aren't valid values in HTML4 href-attributes yet, 
but yay for HTML5 if I understand the current draft correctly,
3. the fact that leading software vendors come from an environment where 
IDNs aren't of much interest, as ASCII covers almost all English words 
and names.
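
To illustrate the first point, here is a minimal sketch of what the 
encoding does to a single name. It assumes Python's built-in "punycode" 
and "idna" codecs (which follow the older IDNA 2003 rules) and uses the 
made-up domain bücher.example; the helper names are only illustrative.

```python
# Sketch only: relies on Python's built-in "punycode" and "idna" codecs
# (IDNA 2003 rules). The domain below is a made-up example.

def to_punycode(label: str) -> str:
    """Encode one label with raw Punycode (no "xn--" prefix)."""
    return label.encode("punycode").decode("ascii")

def to_ace(domain: str) -> str:
    """Convert a whole IDN to its ASCII-compatible (ACE) form."""
    return domain.encode("idna").decode("ascii")

def from_ace(ace: str) -> str:
    """Convert an ACE string back to the Unicode form."""
    return ace.encode("ascii").decode("idna")

print(to_punycode("bücher"))              # bcher-kva
print(to_ace("bücher.example"))           # xn--bcher-kva.example
print(from_ace("xn--bcher-kva.example"))  # bücher.example
```

The "kva" suffix encodes both which character was removed and where it 
belongs, which is why the ACE form is hard to read or write by hand.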

I am convinced that full support for IDNs by W3C tools like this great 
validator would

a. be an important sign that IDN support counts (and implied: other 
languages than English, other cultures count),
b. make it easier for IDN owners to use the tool,
c. be consistent, as the validator also validates HTML5 (experimental 
feature) where IDNs are allowed in href-attributes.

Martin Janecke
Received on Monday, 23 August 2010 22:02:06 UTC

