Re: IRIs in href (Was: Notes on validome test suite / validators comparison)

From: olivier Thereaux <ot@w3.org>
Date: Tue, 30 Oct 2007 16:33:07 -0400
Message-Id: <7C8D31D5-6394-4626-A143-9070C3182CC6@w3.org>
Cc: www-validator@w3.org
To: Frank Ellermann <nobody@xyzzy.claranet.de>

On Oct 25, 2007, at 03:39 , Frank Ellermann wrote:
> Users want that something happens when they
> click on a link, without upgrading their browser.  And native IRIs
> are designed to have an equivalent URI-form.

Lack of support for IRIs in legacy user agents is an issue, understood.
Now, if today the HTML 4.01 and XHTML 1.0 specs and above were  
updated to say "IRIs" instead of "URIs", what would you do?

As I wrote before, these specs were written before IRIs were a reality.
The HTML4 spec contains advice on how to treat "URIs containing
non-ASCII characters".
See http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.1
Although it clearly calls these illegal, it prepares the ground for
IRIs (which did not yet have that name at the time).
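
As a rough illustration of that B.2.1 advice (represent each non-ASCII character as UTF-8 bytes, then escape each byte as %HH), here is a minimal Python sketch. The function name escape_non_ascii and the example.org address are hypothetical, chosen only for this example. It covers just the escaping step described in the note and does not handle internationalized host names.

```python
def escape_non_ascii(href: str) -> str:
    """Percent-escape non-ASCII characters as UTF-8 bytes; leave ASCII untouched.

    Hypothetical helper illustrating the HTML 4.01 B.2.1 advice only.
    """
    out = []
    for ch in href:
        if ord(ch) < 128:
            # ASCII characters, including existing %HH escapes, pass through.
            out.append(ch)
        else:
            # Encode the character as UTF-8 and write each byte as %HH.
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical href value containing two non-ASCII characters.
    print(escape_non_ascii("http://example.org/r\u00e9sum\u00e9"))
```

A user agent that applies this step to an href value ends up with a plain ASCII URI, which is the sense in which an IRI has an equivalent URI form.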

Saying that IRIs should not be used because they break in legacy
software is an argument I have sympathy for, but have trouble
accepting. This reminds me of the situation in Japan, where one
still can't safely use Unicode in email, because so many MUAs and
webmail services just don't support it.

> Sooner or later validators will be fixed to validate URIs, what with
> all those "URI exploits" we've seen in the last weeks for XP after
> the installation of IE7.

This is irrelevant to the discussion about IRIs. Please don't use  
internationalization as a scapegoat for bad coding.

> I can still tell you the day when the W3C validator started to flag
> &#128; as invalid on a windows-1252 page. I was working on this
> page, it was stunning.

There once was a bug, and IIRC it was fixed in a few hours. Now, how  
is that relevant to the discussion at hand?

> I'm curious which expert propagates to violate specifications.  Want
> to know how long it took me to create an XHTML ersatz-DTD permitting
> IRIs everywhere ? 30 minutes.

Here you must be joking, bluffing, or mistaken, Frank. The current
XHTML DTD declares URIs as CDATA, and thus any SGML or XML
validator has to accept all the characters allowed in the document,
which includes all those usable in IRIs.

Received on Tuesday, 30 October 2007 20:33:11 UTC
