- From: Jeremy Carroll <jjc@hpl.hp.com>
- Date: Tue, 31 Jul 2007 12:29:06 +0100
- To: Richard Ishida <ishida@w3.org>
- CC: www-international@w3.org, public-iri@w3.org, "'Sarmad Hussain'" <sarmad.hussain@nu.edu.pk>
In principle, I'm strongly opposed to 'plug-ins' for IRI processing (and hence IDN processing). However, I can see the argument for TLDs more easily than for Characters.

===

A plug-in approach is likely to address one or two leading Web browsers, but would be very unlikely to also address less core applications that also use IRIs (e.g. any Semantic Web software).

I do not wish to be close-minded, and if a plug-in is an appropriate medium-term measure then so be it, but alarm bells start ringing when this is billed as a language-specific process, rather than one that generalises.

===

On TLDs - I think this looks like a legitimate user need, and any solution involves a look-up table that maps non-English TLDs to English ones, e.g. from column C to A on the spreadsheet, but potentially multiplied many fold. In some sense, the English one ends up as the canonical form.

The sort of application-end solution with which I would be happiest is one that is easy for me to support in my software too, e.g.

A) There is a specified Web site which has the mappings, which can be accessed both one at a time and downloaded all at once. Ideally there should be some process for adding new mappings and new languages.

B) The 'plug-in' would then have its own copy of the mapping table, which would be refreshed from the Web from time to time, and would do a very simple replacement on the TLD.

C) This is easy to code up, so that other pieces of software that wish to support these mappings can do so. To be generally usable, an RFC or similar would be needed.

I would have a strong preference for this to be seen as part of the ToASCII operation in IDN processing (e.g. strip off the TLD, and if it is in the lookup table, replace it), so that it is clear that it is done as late as possible, during retrieval, etc. The canonical form is then the one used during retrieval, i.e. the URL, which for better or worse is ASCII.
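To make the "strip off TLD, and if it is in the lookup-table, replace it" step concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the table entries are invented placeholders (they are not taken from the spreadsheet, and "example" is not a real TLD mapping), the function names are hypothetical, and the ToASCII step is approximated with Python's built-in `idna` codec, which implements IDNA 2003.

```python
from typing import Mapping

# Hypothetical local copy of the mapping table (non-English TLD -> ASCII TLD).
# These entries are placeholders for illustration only; a real table would be
# refreshed from the agreed Web site from time to time.
TLD_MAP = {
    "пример": "example",
    "例": "example",
}


def map_tld(hostname: str, table: Mapping[str, str]) -> str:
    """Replace the final label of hostname if it appears in the table."""
    head, sep, tld = hostname.rpartition(".")
    replacement = table.get(tld.lower())
    if replacement is None:
        # Unknown TLD (or trailing dot): leave the name untouched.
        return hostname
    return head + sep + replacement


def to_ascii_with_tld_mapping(hostname: str,
                              table: Mapping[str, str] = TLD_MAP) -> str:
    """Apply the TLD replacement, then convert the remaining labels to ASCII.

    The replacement happens immediately before the ASCII conversion, i.e. as
    late as possible, at the point where the name is about to be used for
    retrieval.
    """
    mapped = map_tld(hostname, table)
    return mapped.encode("idna").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_with_tld_mapping("bücher.пример"))
```

Because the replacement is a plain dictionary lookup on the last label, any software that can obtain the same table should be able to reproduce the behaviour without a browser plug-in.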
Jeremy

--
Hewlett-Packard Limited
registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Tuesday, 31 July 2007 11:29:42 UTC