- From: Najib Tounsi <ntounsi@emi.ac.ma>
- Date: Thu, 25 Oct 2007 23:22:07 +0000
- To: Martin Duerst <duerst@it.aoyama.ac.jp>
- CC: Daniel Dardailler <danield@w3.org>, 'WWW International' <www-international@w3.org>, W3C Offices <w3c-office-pr@w3.org>, public-i18n-core@w3.org
- Message-ID: <4721251F.3030109@emi.ac.ma>
Hello Martin,

A few more comments.

Martin Duerst wrote:

> ... Scaling issues are definitely difficult to predict, but for that,
> most kinds of testing doesn't help. And adding IDN TLDs shouldn't be
> done in a way that suddenly increases the number of TLDs by an order
> of magnitude. Adding just the IDN ccTLDs in what I call the first
> stage below will probably less than double the overall number of
> ccTLDs.

+1 for this step-by-step approach.

> ... So I think these ccTLDs should be staged, first creating those
> for scripts that are widely used in a particular country (as
> criteria, things such as "does that script appear on the country's
> coins or banknotes", "is that script used in official publications"
> and so on can be used). The second stage may include minorities (e.g.
> Arabic for France, Punjabi for the UK, maybe Tamil for Switzerland
> and so on). The third stage would address tourists, and the fourth
> stage, if ever, could try to reach full coverage.

If I understand you correctly, for France as an example, the TLDs would be a small set like {.fr (of course), .FRANSA (the name of France written in Arabic script), .AlphaBetaGamma (in Greek script), ...}, all equivalent to the original .fr. And for Morocco: {.ma (French is the de facto second official language), .ALMAGHREB (the name of Morocco written in Arabic script), .XYZ (in Tifinagh), .BetaGammaAlpha (for Greek), ...}.

As an Arabic speaker, I'll be "happy" to type .FRANSA in Arabic for my preferred YAHOO.FRANSA site. Luckily, my keyboard is bilingual.

But note that there is a sub-problem here. Morocco may have the advantage :-) of gaining its full name (vs. a short code), .ALMAGHREB, since it is a brand-new TLD, whereas France will keep its .fr and might ask for the same advantage by creating a .france TLD.

> ... Please note that above, we are always speaking about script, not
> language.

BTW, I've noted that ICANN is actually testing Arabic and Persian TLDs. Both languages are written in the same Arabic script. Is one of the tests redundant?
> ...
> I have read John's various documents on this topic, have listened to
> talks from him, and have discussed things directly with him. He
> raises a few good points, but many times makes an elephant out of a
> mouse, or lets readers think about elephants when they should think
> about mice.

I didn't know about this author before, and I have just read some of his writings about IDNs. If I understand his approach, he suggests dealing with IDNs at the level of the user interface. The idea is to map TLDs locally: if a TLD is written in local characters, it is translated to the standard ASCII form using a translation table that pairs each local-form name with its corresponding ASCII form (.fr, .uk, .com, etc.).

I remember from a recent mail by Sarmad Hussain (see the thread http://lists.w3.org/Archives/Public/public-i18n-core/2007AprJun/0181.html) that this approach is also followed locally for the Urdu language. From that thread, it appears that this is not recommended. There is at least an interoperability problem. You type your native IDN on your computer, and your local browser can map it because it has been set up locally to do so (e.g. by a plug-in). You copy-paste the same IDN and send it to me. My browser can't recognize it.

> ... Well, licence plates are a good example actually. In Japan, they
> use Kanji and Hiragana. In Germany, they use Umlauts. In many Arabic
> countries, they use Arabic letters and numerals. Najib can tell us
> what Morocco does.

Well, years ago, there were numbers like "1363-1 4" (see http://www.worldlicenseplates.com/). Now Arabic letters are used, as in "12345-X 9", where X is actually the Arabic Beh 'ب', which comes after Alef 'أ'.

As an aside, this didn't make everybody happy. At the vehicle administration and insurance companies, computer applications are not yet ready to accept non-Latin scripts. I've been told that, as a temporary solution, Alef is replaced by A and Beh by B.

> ...
> There is also nothing prohibiting somebody from serving a domain
> both with one (or several) IDNs as well as with a US-ASCII-only
> domain name.

I guess that this will be a very common practice.

> See above. The usability benefits for the local population, including
> the local police, come first. If 99.9% of the potential users can
> remember or note down a car licence number faster because it uses the
> native script, but 0.1% of potential users don't manage to remember
> it or note it down, then that's a net average gain.

I completely agree with this kind of opinion, which is optimistic in nature. However, in I18N, countries cannot always be treated symmetrically. With globalization, I don't know whether the "99.9% vs. 0.1%" rule will still hold. It is well established that in developing countries, speaking a Western (and thus Latin-script) language is an asset. But this is another question.

Regards,
Najib
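
For illustration only, the local-mapping approach discussed above (a translation table on the user's machine that turns a native-script TLD into its ASCII form) and its copy-paste interoperability problem could be sketched roughly as follows. Everything here is an assumption made for the example: the table entries, the function name `map_tld_locally`, and the host name are hypothetical, and no such TLDs are assumed to exist in the DNS.

```python
# Hypothetical local translation table: native-script TLD -> ASCII TLD.
# The entries are illustrative assumptions, not real registered TLDs.
LOCAL_TLD_TABLE = {
    "فرنسا": "fr",   # "France" written in Arabic script
    "المغرب": "ma",  # "Morocco" written in Arabic script
}


def map_tld_locally(hostname, table):
    """Rewrite the last label of hostname to ASCII if the local table knows it.

    A name whose TLD is not in the table is returned unchanged, which is
    what a client without the table (or plug-in) would hand to the resolver.
    """
    head, dot, tld = hostname.rpartition(".")
    ascii_tld = table.get(tld)
    if ascii_tld is None:
        return hostname
    return head + dot + ascii_tld


name = "yahoo.فرنسا"

# The sender's browser has the table installed, so the name is mapped.
sender_view = map_tld_locally(name, LOCAL_TLD_TABLE)

# The receiver's browser has no table: the pasted name stays unmapped.
receiver_view = map_tld_locally(name, {})

print(sender_view)
print(receiver_view)
```

The sender ends up with an ordinary ASCII name, while the receiver is left with a TLD that nothing outside the sender's machine knows about. That is the interoperability gap: the mapping lives in one client's configuration rather than in the DNS itself.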
Received on Thursday, 25 October 2007 23:22:36 UTC