- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Fri, 26 Oct 2007 15:21:18 +0900
- To: Najib Tounsi <ntounsi@emi.ac.ma>
- Cc: Daniel Dardailler <danield@w3.org>, "'WWW International'" <www-international@w3.org>, W3C Offices <w3c-office-pr@w3.org>, public-i18n-core@w3.org
At 08:22 07/10/26, Najib Tounsi wrote:
>Hello Martin,
>
>Few more comments.
>
>Martin Duerst wrote:
>> ... Scaling issues are definitely difficult to predict, but for that,
>> most kinds of testing don't help. And adding IDN TLDs shouldn't be
>> done in a way that suddenly increases the number of TLDs by an order
>> of magnitude. Adding just the IDN ccTLDs in what I call the first
>> stage below will probably less than double the overall number of
>> ccTLDs.
>
>+1. for this by-steps approach.
>
>> ... So I think these ccTLDs should be staged, first creating those
>> for scripts that are widely used in a particular country (as
>> criteria, things such as "does that script appear on the country's
>> coins or banknotes", "is that script used in official publications"
>> and so on can be used). The second stage may include minorities (e.g.
>> Arabic for France, Punjabi for the UK, maybe Tamil for Switzerland
>> and so on). The third stage would address tourists, and the fourth
>> stage, if ever, could try to reach full coverage.
>
>If I understand you well, for France as an example, TLDs would be a small set like: {.fr (of course), .FRANSA (France name written in Arabic script), .AlphaBetaGamma (in Greek script) ...}, all equivalent to the original .fr.

In the first stage, only .fr would be needed. In the second stage, an Arabic script equivalent could be added. I don't know how many Greeks there are in France, so I don't know which stage to put Greek in.

As for 'equivalent', I think it's up to the internet community and authorities in each country to decide exactly how to handle that. One solution is to register and serve exactly the same second-level domains in all of a country's ccTLDs, another is to strictly separate them by script, and so on.

Also, I would clearly advise that the Arabic equivalent also be a short code, rather than a full word.

>And for Morocco {.ma (French is de-facto 2nd official language), .ALMAGHREB (Morocco name written in Arabic script), .XYZ (in Tifinagh), .BetaGammaAlpha (for Greek) ...}.

Yes. There is no need for a justification for .ma, it already exists, and of course shouldn't disappear. Arabic clearly belongs in the first stage. Again, I think a two-letter abbreviation would be better. There is already a standard (ASMO?) for two-letter country codes for Arabic covering the countries using the Arabic script.

Tifinagh probably also belongs in the first stage, or maybe the second. The issue here is to not be too strict about the stages, it's more a question of how quickly reasonable proposals can be made. It may be easier for the few countries where Tifinagh is actually used to come up with a proposal for ccTLDs for each, whereas Arabic may take more time, or it may be that Arabic can use the above-mentioned standard and be faster.

As for Greek, I have no idea how many people actually living in Morocco use Greek, my guess is that this would be third or fourth stage.

>Me, as an Arabic speaker, I'll be "happy" to type .FRANCA in Arabic for my preferred YAHOO.FRANCA site. Luckily, my keyboard is bilingual.
>But note that there is a sub-problem here. Maybe Morocco will have the advantage :-) to gain the full name (vs. a short code) .ALMAGHREB as its brand new TLD, whereas France will keep their .fr, and might ask for the same advantage to create a .france TLD.

That's one of the reasons (but not the only one) why I advocate using short codes (mostly two-letter; in the case of Han ideographs, one letter would be enough, and probably also one syllable for Hangul and potentially for Ethiopic).

>> ... Please note that above, we are always speaking about script, not
>> language.
>
>BTW, I've noted that ICANN actually tests Arabic + Persian TLDs. Both are languages based on the same Arabic script. One test is redundant?

I don't think these are redundant.
First, please note that somebody was careful enough to make sure that the top-level domain actually is the same in both cases! Second, I think one of the ideas is to use a Wiki at these domains, in which case the language starts to become important. Third, it's much better to do tests in parallel than to later have somebody claim that there haven't been any tests in Persian yet,...

>> ...
>> I have read John's various documents on this topic, have listened to
>> talks from him, and have discussed things directly with him. He
>> raises a few good points, but many times makes an elephant out of a
>> mouse, or lets readers think about elephants when they should think
>> about mice.
>
>I didn't know about this author before, and I just read some of his writings about IDNs. If I understand his approach, he suggests to deal with IDNs at the level of user's Interface. The idea is to map TLDs locally, that is, if a TLD is in local characters, it is translated to the standard ASCII form, using a translation table where names in local form are kept with their corresponding ascii form .fr, .uk, .com etc.

This is one of the proposals that John Klensin has made, but far from the only one. I have made a similar proposal for scheme names, so I can't claim that his proposal is totally without merit, but I think it's not appropriate for TLDs.

>I remember in a recent mail by Sarmad Hussain (see the thread http://lists.w3.org/Archives/Public/public-i18n-core/2007AprJun/0181.html) that this approach is also followed locally for Urdu language. From the above thread, it appears that this is not recommended. There is at least an interoperability problem. You type your native IDN in your computer, your local browser can map it because it has been set locally to do so (e.g. by a plug-in). You copy-paste and send me this same IDN. My browser can't recognize it.

Yes.
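The interoperability problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the alias table, the function names map_locally and resolvable, the host name, and the Arabic-script label (the Arabic name of Morocco) are all hypothetical, and resolvable is merely a stand-in for a real DNS lookup that treats all-ASCII names as the only ones present in the DNS.

```python
# Hypothetical Arabic-script TLD label: the Arabic name of Morocco,
# written with escapes to avoid bidirectional display issues in source.
ARABIC_MOROCCO = "\u0627\u0644\u0645\u063a\u0631\u0628"

# Hypothetical local alias table (e.g. installed by a browser plug-in):
# native-script TLD -> existing ASCII ccTLD.
LOCAL_TLD_ALIASES = {ARABIC_MOROCCO: "ma"}


def map_locally(hostname, aliases):
    """Rewrite the last label if the local table knows it, else leave it."""
    head, sep, tld = hostname.rpartition(".")
    return head + sep + aliases.get(tld, tld)


def resolvable(hostname):
    """Stand-in for a DNS lookup (assumption): only all-ASCII names exist."""
    return hostname.isascii()


# The name as typed by the sender and later copy-pasted to the receiver.
name = "example." + ARABIC_MOROCCO

sender_view = map_locally(name, LOCAL_TLD_ALIASES)  # sender has the table
receiver_view = map_locally(name, {})               # receiver does not

print(sender_view, resolvable(sender_view))
print(receiver_view, resolvable(receiver_view))
```

The sender's copy resolves only because the rewrite happened on the sender's machine; the pasted name reaches the receiver unmapped, and nothing on the receiver's side knows what to do with it.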
If you make it an interface issue, you have to keep it strictly local, which is difficult.

>I completely agree with this kind of opinion, optimistic in nature.
>However, in I18N, it is not always symmetric to talk about countries. With globalization, I don't know if the "99.9% vs. 0.1%" rule will still hold. It is well established that in developing countries, to speak an occidental (and thus a Latin) language is an asset. But this is another question.

Of course we don't want to prohibit the use of Latin script. And speaking more than one language is definitely an asset (actually, in many, many countries and regions around the world, speaking two or more languages, whether two local languages or a local language and an 'occidental' language, is the norm rather than the exception).

Regards, Martin.

#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
Received on Friday, 26 October 2007 08:41:01 UTC