RE: Feedback: Using non-ASCII characters in Web addresses from Richard Ishida on 2004-09-21 (public-i18n-geo@w3.org from September 2004)

From: Richard Ishida <ishida@w3.org>
Date: Tue, 21 Sep 2004 19:00:15 +0100
To: "'Deborah Cawkwell'" <deborah.cawkwell@bbc.co.uk>, <public-i18n-geo@w3.org>
Message-Id: <20040921180015.77D294F26B@homer.w3.org>

Hi Deborah,

Thanks for this. See inline...


============
Richard Ishida
W3C

contact info:
http://www.w3.org/People/Ishida/ 

W3C Internationalization:
http://www.w3.org/International/ 

Publication blog:
http://people.w3.org/rishida/blog/
 
 

> -----Original Message-----
> From: public-i18n-geo-request@w3.org 
> [mailto:public-i18n-geo-request@w3.org] On Behalf Of Deborah Cawkwell
> Sent: 01 September 2004 14:47
> To: public-i18n-geo@w3.org
> Subject: Feedback: Using non-ASCII characters in Web addresses
> 
> USING NON-ASCII CHARACTERS IN WEB ADDRESSES/AN INTRODUCTION 
> TO MULTILINGUAL WEB ADDRESSES (ROUGH DRAFT !) 
> http://www.w3.org/International/articles/idn-and-iri/
> ------------------
> (First) 'Step by step example' & 'Overview' sections 
> duplicate a bit. What I want to know is:

I rewrote these. Should be better.


> 1) why
> 2) how it works technically
> 3) relationship to URI
> 4) does it work at all points, eg, UA, domain reg, etc
> ------------------
> Could be stronger & more direct suggestion to register two names:
> "In practise, it would make sense to register two names for 
> your domain. One in your native script, and one using just 
> the regular Latin characters. The latter will be more 
> memorable and easier to type for people who do not read and 
> write your language. For example, as a minimum, you could 
> additionally register a transcription of the Japanese in 
> Latin script, such as the following:"

Not sure how to make that more direct.


> ------------------
> .jp is lower case to start with.
> "Note how the ASCII characters 'JP' are lowercased, but 
> otherwise just passed through."

This was referring to the JP at the beginning of the domain name. I made that clearer.


> ------------------
> Which version of IE? 
> IE 5.0, 5.5, & 6.0 according to download page 

Added.


> (http://www.idnnow.com/index.jsp) "The conversion process was 
> already supported natively in Mozilla 1.4 / Netscape 7.1, and 
> Opera 7.2. It works in Internet Explorer if you download a 
> plug-in (for example, this one)."
> Worked for me with IE 6.0
> ------------------
> I think 'Additional problems' section would sit better in a 
> technical how-it-works section, saying that by escaping 
> non-ASCII characters can be represented without IRIs, but 
> that this is dependent on the encoding in the file system, 
> ie, in the example case, Shift-JIS or UTF-8.
> The first line of the current 'Additional problems' section, ie:
> "An IRI is defined as a sequence of characters, not bytes - 
> so the fact that the IRI might be represented in documents or 
> protocols using different encodings is irrelevant."
> Does not go to the heart of one problem; it is the reason why 
> the escape solution can be a problem. The additional problem 
> being human readability and memorability. But I think it's 
> useful to include the statement that a URI & IRI is 
> represented as a sequence of characters, not as a sequence of octets. 
> What is the relationship between URI & IRI?


I hope this is all clearer in the new version. I moved things around a bit.
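
As a rough illustration of the encoding dependence Deborah describes, the sketch below percent-escapes the same path segment under Shift-JIS and under UTF-8. The segment is a hypothetical example chosen for this sketch, and Python's `urllib.parse.quote` stands in for whatever a user agent or server actually does.

```python
from urllib.parse import quote, unquote

# Hypothetical path segment (an assumption for illustration only).
segment = "引き出し"

# The same characters yield different escaped octets depending on which
# encoding is applied before percent-escaping.
escaped_sjis = quote(segment, encoding="shift_jis")
escaped_utf8 = quote(segment, encoding="utf-8")

print(escaped_sjis)
print(escaped_utf8)

# Decoding only recovers the original characters when the receiver
# assumes the same encoding that was used for escaping.
print(unquote(escaped_sjis, encoding="shift_jis") == segment)
print(unquote(escaped_utf8, encoding="utf-8") == segment)
```

The two escaped strings differ even though they stand for the same characters, which is why an escaped URI only resolves if both ends agree on the encoding, whereas an IRI, being a sequence of characters, does not carry that ambiguity.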

Cheers,
RI
Received on Tuesday, 21 September 2004 18:00:16 UTC