RE: Feedback: Using non-ASCII characters in Web addresses

Hi Najib,

See notes below...

> -----Original Message-----
> From: public-i18n-geo-request@w3.org 
> [mailto:public-i18n-geo-request@w3.org] On Behalf Of Najib Tounsi
> Sent: 09 September 2004 14:52
> To: Deborah Cawkwell
> Cc: public-i18n-geo@w3.org
> Subject: Re: Feedback: Using non-ASCII characters in Web addresses
> 
> 
> Just about ASCII CHARACTERS.
> 
> It may be worth to specify what do the expressions "ASCII CHARACTERS" 
> and "NON-ASCII CHARACTERS" cover?

The article uses "ASCII" a little loosely, since, as mentioned at the beginning of the article, the specifications for URIs and for domain names each allow slightly different sets of characters.


> With the usage, ASCII may refer to the US-ASCII CHARACTER 
> [00..7F] (only
> 7bits) or the PC-ASCII-CHARACTER (accent extension (the whole 
> 8 bits)).

The 8-bit set is usually referred to as ANSI, rather than ASCII.  Strictly, ASCII refers only to the 7-bit encoding [00..7F].  "ANSI" is the name commonly (if loosely) used on Windows for the 8-bit code pages that add accented characters.  Latin1 is another name for the 8-bit ISO encoding ISO-8859-1.
 
> I speak as a french language and thus an AZERTY keyboard user.
> Example:
> é (é) is coded
> -   'E9' in western-ISO-8859-1 (PC-ASCII extension)
> -   'C3 A9' in UTF-8
> Which one is NON-ASCII  'E9', 'C3 A9' or both ?

Both.
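
To make that concrete, here is a quick sketch in Python (my own illustration, not something from the article; the helper name is_ascii is made up for this example).  It treats a byte sequence as ASCII only if every byte falls in the 7-bit range [00..7F], and checks both encodings of é from your example:

```python
def is_ascii(data: bytes) -> bool:
    """Return True only if every byte is in the 7-bit range 0x00..0x7F."""
    return all(b <= 0x7F for b in data)


e_acute = "\u00e9"  # é, LATIN SMALL LETTER E WITH ACUTE

latin1_bytes = e_acute.encode("iso-8859-1")  # single byte E9
utf8_bytes = e_acute.encode("utf-8")         # two bytes C3 A9

print(latin1_bytes.hex(), is_ascii(latin1_bytes))  # e9 False
print(utf8_bytes.hex(), is_ascii(utf8_bytes))      # c3a9 False
print(is_ascii(b"example.org"))                    # True
```

Every byte involved (E9, C3 and A9) is above 7F, so neither form is ASCII, whichever encoding produced it.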

hth
RI

Received on Tuesday, 21 September 2004 17:50:40 UTC