Re: 8 bit characters in DNS names (and URNs?)

Keld Jørn Simonsen (keld@dkuug.dk)
Tue, 5 Mar 1996 17:32:40 +0100


Message-Id: <199603051632.RAA27148@dkuug.dk>
From: keld@dkuug.dk (Keld Jørn Simonsen)
Date: Tue, 5 Mar 1996 17:32:40 +0100
In-Reply-To: martin@terena.nl (John Martin)
To: martin@terena.nl (John Martin), wg-i18n@terena.nl
Subject: Re: 8 bit characters in DNS names (and URNs?)
Cc: uri@bunyip.com

Alexander Dupuy writes a note on 8-bit DNS entries.
He states that the biggest problem is that DNS entries are
case insensitive, and that this is not well defined beyond ASCII.

I am the editor of an ISO standard in which we are defining
a format for cultural conventions, building on the POSIX
locales and charmaps. It will include a standard locale with
mapping tables between lower and upper case for the whole of 10646.
This locale will be freely available on the net, together with
charmaps for more than 100 coded character sets. Similar data is
already available, but it does not yet cover the whole of 10646.

Alexander also writes that the uppercase mapping is culturally sensitive.
This is correct, but the great majority of cultures
have the same toupper() specifications. In most cultures a
Latin small e with acute is capitalized into a capital E with acute.
Likewise, a small Greek omega is capitalized into a capital
Greek omega. The only exception I can think of is Turkish, where
<i without dot> has the uppercase <I>, and <i> is capitalized into <I with dot>.
Some say that in French capitalized accented letters are never used,
but according to official French sources that does not seem to be the rule.
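
To make the point concrete, here is a minimal sketch in Python of a
common uppercase mapping with the Turkish rule as the one special case.
The function name to_upper and its turkish flag are my own inventions
for illustration, and Python's built-in str.upper() stands in for the
standard locale's mapping table; it is not the locale format itself.

```python
def to_upper(text, turkish=False):
    """Uppercase text with the default mapping.

    With turkish=True (a hypothetical flag), apply the Turkish rule
    first: <i> becomes <I with dot> and <i without dot> becomes <I>.
    """
    if turkish:
        # U+0130 is LATIN CAPITAL LETTER I WITH DOT ABOVE,
        # U+0131 is LATIN SMALL LETTER DOTLESS I.
        text = text.replace("i", "\u0130").replace("\u0131", "I")
    # Everything else uses the common, culture-neutral mapping.
    return text.upper()


if __name__ == "__main__":
    for word in ("\u00e9", "\u03c9", "i"):
        print(repr(word), repr(to_upper(word)), repr(to_upper(word, turkish=True)))
```

Only the letter i differs between the two columns of output; the
accented e and the omega map the same way with or without the flag.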

I am confident that the uppercase mapping should not be a problem.
But I am not sure that we should do this just as an enhancement
to DNS. Anyway, one way to do it would be to say that
the entry should be in UTF-8, and we could define a new RR type for
this. URL resolution could then look there first and, if nothing is
found, look in the normal RRs. I am not sure it is the right time to
make such specifications, though.
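
As a rough sketch of that lookup order, in Python: the two tables
below are hypothetical stand-ins for a new UTF-8 RR type and for the
normal RRs, and the names, addresses and the resolve function are
made up for illustration. A real resolver would query name servers,
and the case folding here is simply Python's str.lower(), which is an
assumption and not a proposal for the actual matching rule.

```python
# Hypothetical record stores, keyed by the lowercased name as bytes.
UTF8_RECORDS = {"bl\u00e5b\u00e6rgr\u00f8d.example".encode("utf-8"): "192.0.2.10"}
NORMAL_RECORDS = {b"example.org": "192.0.2.20"}


def resolve(name):
    """Look in the UTF-8 records first, then fall back to the normal RRs."""
    folded = name.lower()

    # Step 1: try the (hypothetical) new RR type, with the name in UTF-8.
    hit = UTF8_RECORDS.get(folded.encode("utf-8"))
    if hit is not None:
        return hit

    # Step 2: fall back to the normal RRs, which only hold ASCII names.
    try:
        ascii_key = folded.encode("ascii")
    except UnicodeEncodeError:
        return None  # a non-ASCII name cannot exist in the normal RRs
    return NORMAL_RECORDS.get(ascii_key)


if __name__ == "__main__":
    print(resolve("BL\u00c5B\u00c6RGR\u00d8D.EXAMPLE"))
    print(resolve("Example.ORG"))
    print(resolve("ukendt.example"))
```

Existing ASCII names keep working unchanged, since they are never
found in the first table and fall through to the second.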

Keld