Re: 8 bit characters in DNS names (and URNs?)

Peter Paul Sint
Sat, 9 Mar 1996 02:34:10 +0100

Message-Id: <v02130505ad668ac95a8d@[]>
Date: Sat, 9 Mar 1996 02:34:10 +0100
To: (Larry Masinter)
From: (Peter Paul Sint)
Subject: Re: 8 bit characters in DNS names (and URNs?)

At 9:44 08.03.1996, Masataka Ohta wrote:
>> JIS might
>> have separate codes for single and double-wide codes yet want to treat
>> them equivalent for matching.
>JIS does not.
>> While uppercase mapping is culturally sensitive, can we not make a
>> culturally independent 'character matching' algorithm that is good
>> enough for directory services.
>Theoretically, it is a union of all the matching rules of all
>the culture. But, in practice, it is hard especially because
>the expected degree of matching differs service by service.
>                                               Masataka Ohta

German has a lower case letter ß (sharp s; =DF in quoted-printable Latin-1).
It looks like a beta; Swiss German doesn't use it.
It is equivalent to ss, capital SS (*two* letters).
Also, the canonical conversion of the
umlauts (vowel + two dots above):
ä (=E4)   is ae
ö (=F6)   is oe
ü (=FC)   is ue
capitalised: AE, OE, UE
(historically, the two dots were an e written above the vowel).

You would never write an umlaut A (Ä) as a plain A (only aliens do so, and software).

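The back transformation is not unique! An ss may stand for ß or for a
genuine double s, and ae, oe, ue may be an umlaut or two separate vowels.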

German matching software handles this (as far as possible).
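
To make the conversion rules above concrete, here is a minimal sketch in
Python of how such a matching key might be built. The names GERMAN_FOLD,
match_key and same_name are my own illustrations, not taken from any real
directory software, and the table covers only the letters discussed in this
message.

```python
# Hypothetical folding table for the conventions described in this message:
# umlauts expand to vowel + e, and sharp s expands to ss.
GERMAN_FOLD = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "AE", "Ö": "OE", "Ü": "UE",
    "ß": "ss",
}


def match_key(name: str) -> str:
    """Return a case-insensitive matching key for a German name."""
    # Expand each special letter first, then lowercase the result so that
    # "Straße" and "STRASSE" end up with the same key.
    folded = "".join(GERMAN_FOLD.get(ch, ch) for ch in name)
    return folded.lower()


def same_name(a: str, b: str) -> bool:
    """Treat two spellings as equal when their matching keys agree."""
    return match_key(a) == match_key(b)


if __name__ == "__main__":
    print(match_key("Müller"))               # key with the umlaut expanded
    print(same_name("Straße", "STRASSE"))    # sharp s versus capital SS
    print(same_name("Müller", "Muller"))     # plain u is a different name
    print(same_name("Maße", "Masse"))        # two different words, one key
```

The last comparison shows why the back transformation cannot be unique:
"Maße" (measures) and "Masse" (mass) fold to the same key, so the key alone
cannot tell you which spelling was meant.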

Peter Paul Sint
Research Unit for Socio-Economics, Austrian Academy of Sciences
Kegelgasse 27, A-1030 Wien (=Vienna), Austria.
Phone:(+431) 712 21 40 - 36   Fax: (+431) 712 21 40 - 34