Re: 8 bit characters in DNS names (and URNs?)

At 9:44 08.03.1996, Masataka Ohta wrote:
>> JIS might
>> have separate codes for single and double-wide codes yet want to treat
>> them equivalent for matching.
>JIS does not.
>> While uppercase mapping is culturally sensitive, can we not make a
>> culturally independent 'character matching' algorithm that is good
>> enough for directory services.
>
>Theoretically, it is a union of all the matching rules of all
>the culture. But, in practice, it is hard especially because
>the expected degree of matching differs service by service.
>                                               Masataka Ohta

German has a lower-case letter, the sharp s (it looks like a beta;
tell your software to read the next line as Latin-1 quoted-printable):
ß
Swiss German does not use it.
It is equivalent to ss and is capitalised as SS (*two* letters).
Also, the canonical conversion of the
umlauts (vowel + two dots above) is:
ä   is ae
ö   is oe
ü   is ue
capitalised AE, OE, UE
(historically, the two dots were an e written above the vowel).

You would never write umlaut A as a plain A (only aliens do so, and software).

The back transformation is not unique!

German matching software handles this (as far as possible).
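
As a rough illustration of the folding described above, here is a minimal
Python sketch. The table GERMAN_FOLD and the functions fold_german and
matches are hypothetical names made up for this example, not taken from any
existing matching software, and real directory services may well apply
different or additional rules.

```python
# Hypothetical folding table for illustration only: umlauts and sharp s
# are replaced by the conventional two-letter spellings from the message.
GERMAN_FOLD = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "AE", "Ö": "OE", "Ü": "UE",
    "ß": "ss",
}


def fold_german(text):
    """Expand umlauts and sharp s to two letters, then lower-case the result."""
    expanded = "".join(GERMAN_FOLD.get(ch, ch) for ch in text)
    return expanded.lower()


def matches(a, b):
    """Treat two names as equal when their folded forms are identical."""
    return fold_german(a) == fold_german(b)
```

With this folding, "Müller" matches "Mueller" and "Straße" matches
"STRASSE", while "Müller" does not match "Muller". It also shows why the
back transformation is not unique: "Maße" and "Masse" both fold to "masse",
so the folded form alone cannot tell you which spelling was meant.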





Peter Paul Sint    (sint@oeaw.ac.at, http://www.soe.oeaw.ac.at/~sint/)
Research Unit for Socio-Economics, Austrian Academy of Sciences
Kegelgasse 27, A-1030 Wien (=Vienna), Austria.
Phone:(+431) 712 21 40 - 36   Fax: (+431) 712 21 40 - 34

Received on Friday, 8 March 1996 20:34:34 UTC