- From: Peter Paul Sint <sint@oeaw.ac.at>
- Date: Sat, 9 Mar 1996 02:34:10 +0100
- To: masinter@parc.xerox.com (Larry Masinter)
- Cc: keld@dkuug.dk, martin@terena.nl, wg-i18n@terena.nl, uri@bunyip.com
At 9:44 08.03.1996, Masataka Ohta wrote:
>> JIS might
>> have separate codes for single and double-wide codes yet want to treat
>> them equivalent for matching.
>JIS does not.
>> While uppercase mapping is culturally sensitive, can we not make a
>> culturally independent 'character matching' algorithm that is good
>> enough for directory services.
>
>Theoretically, it is a union of all the matching rules of all
>the culture. But, in practice, it is hard especially because
>the expected degree of matching differs service by service.
>                                                Masataka Ohta

German has a lower case letter (looks like a beta -
/tell your software to read next line latin-1 quoted printable/
ß
Swiss German doesn't use it). Equivalent to ss, capital SS (*two* letters).

Also, the canonical conversion of the umlauts (vowel + two dots above)
ä is ae
ö is oe
ü is ue
capitalised AE OE UE
(historically the two dots were originally an e written above).

You would never write umlaut A as an A. (only aliens do so - and software).
The back transformation is not unique!
German matching software handles this (as far as possible).

Peter Paul Sint (sint@oeaw.ac.at, http://www.soe.oeaw.ac.at/~sint/)
Research Unit for Socio-Economics, Austrian Academy of Sciences
Kegelgasse 27, A-1030 Wien (=Vienna), Austria.
Phone:(+431) 712 21 40 - 36 Fax: (+431) 712 21 40 - 34
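
The German conventions described in this message (ß written as ss, umlauts written as ae/oe/ue, capitals SS/AE/OE/UE) can be sketched roughly as follows. This is only an illustration under stated assumptions: the names GERMAN_FOLD, german_upper and german_match_key are hypothetical, the table covers only the letters named above, and real German matching software may well handle more cases.

```python
import unicodedata

# Hypothetical folding table, built only from the rules named in the message:
# sharp s -> ss, umlaut vowel -> vowel + e.
GERMAN_FOLD = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "AE", "Ö": "OE", "Ü": "UE",
    "ß": "ss",
}


def _expand(text: str) -> str:
    # Compose "a" + combining diaeresis into "ä" first, so the table applies
    # whichever way the umlaut was encoded.
    text = unicodedata.normalize("NFC", text)
    return "".join(GERMAN_FOLD.get(ch, ch) for ch in text)


def german_upper(text: str) -> str:
    """Capitalise using the expansions above (ß becomes the two letters SS)."""
    return _expand(text).upper()


def german_match_key(text: str) -> str:
    """Case-insensitive key under which 'Müller' and 'Mueller' compare equal."""
    return _expand(text).lower()


print(german_upper("Straße"))    # STRASSE
print(german_upper("Müller"))    # MUELLER
print(german_match_key("Müller") == german_match_key("MUELLER"))  # True

# The back transformation is not unique: two different words share one key.
print(german_match_key("Maße") == german_match_key("Masse"))      # True
```

The last comparison shows why the mapping can only be used for matching and not reversed: from the key "masse" alone there is no way to tell whether the original was written with ß or with ss.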
Received on Friday, 8 March 1996 20:34:34 UTC