Re: 8 bit characters in DNS names (and URNs?)

Larry Masinter (masinter@parc.xerox.com)
Wed, 6 Mar 1996 21:12:56 PST


To: keld@dkuug.dk
Cc: martin@terena.nl, wg-i18n@terena.nl, uri@bunyip.com
In-Reply-To: Keld Jørn Simonsen's message of Tue, 5 Mar 1996 08:32:40 -0800 <199603051632.RAA27148@dkuug.dk>
Subject: Re: 8 bit characters in DNS names (and URNs?)
From: Larry Masinter <masinter@parc.xerox.com>
Message-Id: <96Mar6.211258pst.168963@nebula.parc.xerox.com>
Date: Wed, 6 Mar 1996 21:12:56 PST

While in ASCII you can define 'case-independent match' as
'translate to upper case and then test for string equality',
this does not work for other character repertoires; e.g., JIS might
have separate codes for the single-width and double-width forms of a
character, yet want to treat them as equivalent for matching.
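
To make the failure concrete, here is a minimal sketch of the
'upper-case, then compare' rule. The function name is hypothetical, and
Unicode fullwidth Latin letters are used as a stand-in for the
double-width codes of a JIS repertoire; that substitution is an
assumption made purely for illustration.

```python
def ascii_caseless_equal(a: str, b: str) -> bool:
    """Translate ASCII letters to upper case, then test string equality."""
    def upper_ascii(s: str) -> str:
        # Only a-z are mapped; every other code is left untouched.
        return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)
    return upper_ascii(a) == upper_ascii(b)


narrow = "abc"               # ordinary single-width letters
wide = "\uff41\uff42\uff43"  # fullwidth a, b, c (assumed stand-in for double-width codes)

print(ascii_caseless_equal("Xerox", "XEROX"))  # the ASCII case the rule was designed for
print(ascii_caseless_equal(narrow, wide))      # width variants are not unified by this rule
```

The second comparison fails because upper-casing never touches the
width distinction, so the two spellings remain different code sequences.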

While uppercase mapping is culturally sensitive, can we not define a
culturally independent 'character matching' algorithm that is good
enough for directory services? Perhaps that means treating accented and
unaccented versions of French initial capitals as equivalent, even though
this equivalence is not determined by 'canonicalization'?
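
One possible shape for such a matching rule, sketched under the
assumption that Unicode normalization is available: build a match key by
compatibility decomposition, drop combining marks, and fold case. The
names match_key and loose_match are hypothetical, and this is only one
way to realise the idea, not a proposal for what a directory service
should actually do.

```python
import unicodedata


def match_key(s: str) -> str:
    """Reduce a string to a loose, locale-independent key for matching."""
    # NFKD folds compatibility variants (e.g. fullwidth letters) and
    # separates accents from their base letters.
    decomposed = unicodedata.normalize("NFKD", s)
    # Drop the combining marks so accented and unaccented forms coincide.
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # casefold() is a locale-independent case folding.
    return stripped.casefold()


def loose_match(a: str, b: str) -> bool:
    """Compare two strings by their match keys."""
    return match_key(a) == match_key(b)


print(loose_match("\u00c9cole", "Ecole"))      # accented vs. unaccented initial capital
print(loose_match("\uff21\uff22\uff23", "abc"))  # fullwidth capitals vs. narrow lower case
```

The key is deliberately lossy: it is meant for deciding whether two
names match, not for storing or displaying them.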