Date: Fri, 2 May 1997 17:58:31 +0200 (MET DST)
From: "Martin J. Duerst" <firstname.lastname@example.org>
To: Larry Masinter <email@example.com>
cc: URI mailing list <firstname.lastname@example.org>
Subject: Re: "Difficult Characters" draft
In-Reply-To: <3369AC9E.281F@parc.xerox.com>
Message-ID: <Pine.SUN.3.96.970502175231.245j-100000@enoshima>

On Fri, 2 May 1997, Larry Masinter wrote:

> Other issues:

> The bidi issues for RLT languages in conjunction with
> normal punctuation used in and around identifiers. (Will
> the identifiers present themselves 'correctly' without
> these characters in all cases?)

That is an important problem, but it should go into a separate
draft, because it is basically about display, not about input.

> Using UCS in identifiers that are normally "case insensitive"
> in ASCII, and the issues, e.g., similar upper-case forms,
> the role of accents and equivalence.

By "the role of accents", do you mean the French case, where
accents may be removed on uppercasing?

> I think "white space" or spacing characters in general
> need to be addressed.

Yes, definitely. They all need to be prohibited.

> You need to decide whether you're doing canonicalization/normalization
> or just equivalence.

I have already decided, with the normalization algorithms in the
draft. But I guess I need to state it more clearly.

> Equivalence is probably easier to define,
> and less politically sensitive, even though not as useful.

I think equivalence is not useful, because it puts the burden on
software that otherwise doesn't have any clue (and doesn't have to
have a clue) about internationalization. Normalization is politically
sensitive, but we either get something working or something useless.

Regards, Martin.