- From: Jim Amsden <jamsden@us.ibm.com>
- Date: Mon, 16 Apr 2001 10:33:04 -0400
- To: ietf-dav-versioning@w3.org, ned@innosoft.com, paf@cicso.com
- Message-ID: <OF6CBB3475.DEC46586-ON85256A30.004CC3C6@raleigh.ibm.com>
Patrik,

I took a look at the document referenced below (http://www.ietf.org/internet-drafts/draft-ietf-idn-nameprep-03.txt), and it definitely indicates the issues associated with internationalizing strings. Here's a brief summary:

The steps for preparing IDN domain names are:

1. Input the domain name through some mechanism (outside the scope of the document)
2. Do transformations on the input domain name for case folding, hyphen mapping, etc.
3. Normalize the result (see http://www.unicode.org/unicode/reports/tr15/tr15-21.html)
4. Check for prohibited characters and return an error if any are found. Most of these characters are things that cannot be displayed properly.
5. Resolve the resultant domain name

Many of the transformations applied above are specific to domain name semantics. Labels in DeltaV have no associated semantics. They are only keys, to be used by the user and client applications any way they want. The server applies no semantics to labels other than that they are unique among the revisions of the same versioned resource. Uniqueness is determined strictly by byte-by-byte comparison, not by display names. Revisions have a separate display name property for that purpose. There are no language transformations done on labels either. We make no attempt to process them in any way.

Now if labels are used for display purposes, which many clients will certainly do, we may run into problems with normalization, since it is possible that different labels could be rendered the same, or that different clients could take the same rendering and submit different label headers. For maximum flexibility, we would like to leave such concerns to the client applications that are providing the label semantics rather than doing something in the protocol that might not meet all client needs. This is different from preparing internationalized domain names, because there the semantics are being specified.
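To make the normalization concern concrete, here is a minimal Python sketch. It is only an illustration, not anything from the DeltaV draft: the function names are hypothetical, and the choice of NFC is an assumption made for the example (nameprep specifies its own mapping and normalization steps). It shows two labels that would normally render identically but fail a byte-by-byte comparison of their UTF-8 encodings, and how normalizing before comparing changes the result.

```python
import unicodedata


def labels_equal_bytewise(a: str, b: str) -> bool:
    """Compare two labels strictly by their UTF-8 bytes (hypothetical helper)."""
    return a.encode("utf-8") == b.encode("utf-8")


def labels_equal_normalized(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two labels after Unicode normalization (hypothetical helper).

    NFC is an assumption for this sketch; a protocol would have to pick
    and document one form.
    """
    na = unicodedata.normalize(form, a)
    nb = unicodedata.normalize(form, b)
    return na.encode("utf-8") == nb.encode("utf-8")


# The same visible text, entered two different ways:
composed = "caf\u00e9"      # 'e with acute' as one precomposed code point
decomposed = "cafe\u0301"   # plain 'e' followed by a combining acute accent

print(labels_equal_bytewise(composed, decomposed))    # False
print(labels_equal_normalized(composed, decomposed))  # True

# Normalization alone does not fold case, so these stay distinct either way.
print(labels_equal_normalized("Release", "release"))  # False
```

With byte-by-byte comparison a server would treat the two spellings as distinct labels, so the question of which one a given client submits is left entirely to the client.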
A number of members of the DeltaV working group (myself included) have expressed a desire to keep the label header. I think we're tentatively OK on using UTF-8 encoded strings, but I would welcome any further discussion on the topic to be sure we're not going to regret the decision. If simple byte-by-byte comparisons are not sufficient, we could attempt to solve the problem through normalization and mapping rather than simply removing the label header.

Patrik Fältström <paf@cisco.com>
04/10/2001 04:02 PM

To: Jim Amsden <jamsden@us.ibm.com>, ietf-dav-versioning@w3.org, ned@innosoft.com
cc:
Subject: RE: label header (was: Re: Versioning TeleConf Agenda, 4/6/01 (Friday) 12-1pm EST)

--On 01-04-10 13.21 -0400 Jim Amsden <jamsden@us.ibm.com> wrote:

> Perhaps we could get some clarification from the Application area
> directors if this is something we need to address or not? I'd prefer to
> leave the label header in the spec as it seems that selecting a version
> by label is simple and reasonable functionality that is common practice,
> and has been in the spec for quite some time. However if it leads down
> unnecessary rat-holes, I could be convinced to delete it. Ned? Patrik?

IF you do comparisons of any kind, you MUST define how a match is to be defined.

When comparing Unicode characters, my take is that you have to do things like the IDN wg is talking about. See for example draft-ietf-idn-nameprep-03.txt which talks about the operations you have to do on two strings _before_ you do the actual comparison.

So, comparing human readable text is hard, very hard, but possible.

paf
Received on Monday, 16 April 2001 10:34:40 UTC