- From: Jim Amsden <jamsden@us.ibm.com>
- Date: Mon, 16 Apr 2001 10:33:04 -0400
- To: ietf-dav-versioning@w3.org, ned@innosoft.com, paf@cicso.com
- Message-ID: <OF6CBB3475.DEC46586-ON85256A30.004CC3C6@raleigh.ibm.com>
Patrik,
I took a look at the document referenced below
(http://www.ietf.org/internet-drafts/draft-ietf-idn-nameprep-03.txt), and
it clearly lays out the issues associated with internationalizing
strings. Here's a brief summary:
The steps for preparing IDN domain names are:
1. Input the domain name through some mechanism (outside the scope of the
document)
2. Do transformations on the input domain name for case folding, hyphen
mapping, etc.
3. Normalize the result (see
http://www.unicode.org/unicode/reports/tr15/tr15-21.html)
4. Check for prohibited characters and return an error if any are found.
Most of these characters are things that cannot be displayed properly.
5. Resolve the resultant domain name
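To make steps 2 through 4 concrete, here is a rough sketch in Python. It is
not the nameprep algorithm itself: the function names are made up, hyphen
mapping is left out, NFKC is assumed as the normalization form, and the
prohibited-character test is a stand-in for the tables the draft defines.

```python
import unicodedata


def is_prohibited(ch: str) -> bool:
    """Hypothetical stand-in for the draft's prohibited-character tables.

    Rejects whitespace plus control, unassigned, private-use and
    surrogate code points. The real tables differ.
    """
    return ch.isspace() or unicodedata.category(ch) in ("Cc", "Cn", "Co", "Cs")


def prepare_name(raw: str) -> str:
    """Sketch of steps 2-4: map, normalize, then check for prohibited input."""
    # Step 2: case folding (other mappings, such as hyphen mapping, omitted).
    mapped = raw.casefold()
    # Step 3: normalize (NFKC is an assumption made for this sketch).
    normalized = unicodedata.normalize("NFKC", mapped)
    # Step 4: return an error if any prohibited character is present.
    for ch in normalized:
        if is_prohibited(ch):
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return normalized


if __name__ == "__main__":
    # Precomposed and decomposed spellings end up as the same prepared name.
    print(prepare_name("Caf\u00e9") == prepare_name("cafe\u0301"))
```

Step 5 (resolution) would then operate on the returned string.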
Many of the transformations applied above are specific to domain name
semantics. Labels in DeltaV have no associated semantics. They are only
keys that are to be used by the user and client applications any way they
want. The server applies no semantics to labels other than that they are
unique among the revisions of the same versioned resource. Uniqueness is
determined strictly by byte-by-byte comparison, not by display names.
Revisions have a separate display name property for this purpose. There
are no language transformations done on labels either. We make no attempt
to process them in any way.
Now if labels are used for display purposes, which many clients will
certainly do, we may run into problems with normalization since it is
possible that different labels could be rendered the same, or different
clients could take the same rendering and submit different label headers.
For maximum flexibility, we would like to leave such concerns to the
client applications that are providing the label semantics rather than
doing something in the protocol that might not meet all client needs. This
is different from preparing internationalized domain names, because there
the semantics are specified.
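A small Python illustration of the concern: two labels that render the same
but differ byte for byte in UTF-8. The helper names are invented for this
example, and the normalized comparison (NFC is assumed) is only what a
client might choose to do, not something DeltaV specifies.

```python
import unicodedata


def labels_equal_bytewise(a: str, b: str) -> bool:
    """Compare labels the way the server is described to: raw UTF-8 bytes."""
    return a.encode("utf-8") == b.encode("utf-8")


def labels_equal_normalized(a: str, b: str, form: str = "NFC") -> bool:
    """Hypothetical client-side comparison: normalize first, then compare bytes."""
    na = unicodedata.normalize(form, a)
    nb = unicodedata.normalize(form, b)
    return na.encode("utf-8") == nb.encode("utf-8")


# Same rendering, different code points: precomposed vs. combining accent.
composed = "r\u00e9sum\u00e9"
decomposed = "re\u0301sume\u0301"

if __name__ == "__main__":
    print(labels_equal_bytewise(composed, decomposed))
    print(labels_equal_normalized(composed, decomposed))
```

With byte-by-byte comparison these are two distinct labels on the same
versioned resource; a client that normalizes before submitting would treat
them as one.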
A number of members of the DeltaV working group (myself included) have
expressed a desire to keep the label header. I think we're tentatively OK
on using UTF-8 encoded strings, but I would welcome any further discussion
on the topic to be sure we're not going to regret the decision. If simple
byte-by-byte comparisons are not sufficient, we could attempt to solve the
problem through normalization and mapping rather than simply removing the
label header.
Patrik Fältström <paf@cisco.com>
04/10/2001 04:02 PM
To: Jim Amsden <jamsden@us.ibm.com>, ietf-dav-versioning@w3.org,
ned@innosoft.com
cc:
Subject: RE: label header (was: Re: Versioning TeleConf Agenda, 4/6/01 (Friday)
12-1pm EST)
--On 01-04-10 13.21 -0400 Jim Amsden <jamsden@us.ibm.com> wrote:
> Perhaps we could get some clarification from the Application area
> directors if this is something we need to address or not? I'd prefer to
> leave the label header in the spec as it seems that selecting a version
> by label is simple and reasonable functionality that is common practice,
> and has been in the spec for quite some time. However if it leads down
> unnecessary rat-holes, I could be convinced to delete it. Ned? Patrik?
IF you do comparisons of any kind, you MUST define how a match is to be
defined.
When comparing Unicode characters, my take is that you have to do things
like the IDN wg is talking about. See for example
draft-ietf-idn-nameprep-03.txt which talks about the operations you have
to do on two strings _before_ you do the actual comparison.
So, comparing human readable text is hard, very hard, but possible.
paf
Received on Monday, 16 April 2001 10:34:40 UTC