RE: label header (was: Re: Versioning TeleConf Agenda, 4/6/01 (Friday) 12-1pm EST)

Patrik,
I took a look at the document referenced below 
(http://www.ietf.org/internet-drafts/draft-ietf-idn-nameprep-03.txt), and 
it clearly lays out the issues involved in internationalizing strings. 
Here's a brief summary:

The steps for preparing IDN domain names are:

1. Input the domain name through some mechanism (outside the scope of the 
document)

2. Do transformations on the input domain name for case folding, hyphen 
mapping, etc.

3. Normalize the result (see 
http://www.unicode.org/unicode/reports/tr15/tr15-21.html)

4. Check for prohibited characters and return an error if any are found. 
Most of these are characters that cannot be displayed properly.

5. Resolve the resultant domain name
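
As a rough illustration only, steps 2 through 4 might be sketched in Python 
as follows. The function name prepare_name and the PROHIBITED set are 
hypothetical, str.casefold() stands in for the draft's own mapping tables, 
and NFKC is assumed as the normalization form. Hyphen mapping, input 
(step 1) and resolution (step 5) are left out.

```python
import unicodedata

# Hypothetical stand-in for the draft's prohibited-character tables.
PROHIBITED = {"\u0020", "\u00a0", "\u200b"}


def prepare_name(name: str) -> str:
    """Sketch of steps 2-4: map, normalize, then check for prohibited characters."""
    # Step 2: case folding (an approximation of the draft's mapping step).
    mapped = name.casefold()
    # Step 3: Unicode normalization (NFKC assumed here).
    normalized = unicodedata.normalize("NFKC", mapped)
    # Step 4: reject prohibited characters and control characters.
    for ch in normalized:
        if ch in PROHIBITED or unicodedata.category(ch) == "Cc":
            raise ValueError(f"prohibited character U+{ord(ch):04X}")
    return normalized
```

Two inputs that differ only in case or in composed versus decomposed form 
come out of this function identical, which is the point of doing the 
preparation before any comparison.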

Many of the transformations applied above are specific to domain name 
semantics. Labels in DeltaV have no associated semantics. They are only 
keys, to be used by the user and client applications any way they want. 
The server applies no semantics to labels other than requiring that they 
be unique among the revisions of the same versioned resource. Uniqueness 
is determined strictly by byte-by-byte comparison, not by display names. 
Revisions have a separate display name property for that purpose. There 
are no language transformations done on labels either. We make no attempt 
to process them in any way.

Now if labels are used for display purposes, which many clients will 
certainly do, we may run into problems with normalization, since it is 
possible that different labels could be rendered the same, or that 
different clients could take the same rendering and submit different label 
headers. For maximum flexibility, we would like to leave such concerns to 
the client applications that are providing the label semantics rather than 
doing something in the protocol that might not meet all client needs. This 
is different from preparing internationalized domain names, because in 
that case the semantics are being specified.
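
To make the rendering concern concrete, here is a small Python sketch. The 
label values and the helper same_label_bytewise are made up for 
illustration; the comparison simply mirrors the byte-by-byte rule 
described above, applied to UTF-8 encoded strings.

```python
# Two labels that typically render identically but use different code points.
composed = "caf\u00e9"     # e-acute as a single precomposed code point
decomposed = "cafe\u0301"  # plain "e" followed by a combining acute accent


def same_label_bytewise(a: str, b: str) -> bool:
    """Compare two labels strictly by their UTF-8 bytes, with no normalization."""
    return a.encode("utf-8") == b.encode("utf-8")


# The strings look alike on screen, yet they are distinct labels byte for byte.
print(same_label_bytewise(composed, decomposed))
print(len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))
```

Under a strict byte comparison a server would treat these as two different 
labels, even though a user looking at a client display could not tell them 
apart.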

A number of members of the DeltaV working group (myself included) have 
expressed a desire to keep the label header. I think we're tentatively OK 
on using UTF-8 encoded strings, but I would welcome any further discussion 
on the topic to be sure we're not going to regret the decision. If simple 
byte-by-byte comparisons are not sufficient, we could attempt to solve the 
problem through normalization and mapping rather than simply removing the 
label header.
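
For discussion purposes, one shape the normalization alternative could 
take is sketched below in Python. This is not something the DeltaV spec 
defines: the names label_key and labels_match are hypothetical, and NFC is 
only an assumed choice of normalization form. No case folding or other 
mapping is applied.

```python
import unicodedata


def label_key(label: str) -> bytes:
    """Normalize a label (NFC assumed) and return its UTF-8 bytes for comparison."""
    return unicodedata.normalize("NFC", label).encode("utf-8")


def labels_match(a: str, b: str) -> bool:
    """Byte-by-byte comparison performed after normalization."""
    return label_key(a) == label_key(b)
```

With this approach the comparison itself stays a simple byte comparison; 
only the preparation of each label changes. Labels that differ in case 
would still be distinct unless a mapping step were added as well.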





Patrik Fältström <paf@cisco.com>
04/10/2001 04:02 PM

 
        To:     Jim Amsden <jamsden@us.ibm.com>, ietf-dav-versioning@w3.org, ned@innosoft.com
        cc: 
        Subject:        RE: label header (was: Re: Versioning TeleConf Agenda, 4/6/01 (Friday) 12-1pm EST)

 

--On 01-04-10 13.21 -0400 Jim Amsden <jamsden@us.ibm.com> wrote:

> Perhaps we could get some clarification from the Application area
> directors if this is something we need to address or not?  I'd prefer to
> leave the label header in the spec as it seems that selecting a version
> by label is simple and reasonable functionality that is common practice,
> and has been in the spec for quite some time. However if it leads down
> unnecessary rat-holes, I could be convinced to delete it. Ned? Patrik? 

IF you do comparisons of any kind, you MUST define how a match is to be
defined.

When comparing Unicode characters, my take is that you have to do things
like the IDN wg is talking about. See for example
draft-ietf-idn-nameprep-03.txt which talks about the operations you have
to do on two strings _before_ you do the actual comparison.

So, comparing human readable text is hard, very hard, but possible.

   paf

Received on Monday, 16 April 2001 10:34:40 UTC