Re: Open Requirements Issues from Jim Whitehead on 1997-06-19 (w3c-dist-auth@w3.org from April to June 1997)

From: Jim Whitehead <ejw@ics.uci.edu>
Date: Thu, 19 Jun 1997 12:50:27 -0700
To: w3c-dist-auth@w3.org
Message-Id: <afcf3db8050210041049@[128.195.21.209]>

On June 19, Martin J. Duerst wrote:
>> #6 can be accomplished using links between resources.  Links are discussed
>> in the properties draft.
>
>These are some kind of "semantic links", not links in e.g. an Unix
>file system?

Yes.  In the current proposal, a link is an item of metadata which has a
source URI, one or more destination URIs, and a type (which is a name in
the URI namespace).  They are not symbolic links in the Unix file system
sense.
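
As a rough illustration only, such a link could be modelled as a small
record. The class name, field names, and example.com URIs below are
hypothetical and are not taken from the properties draft:

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Link:
    """Hypothetical model of a link as an item of metadata."""
    source: str                    # source URI
    destinations: Tuple[str, ...]  # one or more destination URIs
    type: str                      # link type, a name in the URI namespace

    def __post_init__(self):
        # The description above requires at least one destination.
        if not self.destinations:
            raise ValueError("a link needs at least one destination URI")


# Made-up example: a page that records which document it was derived from.
example = Link(
    source="http://www.example.com/draft.html",
    destinations=("http://www.example.com/source.doc",),
    type="http://www.example.com/linktypes/derived-from",
)
```

Nothing here touches the file system; the link is just data attached to
the source resource.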

>> To me, the areas within WebDAV which require i18n support are the value
>> fields for properties and version comments, since these will be displayed
>> to human operators of WebDAV clients.   As a result, I think that the
>> following requirement could satisfy WebDAV i18n needs:
>>
>> Internationalization:  All attribute values and version comments must have
>> provisions for storing one or more of the encoding formats specified in
>> ISO10646.
>>
>> Another way might be to leave it more general, and have a statement like:
>>
>> Internationalization:  All attribute values and version comments must have
>> provisions for storing a representation in any human character set.
>>
>> Although I'm not sure that ISO10646 and "any human character set" are
>> identical.  I prefer referring to ISO10646 since it's more concrete, and
>> avoid the issue of which character sets to support.
>
>I think for a requirements document, it is enough to address the question
>of the character repertoire, without reference to encoding issues.
>So how about:
>
>All attribute values and version comments must have provisions for
>storing all characters from the Universal Character Set (ISO 10646).

I like this wording.

>As for actual implementation, using UTF-8 is probably the best solution.
>Having several encoding formats complicates things a lot.

I don't have any experience with implementing a system that uses ISO10646
character set encodings, but ISO10646 lists several escape sequences
that designate which character encoding is being used.  At least on
the surface, it would appear relatively easy to write code that reads the
escape sequence, calls the appropriate decoder based on that
sequence, and then stores everything internally in UCS-4 format. These same
escape sequences are described in Appx. E of the XML spec. -- if we're
going to have simple XML parsers in WebDAV clients and servers, then
implementation of multiple character sets comes along with this.
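
A minimal sketch of that kind of dispatch, under stated assumptions: it
keys on leading byte-order-mark signatures rather than on the ISO10646
escape sequences themselves, it uses Python's standard codecs, and the
function name and the UTF-8 default are hypothetical. The result is a
list of code points, which stands in for UCS-4 storage:

```python
import codecs

# Signature -> decoder name. The UTF-32 marks are checked first because
# the little-endian UTF-32 mark begins with the UTF-16 one.
_SIGNATURES = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]


def decode_to_code_points(data: bytes, default: str = "utf-8") -> list:
    """Pick a decoder from the leading signature, return code points."""
    for signature, encoding in _SIGNATURES:
        if data.startswith(signature):
            # Strip the signature and decode the rest with that decoder.
            text = data[len(signature):].decode(encoding)
            break
    else:
        # No signature found: fall back to the assumed default encoding.
        text = data.decode(default)
    # One integer per character, i.e. a UCS-4 style internal form.
    return [ord(ch) for ch in text]
```

The dispatch itself is short; the cost of supporting several encodings
shows up mostly in the table of decoders that has to be maintained.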

Or is there some technical difficulty here that would make it desirable to
only use UTF-8?

- Jim

Received on Thursday, 19 June 1997 15:48:21 UTC