RE: Labels

From: Jim Whitehead (ejw@ics.uci.edu)
Date: Fri, Feb 18 2000

    From: Jim Whitehead <ejw@ics.uci.edu>
    To: ietf-dav-versioning@w3.org
    Date: Fri, 18 Feb 2000 17:33:19 -0800
    Message-ID: <NDBBIKLAGLCOPGKGADOJEEGCCOAA.ejw@ics.uci.edu>
    Subject: RE: Labels
    
    
    Tim Ellison writes:
    > To my knowledge, URLs are not internationalized.
    > How do you write a URL with double-byte characters, etc?
    
    There are no standards for how to create internationalized URLs, but there
    is the following Internet-Draft:
    
    http://www.ics.uci.edu/pub/ietf/uri/draft-masinter-url-i18n-04.txt
    "Internationalized Uniform Resource Identifiers (IURI)"
    Larry Masinter, Martin Duerst
    
    I'm not sure what the current status of this draft is.
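
As a rough illustration of the question above, here is a minimal Python sketch of one common convention for putting non-ASCII characters in a URL: encode the text as UTF-8, then percent-escape each byte. Whether the draft cited above specifies exactly this is an assumption on my part, and the function name and host/path are made up for the example.

```python
from urllib.parse import quote, unquote

def label_to_url_segment(label: str) -> str:
    """Percent-escape a label for use as one URL path segment.

    Assumption: the label is encoded as UTF-8 before escaping.
    safe="" also escapes "/", so the label stays a single segment.
    """
    return quote(label, safe="", encoding="utf-8")

def url_segment_to_label(segment: str) -> str:
    """Reverse label_to_url_segment, again assuming UTF-8."""
    return unquote(segment, encoding="utf-8")

if __name__ == "__main__":
    segment = label_to_url_segment("日本語")
    # Hypothetical URL, for illustration only.
    print("http://example.com/labels/" + segment)
    print(url_segment_to_label(segment))
```

The result is plain ASCII, so it survives transports that are not 8-bit clean, at the cost of being unreadable to a human without decoding.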
    
    > I think that the only distinction between labels and revision
    > ids, is that users can define, set, remove, etc. labels.  Making them
    > Strings simply adds unnecessary overhead to the spec.  For example, we
    > will have to support operations on mixed ascii and Unicode and
    > specified codepage labels, including switching on the fly when dealing
    > with the LABEL XML body and adapting to clients' accept-charset
    > requests.  It's possible, but messy.
    
    Since we're writing an IETF protocol specification, we have to ensure we are
    conformant with the document, "IETF Policy on Character Sets and Languages",
    RFC 2277 <http://www.ietf.org/rfc/rfc2277.txt>.
    
    Requirements from the document that pertain here:
    
       Protocols MUST be able to use the UTF-8 charset, which consists of
       the ISO 10646 coded character set combined with the UTF-8 character
       encoding scheme, as defined in [10646] Annex R (published in
       Amendment 2), for all text.
    
       Protocols MAY specify, in addition, how to use other charsets or
       other character encoding schemes for ISO 10646, such as UTF-16, but
       lack of an ability to use UTF-8 is a violation of this policy; such a
       violation would need a variance procedure ([BCP9] section 9) with
       clear and solid justification in the protocol specification document
       before being entered into or advanced upon the standards track.
    
    Since we're marshalling labels as XML, and since XML already specifies how
    to record both the character set encoding being used and the language,
    i18n does not add any new marshalling concerns for the protocol.
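
To make the point above concrete, here is a minimal Python sketch showing that the XML encoding declaration and the xml:lang attribute carry the charset and language, so the same label comes back unchanged whatever encoding was used on the wire. The element names (label, label-name in the DAV: namespace) and the function names are assumptions for illustration, not taken from the versioning draft.

```python
import xml.etree.ElementTree as ET

# Namespace that the reserved "xml" prefix (as in xml:lang) is bound to.
XML_NS = "http://www.w3.org/XML/1998/namespace"

def make_label_body(label: str, lang: str, encoding: str) -> bytes:
    """Serialize a label as an XML body in the given character encoding.

    Element names are hypothetical. The XML declaration records the
    encoding; xml:lang records the language of the label text.
    """
    root = ET.Element("{DAV:}label")
    name = ET.SubElement(root, "{DAV:}label-name")
    name.set(f"{{{XML_NS}}}lang", lang)
    name.text = label
    return ET.tostring(root, encoding=encoding, xml_declaration=True)

def read_label(body: bytes) -> tuple:
    """Parse a body from make_label_body; return (label text, language).

    The parser picks the encoding up from the XML declaration (and byte
    order mark, if any), so the caller never has to be told the charset.
    """
    root = ET.fromstring(body)
    name = root.find("{DAV:}label-name")
    return name.text, name.get(f"{{{XML_NS}}}lang")

if __name__ == "__main__":
    for enc in ("utf-8", "utf-16"):
        body = make_label_body("リリース", "ja", enc)
        print(enc, len(body), read_label(body))
```

The receiving side needs no label-specific charset handling here; all of it is done by the XML parser.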
    
    - Jim