Re: Facts about URL Internationalization

Gary Adams - Sun Microsystems Labs BOS (gra@zeppo.east.sun.com)
Mon, 24 Feb 1997 09:03:56 -0500 (EST)


Date: Mon, 24 Feb 1997 09:03:56 -0500 (EST)
From: Gary Adams - Sun Microsystems Labs BOS <gra@zeppo.east.sun.com>
Subject: Re: Facts about URL Internationalization
To: uri@bunyip.com
Message-Id: <libSDtMail.9702240903.20250.gra@zeppo/zeppo>


# Date: Fri, 21 Feb 1997 16:04:29 -0800 (PST)
# From: Chris Newman <Chris.Newman@innosoft.com>
# Subject: Facts about URL Internationalization
# To: IETF URI list <uri@bunyip.com>
# 
# I think there are observable technical facts in this debate which we can
# all agree on:
# 
# 1) URLs are often distributed internationally in hardcopy form.  For
# maximum global usability, such URLs must be restricted to the safe
# characters of the US-ASCII character set.
# 
# 2) Regardless of what the standard says, people do and will continue to
# construct URLs containing unencoded octets above 0x7f.  (As evidence, look
# at violations of 7-bit restrictions in RFC 822, SMTP, NNTP, etc).
# 
# 3) URLs may have a character mapping for octets above 0x7f already
# defined by context.  For example, a URL in a MIME part labelled
# "text/plain; charset=iso-8859-1" will have a character mapping.
# 
# 4) URLs may not have a character mapping for octets above 0x7f already
# defined by context.
# 
# 5) A character mapping for octets represented with the %HH notation is
# currently undefined.

This is the primary reason that I am in favor of the current proposal to
specify the interpretation of %HH escaped values used to represent characters
above 0x7f within URLs.  The standard should not require any guessing for
insufficiently labeled URIs.  I believe the UTF-8 encoding with the %HH escaping
will support the largest community of users.  In the event that additional
information is available (perhaps in the form of URI metadata), alternate
character encodings could be supported to accommodate more specific encoding
schemes.
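
To make the idea concrete, here is a minimal sketch in Python of what "UTF-8
with %HH escaping" could look like. The function names escape_utf8 and
unescape_utf8 are hypothetical, chosen only for illustration; they are not
taken from the proposal. The sketch escapes only octets above 0x7f and leaves
US-ASCII characters (including "%" and reserved characters) untouched, which a
real implementation would also have to handle.

```python
from urllib.parse import quote, unquote


def escape_utf8(text):
    """Hypothetical helper: encode text as UTF-8 and write every
    octet above 0x7f in %HH notation. ASCII octets pass through
    unchanged (an assumption made to keep the sketch short)."""
    out = []
    for octet in text.encode("utf-8"):
        if octet > 0x7F:
            out.append("%%%02X" % octet)
        else:
            out.append(chr(octet))
    return "".join(out)


def unescape_utf8(escaped):
    """Hypothetical helper: interpret %HH octets as UTF-8.
    With errors="strict", octet sequences that are not valid
    UTF-8 raise UnicodeDecodeError instead of being guessed at."""
    return unquote(escaped, encoding="utf-8", errors="strict")


escaped = escape_utf8("/D\u00fcrst")        # u-umlaut is U+00FC
restored = unescape_utf8(escaped)

# For comparison, the same character escaped from ISO-8859-1 octets.
latin1_escaped = quote("\u00fc", encoding="latin-1")
```

With a fixed interpretation, "/D%C3%BCrst" always means the same characters.
Without one, a receiver seeing "%FC" cannot tell whether ISO-8859-1 or some
other 8-bit character set was intended.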

# 
# 6) One key purpose of Internet Standards is to maximize global 
# interoperability.
# 
# 7) Were the URL standard to specify an interpretation for octet values
# above 0x7f, it should be an international solution.
# 
# 
# Taking all of these into account, I believe Martin Duerst's proposal is
# on the right track.
# 
# 
______________________________________________________________________
Gary R. Adams				Email: Gary.Adams@East.Sun.COM
Sun Microsystems Laboratories   	Tel: (508) 442-0416
Two Elizabeth Drive			Fax: (508) 250-5067
Chelmsford MA 01824-4195 USA		(Sun mail stop: UCHL03-207)
SWAN URL:				http://labboot.East/~gra/
WWW URL:		http://www.sunlabs.com/people/gary.adams/
______________________________________________________________________