Re: revised "generic syntax" internet draft

Keld Jørn Simonsen (keld@dkuug.dk)
Wed, 16 Apr 1997 00:32:31 +0200


Message-Id: <199704152232.AAA29896@dkuug.dk>
From: keld@dkuug.dk (Keld Jørn Simonsen)
Date: Wed, 16 Apr 1997 00:32:31 +0200
In-Reply-To: John C Klensin <klensin@mci.net>
To: John C Klensin <klensin@mci.net>, Dan Oscarsson <Dan.Oscarsson@trab.se>
Subject: Re: revised "generic syntax" internet draft
Cc: Harald.T.Alvestrand@uninett.no, uri@bunyip.com, fielding@kiwi.ICS.UCI.EDU

John Klensin writes about the use of UTF-8 and its penalties in size
and readability for various user communities. Some remarks:

I think the size issue is not important. Consider how many
bytes there are in a packet, and the typical round-trip latencies:
adding, say, 5-50 bytes to a URL because of UTF-8 expansion is not
so important, considering also the frequency of URLs in the normal
retrieval of web pages. The performance penalty would be close
to unnoticeable, IMHO.

You must also weigh this against the advantages of UTF-8,
namely a clear and easy migration path for the majority of URLs
today, encoded in US-ASCII: the migration is simply no change.
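
To illustrate both points, here is a rough Python sketch. The helper
name utf8_overhead and the example URLs are my own assumptions, not
taken from the draft. It checks that an ASCII-only URL is unchanged
under UTF-8, and counts the extra bytes that non-ASCII characters
cost compared with one byte per character.

```python
def utf8_overhead(text: str) -> int:
    """Extra bytes UTF-8 needs beyond one byte per character."""
    return len(text.encode("utf-8")) - len(text)


# Hypothetical example URLs, chosen only for illustration.
ascii_url = "http://www.example.org/index.html"
latin_url = "http://www.example.org/Jørn/François"

# An ASCII-only URL has the same bytes in US-ASCII and in UTF-8.
ascii_unchanged = ascii_url.encode("utf-8") == ascii_url.encode("ascii")

print(ascii_unchanged, utf8_overhead(ascii_url), utf8_overhead(latin_url))
```

This counts raw bytes only; any %XX escaping applied to those bytes
would add to the figure.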

Maybe John wants to be able to use other charsets for encoding
a URL. Some time ago I actually proposed a solution: labelling
the encoding of the URL in a "URL-charset:" header, with
UTF-8 as the default. I remember somebody else also proposing
charset labelling - on the URL line. I have not yet evaluated
such proposals against Martin and François's proposals, but it
is clear that the intended functionality is the same - and my old
proposal could be seen as an extension to Martin/François - but I
am not sure it is necessary.

Keld