Re: Comments on the HTTP/1.0 draft. from Marc VanHeyningen on 1994-12-08 (ietf-http-wg@w3.org from October to December 1994)

From: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
Date: Thu, 08 Dec 1994 10:12:43 -0500
To: Gavin Nicol <gtn@ebt.com>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <27521.786899563@silky.cs.indiana.edu>

> Um. Please define "canonical text form".

I have already done so several times.  The canonical form for text is defined
in RFC 1521 (which just cites RFC 822, of course) as CRLF-delimited lines for
US-ASCII or ASCII-like character sets such as ISO 8859-1.  I do not know of a
single Internet standard protocol that does not employ this representation;
let me know if you know of one.
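
To make the definition concrete, here is a minimal sketch of converting
locally stored text to that canonical form.  The function name and the
default charset are illustrative assumptions of mine, not anything taken
from RFC 1521 or the HTTP draft; the only point is that every local line
break (bare LF, bare CR, or CRLF) becomes CRLF on the wire.

```python
import re

# Matches any of the common local line-break conventions.
# CRLF is listed first so it is consumed as one break, not two.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def to_canonical_text(text: str, charset: str = "iso-8859-1") -> bytes:
    """Return text with CRLF line breaks, encoded in the given charset.

    Hypothetical helper for illustration; the name and default charset
    are assumptions, not part of any specification.
    """
    return _LINE_BREAK.sub("\r\n", text).encode(charset)


if __name__ == "__main__":
    # Mixed Unix, DOS, and old Mac line endings all normalize to CRLF.
    print(to_canonical_text("one\ntwo\r\nthree\rfour"))
```

A receiver would do the reverse, mapping CRLF back to whatever its local
convention happens to be.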

If you want to say that using this form is unnecessary, OK, but please don't
say there isn't one.

Unicode, of course, is newer and doesn't have decades of developing canonical
forms behind it, so things are less clear for such cases.  We weren't
talking about Unicode, though obviously we don't want a solution that could
screw things up for it in the future.

> My proposal for dealing with this in HTTP is to have a separate field
> for charset negotiation, and to ship Unicode (UTF) (marked up with
> some languages/presentational tags that are autogenerated) as the
> "canonical" form that everything can be converted to and from.

Sounds interesting as a long-range approach, though I don't think it is
simple enough to just drop into place.

Received on Thursday, 8 December 1994 07:13:58 UTC