Re: charset flap

From: Robert S. Thau <rst@ai.mit.edu>
Date: Thu, 27 Jun 1996 17:30:19 -0400
Message-Id: <199606272130.RAA06417@hershey.ai.mit.edu>
To: erik@netscape.com, masinter@parc.xerox.com
Cc: fielding@liege.ICS.UCI.EDU, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
X-Mailing-List: <http-wg@cuckoo.hpl.hp.com> archive/latest/987

> So you're suggesting that the server actually parse the output of the
> CGI program to check for charset, and, if absent, add charset? Are
> server implementors prepared to take this performance hit? Or are they
> already parsing CGI output for other reasons?

Actually, they are (unless the script is of the nph- type, in which
case it just gets the raw socket on standard output and has to format
the complete response entirely on its own --- this is *not* the usual
case).

For example, servers implementing CGI are required to set the status
line of the actual response to the value of the "Status:" header of
the script response, if one was supplied; to switch the default status
returned from 200 to 302 (perhaps this should be 303 for scripts
invoked via POST?) if Location: was set by the script and Status: was
not; and to add Date:, Server:, and perhaps other headers to those
supplied by the script.  (The Location: business is there because the
presence of a Location: header cued the server that the script wanted
to force a redirect in early CGI versions, before the Status CGI
header existed).  The headers written to stdout by the script cannot
just be sent en bloc to the client.
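
The header handling described above can be sketched roughly as follows.
This is an illustration in Python, not the code of any real server: the
function name finish_cgi_response, the Server: value, and the HTTP/1.0
reason phrases are all assumptions made for the example.

```python
import re
from email.utils import formatdate


def finish_cgi_response(script_output, server_name="Example/0.1"):
    """Turn non-nph CGI output into (status_line, headers, body).

    Hypothetical helper: the name and the default server_name are
    assumptions for illustration only.
    """
    # The script's headers end at the first blank line.
    parts = re.split(r"\r?\n\r?\n", script_output, maxsplit=1)
    head = parts[0]
    body = parts[1] if len(parts) > 1 else ""

    status = None
    headers = []
    for line in head.splitlines():
        name, _, value = line.partition(":")
        name, value = name.strip(), value.strip()
        if name.lower() == "status":
            # Status: is consumed by the server, not forwarded as a header.
            status = value
        else:
            headers.append((name, value))

    names = {n.lower() for n, _ in headers}
    if status is None:
        # Location: without Status: means the script wants a redirect.
        status = "302 Moved Temporarily" if "location" in names else "200 OK"

    # Add the headers the server is responsible for, if the script
    # did not supply them itself.
    if "date" not in names:
        headers.append(("Date", formatdate(usegmt=True)))
    if "server" not in names:
        headers.append(("Server", server_name))

    return "HTTP/1.0 " + status, headers, body
```

The point is only that the server already has to read and rewrite the
script's header block before anything goes to the client.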

I'm not saying here that adding charset to the Content-type header in
responses to post-HTTP/1.0 clients is a model of elegance, but it
isn't unique.  There are other upwards-not-quite-compatible tweaks of
this type in the current spec --- for instance, there's one de facto
set of rules for negotiating persistent connections with an HTTP/1.0
client, and a completely different set of de jure rules for HTTP/1.1,
not to mention the Host: business.  The amount of code required to
implement each of these political compromises isn't very great, but
collectively, they are starting to add up.

And while I'm not a fan of these tweaks generally, I don't find the
charset business any more or less objectionable, on its own merits,
than any of the others (though I'm hardly a MIME guru, and may be
missing some critical issues one way or the other there).

In particular (as the author of the Apache CGI code --- or at least
the retrofitter from the NCSA stuff ;-), I can say with some assurance
that if Apache were made to support this business for ordinary files,
then CGI scripts would come out in the wash without further effort;
the same code (send_http_header) would have to do the same thing in
either case.
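
As a rough sketch of why one code path covers both cases: if the charset
is added at the point where the Content-type header is finalized, it does
not matter whether the header came from a file's type mapping or from a
script. The Python below is not Apache's send_http_header; the function
name with_charset, the ISO-8859-1 default, and the restriction to text/*
types are assumptions made for the example.

```python
def with_charset(content_type, client_version=(1, 1),
                 default_charset="ISO-8859-1"):
    """Return content_type with a charset parameter added if it is missing.

    Hypothetical helper; the default charset and the text/* restriction
    are assumptions, not something the spec text is quoted as requiring.
    """
    # Only clients newer than HTTP/1.0 get the added parameter.
    if client_version <= (1, 0):
        return content_type

    media_type, *params = [p.strip() for p in content_type.split(";")]
    if not media_type.lower().startswith("text/"):
        return content_type

    # Respect a charset the file mapping or the script already supplied.
    if any(p.lower().startswith("charset=") for p in params):
        return content_type

    return content_type.rstrip() + "; charset=" + default_charset
```

Called on every outgoing Content-type value, a helper like this would
treat static files and CGI output identically.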

If someone wants to say that this is the place to draw the line
against *all* of these upward-not-quite-compatibilities, or just that
it's time to stop tweaking the damn thing and call it finished, there
might be some merit in that, but politically, it seems unwise to choose as
the particular scapegoat item something known to be a hot-button issue
with the IESG...

Received on Thursday, 27 June 1996 14:40:39 UTC