Re: character sets in HTTP: translation tables

I've watched this content encoding discussion for the past few weeks,
waiting for someone to ask this question. No one has, so I will.

What business does a standard for a transport protocol have addressing the
encoding schemes of the various data types it sends between a source and
a destination?

Specifically, for ALL data types, HTTP is ignorant of the actual content of
the data it conveys from point A to point B. It supports passing a few
header fields between a server and a client that describe this content, but
it doesn't in current practice require ANY dabbling in the content itself.
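
To make the point concrete, here is a minimal sketch in Python of what the
transport side amounts to. The function name frame_response is hypothetical
and this is not taken from any real server; it only illustrates that the
header fields describe the body while the body bytes pass through untouched.

```python
def frame_response(body: bytes, content_type: str) -> bytes:
    """Wrap opaque bytes in a minimal HTTP/1.0-style response.

    The body is never inspected or converted; only its length is measured.
    The caller supplies the Content-Type label that describes it.
    """
    headers = (
        "HTTP/1.0 200 OK\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    # Header lines are plain ASCII; the body is appended byte for byte.
    return headers.encode("ascii") + body
```

The same function serves ISO-8859-1 text, a GIF, or arbitrary octets. Any
character set knowledge lives in whatever produced the bytes and chose the
label, not in the framing step.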

It seems that there is a mixing of apples and oranges in this discussion.
Aside from the possible trade infringements with Snapple and Fruitopia, it
seems wise to separate the best way to encode text for WWW applications
from the best way to transport that text for WWW applications. People are
mixing WWW client and server implementation issues up with the HTTP
standard. WWW servers are not equivalent to an HTTP protocol engine. Servers
do much more work to find and interpret data before stuffing it into the
HTTP pipe. Making text/* MIME types something that HTTP implementations
have to treat differently than all other data types doesn't seem
consistent. It may be that servers acting as interfaces between a file
system and the HTTP world have to care. But HTTP is data type independent
by its very nature. Is the HTTP standard document devolving into a "How to
build a server" cookbook?

Likewise, HTTP clients don't need to know how to interpret the data they
receive. The VIEWER functions that WWW client software pipes this data into
do need to understand this data, but as with servers, client side HTTP
protocol engines are not equivalent to WWW clients. Clients do a lot more
after the data is received that falls completely outside the scope of the
HTTP protocol.
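
The client-side split can be sketched the same way. This is an illustration
under my own assumptions, not any browser's actual design: split_response,
dispatch, and the viewers table are hypothetical names. The protocol half
separates header fields from body; the viewer half, chosen by media type, is
the only part that interprets the bytes.

```python
def split_response(raw: bytes):
    """Protocol-engine step: separate status line, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # The body is returned as raw bytes, untouched.
    return status_line, headers, body


def dispatch(headers, body, viewers):
    """Client step outside the protocol: pick a viewer by media type."""
    content_type = headers.get("content-type", "application/octet-stream")
    media_type = content_type.split(";")[0].strip().lower()
    viewer = viewers.get(media_type)
    # Only the viewer decodes or renders the data; None means no viewer.
    return viewer(body) if viewer else None
```

A text viewer registered under "text/plain" would do its own character set
decoding; split_response never needs to know that the body is text at all.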

If the HTML people want to come up with a standard encoding for HTML
documents, or the MIME folks want to further specify encodings for all the
other text/* MIME types, that is fine. I don't understand why the HTTP
standards folks should have to care at all. If some sort of encoding is
needed for request and response headers, yes, I agree that it is a valid
topic of discussion. But given that the HTTP protocol supports a full 8-bit
binary pipe for transfers, issues such as encoding text for transport seem
beyond the scope of the HTTP standard.

I am honestly not trying to split semantic hairs here. I agree that the
mechanisms for standardizing text transport *over* HTTP connections need to
be put in place. I don't agree that the HTTP/1.x standard document is the
place to do this. This seems more properly the province of an RFC describing
a particular encoding scheme which *may* be transported via HTTP or any
number of other protocols, for that matter. (As an aside, this is true of all
the image hinting discussions as well.)

I have an ongoing concern that people who may not realize the implications
of their recommendations are continuing to place unreasonable demands on
the server side of HTTP implementations by demanding all sorts of data
conversions in conjunction with the simple transport functions of the
protocol. If all of these proposals become part of the standard, servers
will be so bogged down with translation exercises as to make their
transport tasks impossible.

Chuck Shotton                           "I am NOT here."

Received on Thursday, 29 December 1994 05:53:45 UTC