Re: Warnings, RFC 1522, and ISO-8859-1

Martin J. Duerst:
>
>Hello Koen,
>
>Many thanks for your information.
>
>> Martin J. Duerst:
>> >
>> [...]
>> >Now back to the MAIN POINT: Can anybody explain to me why
>> >ISO-8859-1 was chosen as a default for TEXT in headers
>> >and warnings? 
>> 
>> The TEXT encoding was US-ASCII in HTTP/1.0 (RFC1945),
>
>Not true. RFC1945 explicitly allows octets from character sets
>other than US-ASCII (which means octets with the 8th bit set).

Yes, but it does not tell you how to interpret octets with the 8th bit set.
So the bottom line is that you can depend on US-ASCII and nothing else.
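
To illustrate the point, here is a minimal Python sketch. It is not taken from
RFC1945; the helper name `interpretations`, the sample octets, and the choice of
candidate charsets are all assumptions made for illustration. It shows that an
octet with the 8th bit set reads differently (or fails to decode) depending on
the charset the receiver assumes, while pure 7-bit data reads the same everywhere.

```python
def interpretations(octets, charsets=("us-ascii", "iso-8859-1", "koi8-r", "utf-8")):
    """Decode the same octets under several assumed charsets.

    Returns a dict mapping charset name to the decoded string,
    or None where the octets are not valid in that charset.
    (Hypothetical helper, for illustration only.)
    """
    results = {}
    for name in charsets:
        try:
            results[name] = octets.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Sample header text containing one octet with the 8th bit set (0xE9).
high_bit = interpretations(b"caf\xe9")
for charset, text in high_bit.items():
    print(charset, "->", repr(text))

# The same text restricted to 7-bit octets.
seven_bit = interpretations(b"caf")
for charset, text in seven_bit.items():
    print(charset, "->", repr(text))
```

With only 7-bit octets every charset in the list yields the same string, which
is the sense in which a receiver can depend on US-ASCII and nothing else.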

[...]
>> but it got changed
>> into ISO-8859-1 for HTTP/1.1 because HTML uses ISO-8859-1.  
>
>HTML 2.0, as of Nov. 1995 (RFC1866) already contained very
>clear language that HTML will move to ISO-10646.

RFC1866 also contains very clear language about HTML user agents supporting
ISO-8859-1 by default, so this is what we took.

I'm sure that, if more i18n specialists had reviewed and commented on the
HTTP/1.1 draft before last call, we could have written a draft better
prepared for i18n.  But there weren't and we didn't.

> Also, there
>is a big difference between entity bodies (where the agreement
>is that "charset" should be labelled as far as legacy browsers
>don't prevent that) and headers (where labeling only makes
>sense for 7-bit email, but is not necessary with UTF-8).

I don't remember that we were aware of a difference between entity bodies and
headers at the time.  What basically happened is that we changed US-ASCII to
ISO-8859-1 for Accept-Charset first.  Then, the other places where a US-ASCII
default was used got edited to reflect this change.  I did not edit the other
places, so you'll have to ask someone else (Jim Gettys?) for the complete
history.

[...]
>>The idea
>> was to sync HTTP with the defaults in HTML, we did not have any i18n
>> considerations in mind.
>
>This sounds to me as if somebody were saying "We were discussing
>passwords - We did not have any security considerations in mind".

Well, you are an i18n specialist, I'm not.  Those who were involved in
creating 1.1 all know that 1.1 is not perfect.  It is as good as we could
make it in the time we had, with the resources we had.

>> As for the Warning header: we did not spend days discussing how to
>> internationalise the warning text field, this was just a micro-decision made
>> by one of the editors along the way. Maybe it was not an optimal decision,
>> but we did not have the time to spend days optimising every micro-decision.
>
>There is really no need to discuss such things for days. The only
>requirement is to make the right decision.

Sorry, the only requirement is rough consensus and running code.

[...]
>I definitely don't want to delay the draft. But if we agree on
>the direction to go in this issue, we can issue a small draft
>(e.g. Encoding of Headers in HTTP) to clear up the issue.
>This should neither delay the IETF process, nor will it delay
>implementations to wait for HTTP 1.2 to do the right thing.

I'm not an expert on this small draft business, but I think it will be
difficult to have a small draft which changes HTTP/1.1 semantics (instead of
just adding to semantics or clarifying semantics) without also having a
procedural delay.  I believe it is ultimately up to the IESG, though.

As for the direction on this issue: I'm not convinced that there is anything
that needs fixing.  I gather that the Warning header definition is extremely
yucky from an i18n standpoint, but that does not justify changing it.  There
are plenty of yucky headers in the protocol, but the headers are not meant
for human consumption, so who cares about taste?  Stability is more
important.

>Regards,	Martin.

Koen.

Received on Monday, 16 December 1996 14:25:01 UTC