Re: Unknown text/* subtypes [i20] from Julian Reschke on 2008-02-14 (ietf-http-wg@w3.org from January to March 2008)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Thu, 14 Feb 2008 14:58:21 +0100
To: Yutaka OIWA <y.oiwa@aist.go.jp>
CC: HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <47B448FD.6050204@gmx.de>

Yutaka OIWA wrote:
> ...
> Julian's text (at least implicitly) allows clients to guess any
> character encoding when no charset value is provided, which is

Just for the record: it wasn't "my" text, but the text suggested by our 
working group chair, thinking we actually *did* reach consensus (IMHO 
rightfully, looking at the mailing list archive).

Except for strictly editorial issues, the editors do not have the 
mandate to change specification text without the WG telling them to do so.

> unacceptable for HTTP/1.1 because many existing applications
> would immediately become vulnerable due to this change (and the
> existence of UTF-7).  I strongly suggest that this be treated as an
> incompatible normative change; it would be hard to carry out even if
> the version number were raised to HTTP/2.0.

I would feel much better if somebody would finally write down *exactly* 
what the UTF-7 vulnerability is...
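
As a starting point, here is a minimal sketch of the scenario as it is commonly described, not an authoritative statement of the issue. It assumes a server-side filter that only looks for literal "<" and ">" bytes, and a client that guesses UTF-7 when no charset is given. The payload and the helper name naive_filter_passes are hypothetical.

```python
# A pure-ASCII payload: in UTF-7, "+ADw-" encodes "<" and "+AD4-" encodes ">".
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"


def naive_filter_passes(data: bytes) -> bool:
    """Hypothetical filter: accept the data if it has no literal angle brackets."""
    return b"<" not in data and b">" not in data


# Read as ISO-8859-1 (or US-ASCII), the payload is harmless text.
as_latin1 = payload.decode("iso-8859-1")

# Read as UTF-7, the same bytes become markup.
as_utf7 = payload.decode("utf-7")

print(naive_filter_passes(payload))  # the filter sees nothing suspicious
print(as_latin1)
print(as_utf7)
```

The point of the sketch is only that the same bytes yield different characters depending on which encoding the recipient picks, so a check made under one encoding says nothing about the result under another.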

> Roy's suggestion (and my previous one) is moderate in scope,
> realistically acceptable within version 1.1, and largely consistent
> with the ongoing HTML5 updates, which will actually be implemented in
> real user agents.
> 
> My possible comments on Roy's proposal are:
> 
>   - Auto-detection under explicit "iso-8859-1" charset label is dirty.

Yes, it means overriding what should be authoritative metadata.

>     I suggest either dropping it or stating it as "if required for
>     backward compatibility".  I do not want to give server owners
>     permission to blindly add an ISO-8859-1 charset label without
>     checking the actual content in the future.  I do not think it
>     will be supported under the HTML5 rules.

But it seems to me this is the whole point of it, isn't it?

>   - We may need some clarification of what it means that "the
>     encoding is a superset of US-ASCII".  There are several widely used
>     character encodings which are marginal on this property.

What does being "marginal on this property" mean?
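
To make the question concrete, here is a sketch of one possible reading of "superset of US-ASCII": every 7-bit byte decodes to the code point of the same value. This is an assumption about what the definition might be, not text from any proposal, and the helper name is_ascii_superset is hypothetical. It checks only the decoding direction and only single bytes, which is exactly where a precise definition would need more detail.

```python
def is_ascii_superset(encoding: str) -> bool:
    """True if each byte 0x00-0x7F decodes to the code point of the same value."""
    ascii_bytes = bytes(range(128))
    try:
        return ascii_bytes.decode(encoding) == ascii_bytes.decode("ascii")
    except UnicodeError:
        # Some 7-bit byte sequence is not valid in this encoding at all.
        return False


results = {
    name: is_ascii_superset(name)
    for name in ("utf-8", "iso-8859-1", "utf-16", "utf-7", "cp037")
}

for name, ok in results.items():
    print(f"{name}: {ok}")
```

An encoding that differs from US-ASCII in only a few positions fails this strict test, which may or may not be the intended outcome.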

BR, Julian

Received on Thursday, 14 February 2008 13:58:43 UTC