
re: Determination of Encoding

From: Michael Sperberg-McQueen <U35395@UICVM.UIC.EDU>
Date: Mon, 23 Jun 97 14:15:39 CDT
Message-Id: <199706231924.PAA12773@www10.w3.org>
To: W3C SGML Working Group <w3c-sgml-wg@w3.org>

On Mon, 23 Jun 1997 14:13:44 -0400 (EDT) Gavin Nicol said:
[quoting me]:
>>The best that can be hoped for is to have some chance at noticing that
>>there is a discrepancy -- particularly important given the frequency
>>with which transcoders garble the data (at least ASCII/EBCDIC
>>transcoders do -- perhaps the transcoders for CJK character encodings
>>work flawlessly all the time).
>>
>>To do that, you need to have the PI retained.
>
>Most receiving systems will be able to parse the PI and detect the
>difference, sure. The problem is that the transcoding *server* cannot
>stop them from getting false negatives unless it rewrites the PI. The
>probability of HTTP being changed to require this for XML is
>vanishingly small. I believe it to also be vanishingly small for any
>MIME based protocol (including email).

As has been pointed out (by me, last fall), even rewriting the MIME
headers is not always performed correctly -- especially in email.
My incoming email, on this EBCDIC machine, is full of MIME-encoded mail
claiming, in its MIME headers, to be in ASCII.

No external label is ever foolproof.

No internal label is ever foolproof.

Fools are just too doggone ingenious.

Under these circumstances, I don't see the point in *requiring* any
processor to prefer the internal to the external label, or vice versa.
Anyone who argues that one of these will always be right in cases of
conflict must be living in a world rather unlike mine.

>Taking the failure cases and making them canonical doesn't remove the
>problem: it just increases the number of failures.

This is probably an argument against the proposal recently mooted, to
declare inconsistency between the internal and external labels a
well-formedness error.  I'd be inclined to continue to define it as
an error, however -- but it should be an error from which it's possible
to recover.
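
To make "an error from which it's possible to recover" concrete, here is
a minimal sketch in Python. It is illustrative only: the function names
and the regular expression are assumptions of the sketch, not anything
taken from the draft, and it assumes the declaration at the start of the
document is readable as ASCII bytes (which will not hold for an EBCDIC
entity). It reports a disagreement between the external and internal
labels without preferring either and without refusing the document.

```python
import re

# Hypothetical pattern: finds an encoding name in a leading declaration.
# Assumes the declaration itself is ASCII-readable.
_DECL = re.compile(
    rb"<\?xml[^>]*?encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']",
    re.IGNORECASE,
)

def internal_label(data: bytes):
    """Return the encoding named inside the document, or None."""
    m = _DECL.match(data)
    return m.group(1).decode("ascii").lower() if m else None

def check_labels(external, data: bytes):
    """Compare the external label (e.g. a MIME charset) with the internal one.

    Returns (candidates, warning). A conflict yields a warning, not an
    exception; the caller decides which candidate to try first.
    """
    internal = internal_label(data)
    candidates = []
    for label in (external.lower() if external else None, internal):
        if label and label not in candidates:
            candidates.append(label)
    warning = None
    if len(candidates) == 2:
        warning = (
            "encoding labels disagree: "
            f"external={candidates[0]}, internal={candidates[1]}"
        )
    return candidates, warning
```

The external label is listed first only because it is seen first; the
sketch deliberately leaves the choice between the two to the caller.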

-C. M. Sperberg-McQueen
Received on Monday, 23 June 1997 15:24:30 EDT
