Re: Invasion of the pseudo-people: character encoding in tedious detail from Gavin Nicol on 1997-06-11 (w3c-sgml-wg@w3.org from June 1997)

From: Gavin Nicol <gtn@eps.inso.com>
Date: Wed, 11 Jun 1997 09:54:11 -0400
To: w3c-sgml-wg@w3.org
Message-Id: <199706111354.JAA22546@nathaniel.eps.inso.com>

>Yes, what I should have said more clearly is that the document itself 
>is the most reliable method to describe its encoding.  (This principle 
>has been clearly stated by my colleagues such as Hiyama-san and 
>Matsuda-san, and none of the members of the W3C ML at Keio disagree.)
>
>Servers and proxy servers must only echo what the 
>document says.  

Well, all I can say is that this flies in the face of any sensible
protocol design. 

>Proxy servers with code conversion are disappearing.  

I do not think this is true, but this is beside the point ...

>Servers have no reliable information other than the document.  
>(In the past, DeleGate servers that always attached "charset=ISO-2022-JP"
>caused problems for ASCII documents, said Ishikawa-san at Keio.)

This is not true in all cases. Also, "do not have" is not equivalent
to "could not have": if the necessary information is missing, some
infrastructure for specifying it is needed, not a kludge to work around
the problem.
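
To make the protocol-side alternative concrete, here is a minimal sketch
in Python. It is illustrative only, not a description of any existing
server or client. It assumes the charset label on the Content-Type
header takes precedence, an in-document label is consulted only when the
header carries none, and ISO-8859-1 is the final fallback. The function
names, the 1024-byte scan limit, and the regular expression are all
hypothetical.

```python
import re
from email.message import Message

# Hypothetical pattern for an in-document label such as
# <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift_JIS">
_DOC_CHARSET = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9_.:+-]+)", re.IGNORECASE)


def header_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value and returns None if absent.
    return msg.get_content_charset()


def document_charset(body, scan_limit=1024):
    """Look for a charset label near the start of the raw document bytes."""
    match = _DOC_CHARSET.search(body[:scan_limit])
    return match.group(1).decode("ascii").lower() if match else None


def effective_charset(content_type, body, default="iso-8859-1"):
    """Assumed precedence: protocol label, then document label, then default."""
    return header_charset(content_type) or document_charset(body) or default
```

Under these assumptions, a server that labels every response
unconditionally (the DeleGate case quoted above) overrides whatever the
document says, which is exactly why the label has to come from real
information about the document rather than from a blanket setting.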

Gavin Nicol writes:

>By the way, I heard from Ishikawa-san that 
>RFC 2070 (HTML-I18N) allows the element type "A" to have the CHARSET 
>parameter, but the present version of Cougar does not.

I have almost given up on WWW I18N...

Received on Wednesday, 11 June 1997 09:54:55 UTC