Re: Comments on Part 1: Encoding declaration from Gavin Nicol on 1997-06-03 (w3c-sgml-wg@w3.org from June 1997)

From: Gavin Nicol <gtn@eps.inso.com>
Date: Tue, 3 Jun 1997 13:55:19 -0400
To: ricko@allette.com.au
CC: murata@apsdc.ksp.fujixerox.co.jp, w3c-sgml-wg@w3.org
Message-Id: <199706031755.NAA07770@nathaniel.ebt>

>> Interestingly, there is increasing support for *correct* server
>> labelling. In the immediate future, (B) is a more likely scenario,
>> simply because admin folks either are 1) lazy, or 2) ignorant. This is
>> changing. Even today, you can configure a server to correctly label
>> things. I expect that as the metadata work proceeds, we'll see more
>> and better solutions.
>
>You haven't answered my point, which is that a server needs some strategy
>to detect charset if it holds files of more than one charset. I don't think a
>server has the time to go through the whole XML detection process if it
>is sending out thousands of files: it will use some sub-XML heuristic. So the
>client-side detection of charset using the XML chain will always be more
>reliable.

I really must disagree strongly. With correct server-side labelling,
heuristics on the client side are not necessary. Heuristics mess up a
whole lot of things, including searching.
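
To make that concrete, here is a minimal sketch (Python standard
library only) of a client that trusts the protocol-level label. The
header value and the sample document are illustrative assumptions, not
taken from any real server; the point is only that the charset is read
from Content-Type and the body is decoded without any guessing.

```python
from email.message import Message


def charset_from_content_type(header_value, default=None):
    """Return the charset parameter of a Content-Type value, lowercased."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_charset(failobj=default)


def decode_labelled(body, header_value):
    """Decode bytes using the label alone; no content sniffing."""
    charset = charset_from_content_type(header_value)
    if charset is None:
        # Only an unlabelled response would force the client into heuristics.
        raise ValueError("no charset label on this response")
    return body.decode(charset)


# Hypothetical response: an XML document labelled as Shift_JIS.
label = "text/xml; charset=Shift_JIS"
sample = "<doc>\u65e5\u672c\u8a9e</doc>"  # three Japanese characters
body = sample.encode("shift_jis")
text = decode_labelled(body, label)
```

A transcoding or caching proxy can do the same lookup and never has to
open the entity body to find out what it is holding.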

Also, if you have protocol-level labelling, things like transcoding
proxies, or caching proxies, become far more reliable, and they take
less of a performance hit.

>I don't think it is a problem of ignorance or laziness, but of minimising the
>processing done by servers.

Current servers do far more work than you may think. Even the cost of
managing network connections is non-trivial in most current servers,
as is per-request filtering.

Most servers do have, or soon will have, some way of specifying
encoding directly (e.g. Apache .asis). The cost of using such
mechanisms is certainly far less than the cost of using server-side
includes, for example.
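
As a rough illustration of how cheap labelling can be, the sketch
below picks a charset from a per-directory table. This is not how
Apache or any particular server does it; the table, the paths, and the
function name are all hypothetical. The server never inspects the file
contents, so the per-request cost is a few string comparisons.

```python
# Hypothetical site configuration: directory prefix -> charset label.
CHARSET_BY_PREFIX = {
    "/docs/ja/": "Shift_JIS",
    "/docs/ru/": "KOI8-R",
}
DEFAULT_CHARSET = "ISO-8859-1"


def content_type_for(path, media_type="text/xml"):
    """Build a Content-Type value from the request path alone."""
    for prefix, charset in CHARSET_BY_PREFIX.items():
        if path.startswith(prefix):
            return f"{media_type}; charset={charset}"
    # No configured prefix matched, so fall back to the site default.
    return f"{media_type}; charset={DEFAULT_CHARSET}"
```

The label is decided once, by whoever configures the site and knows
what the files are, rather than being guessed per request by a
heuristic.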

Received on Tuesday, 3 June 1997 13:56:07 UTC