- From: Gavin Nicol <gtn@eps.inso.com>
- Date: Mon, 23 Jun 1997 14:12:50 -0400
- To: w3c-sgml-wg@w3.org
>>>Consider a proxy server that performs code conversion without rewriting
>>>the PI. Consider a WWW browser or robot that does not understand XML.
>>>Such browsers or robots certainly exist now and will not disappear in
>>>the near future. If they save a transferred XML document in a file,
>>>the header information will disappear and the PI will remain incorrect.
>>>Then, an XML parser is likely to fail.
>>
>>Precisely why I say that we must rely on the HTTP header. I'm starting to
>>think that Rick's proposal of requiring servers to remove the PI
>>is a good idea.
>
>How will relying on the external header fix matters? The problem is
>that it is always possible to get a transcoding server that doesn't
>understand the format it's transcoding (one reason sending binary files
>via Bitnet was always such an adventurous experience if one of the nodes
>involved was an ASCII site).

In the context of HTTP, the charset parameter on the Content-Type field
is the only thing that can be used to correctly detect the encoding.

>The best that can be hoped for is to have some chance at noticing that
>there is a discrepancy -- particularly important given the frequency
>with which transcoders garble the data (at least ASCII/EBCDIC
>transcoders do -- perhaps the transcoders for CJK character encodings
>work flawlessly all the time).
>
>To do that, you need to have the PI retained.

Most receiving systems will be able to parse the PI and detect the
difference, sure. The problem is that the transcoding *server* cannot
stop them from getting false negatives unless it rewrites the PI.

The probability of HTTP being changed to require this for XML is
vanishingly small. I believe it to also be vanishingly small for any
MIME-based protocol (including email). Taking the failure cases and
making them canonical doesn't remove the problem: it just increases
the number of failures.
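
As a rough sketch of the comparison a receiving client can make
(Python, with hypothetical helper names; nothing in this thread
prescribes an implementation, and it assumes the transcoded body is
still ASCII-compatible enough for the PI to be read at all):

    import re

    # Encoding named in the XML declaration, e.g.
    #   <?xml version="1.0" encoding="ISO-2022-JP"?>
    # Returns None if no encoding pseudo-attribute is present.
    def pi_encoding(xml_bytes):
        m = re.match(rb"<\?xml[^?]*encoding\s*=\s*['\"]([A-Za-z0-9._-]+)['\"]",
                     xml_bytes)
        return m.group(1).decode("ascii") if m else None

    # charset parameter of an HTTP Content-Type value,
    # e.g. "text/xml; charset=Shift_JIS".  Returns None if absent.
    def header_charset(content_type):
        m = re.search(r"charset\s*=\s*\"?([A-Za-z0-9._-]+)\"?",
                      content_type, re.IGNORECASE)
        return m.group(1) if m else None

    # The discrepancy described above: the body still carries the original
    # PI, but the (honest) header names the encoding actually sent.
    def encodings_disagree(content_type, xml_bytes):
        declared = pi_encoding(xml_bytes)
        charset = header_charset(content_type)
        if declared is None or charset is None:
            return False   # nothing to compare against
        return declared.lower() != charset.lower()

For instance, encodings_disagree('text/xml; charset=Shift_JIS', body)
is true when body still begins with an ISO-2022-JP declaration after a
transcoding proxy has converted it without touching the PI.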
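
And a sketch, under the same assumptions, of the rewrite a transcoding
server would have to perform to keep the PI honest, i.e. convert the
body and make the encoding pseudo-attribute name the encoding actually
being sent:

    import re

    # Transcode from one encoding to another and rewrite the PI to match.
    # Purely illustrative: no HTTP or MIME rule requires a server to do this.
    def transcode_and_rewrite(xml_bytes, from_enc, to_enc):
        text = xml_bytes.decode(from_enc)
        text = re.sub(r"(<\?xml[^?]*encoding\s*=\s*['\"])[^'\"]+(['\"])",
                      lambda m: m.group(1) + to_enc + m.group(2),
                      text, count=1)
        return text.encode(to_enc)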
Received on Monday, 23 June 1997 14:13:32 UTC