- From: Gavin Nicol <gtn@ebt.com>
- Date: Tue, 17 Sep 1996 17:36:06 GMT
- To: U35395@UICVM.CC.UIC.EDU
- CC: w3c-sgml-wg@w3.org
>>>1) File can be valid SGML (as external general entity)
>>
>>True for systems that understand the hack. For systems that do not
>
>All SGML systems understand how to parse PIs, and all SGML systems
>rely on the SGML declaration (or a hard-coded character set) to
>understand character encodings, so I think Rick is right; the
>file can be valid SGML, period. The SGML system does not need
>any understanding of the hack.

This is actually *not* true. SGML systems *cannot* parse the PI
declaration until they know the coded character set and encoding in
effect. The coded character set will be fixed, but the encoding is not
(even in the minimalist case). As such, only systems that can, in some
way, derive the encoding can parse the PI. Rick's proposal requires
some part of the system to sniff at the head of the file to figure out
the encoding.

>Since tags are clearly metadata in the broad sense of the term, I think
>this thesis will have a tough time commanding universal assent in a
>group full of SGML partisans. At least, I hope so. I don't object to
>external metadata, but it is remarkably fragile and it is remarkable how
>easily it goes out of date. Internal metadata is perceptibly less
>fragile and goes out of date less easily, in my experience -- probably
>because internal metadata is there when changes are made. Humans are
>less prone to forget it, and programs are less likely to be unable to
>find it for updating.

One more reason why this idea is BAD (Broken As Designed): email
gateways and proxy servers *may* convert the encoding of the document
blindly. Most gateways and proxies understand MIME headers, but very
few understand how to parse and rewrite the entity to correct the
encoding label on the PI.
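
To make the "sniff at the head of the file" step concrete, here is a minimal Python sketch. It is an illustration only, not part of Rick's proposal: it assumes the entity begins with the PI open delimiter "<?" of the reference concrete syntax, and the function name sniff_encoding and the list of candidate codecs are hypothetical choices.

```python
from typing import Optional

# Assumption: the entity starts with the PI open delimiter "<?".
PI_OPEN = "<?"

# Hypothetical candidate list; multi-byte encodings are checked first.
CANDIDATE_CODECS = ["utf-16-be", "utf-16-le", "cp037", "ascii"]


def sniff_encoding(head: bytes) -> Optional[str]:
    """Guess an encoding family from the first bytes of an entity.

    Returns the first candidate codec whose encoding of "<?" matches
    the start of `head`, or None if nothing matches.
    """
    for codec in CANDIDATE_CODECS:
        if head.startswith(PI_OPEN.encode(codec)):
            return codec
    return None


if __name__ == "__main__":
    # The same PI text produces different leading bytes per encoding.
    for codec in CANDIDATE_CODECS:
        data = "<?encoding label?>".encode(codec)
        print(codec, "->", sniff_encoding(data[:8]))
```

Note that this only narrows things down to a family of encodings (anything ASCII-compatible looks the same at this point), and that it sits outside the SGML parser proper. If a gateway transcodes the bytes, the sniffed family changes while the label inside the PI does not.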
Received on Tuesday, 17 September 1996 13:37:21 UTC