
Re: Limited modified eclectism (was Re: Reads like ASCII)



>>>1) File can be valid SGML (as external general entity)
>>
>>True for systems that understand the hack. For systems that do not
>
>All SGML systems understand how to parse PIs, and all SGML systems
>rely on the SGML declaration (or a hard-coded character set) to
>understand character encodings, so I think Rick is right; the
>file can be valid SGML, period.  The SGML system does not need
>any understanding of the hack.

This is actually *not* true. An SGML system *cannot* parse the PI
until it knows both the coded character set and the encoding in
effect. The coded character set will be fixed, but the encoding will
not be (even in the minimalist case). As such, only systems that can
derive the encoding in some way can parse the PI. Rick's proposal
requires some part of the system to sniff at the head of the file to
figure out the encoding.
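
To make the sniffing step concrete, here is a rough Python sketch of
what such a front end might look like. It is only an illustration
under assumptions of mine, not Rick's actual proposal: the label
syntax (a PI opening with "<?" and carrying an encoding="..." field),
the 200-byte window, and the names sniff_family and declared_encoding
are all hypothetical.

```python
import codecs
import re

# Hypothetical label syntax: <?XML ... encoding="NAME" ...?>
_LABEL = re.compile(r'encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def sniff_family(head: bytes) -> str:
    """Guess a bootstrap codec from the first bytes of the entity."""
    if head.startswith(codecs.BOM_UTF16_BE) or head.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if head.startswith(codecs.BOM_UTF16_LE) or head.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    if head.startswith(b"\x4c\x6f"):  # "<?" in EBCDIC (code page 037)
        return "cp037"
    return "ascii"  # assumption: anything else is ASCII-compatible


def declared_encoding(data: bytes):
    """Return (bootstrap codec, label found in the leading PI or None)."""
    head = data[:200]  # assumed window; the label must fit inside it
    family = sniff_family(head)
    text = head.decode(family, errors="replace").lstrip("\ufeff")
    if not text.startswith("<?"):
        return family, None
    end = text.find("?>")
    if end == -1:
        return family, None
    match = _LABEL.search(text[:end])
    return family, (match.group(1) if match else None)
```

Note that the label can only be read *after* the bootstrap guess has
been made from raw bytes, which is exactly the extra machinery a
plain SGML parser does not have.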

>Since tags are clearly metadata in the broad sense of the term, I think
>this thesis will have a tough time commanding universal assent in a
>group full of SGML partisans.  At least, I hope so.  I don't object to
>external metadata, but it is remarkably fragile and it is remarkable how
>easily it goes out of date.  Internal metadata is perceptibly less
>fragile and goes out of date less easily, in my experience -- probably
>because internal metadata is there when changes are made.  Humans are
>less prone to forget it, and programs are less likely to be unable to
>find it for updating.

One more reason why this idea is BAD (Broken As Designed): email
gateways and proxy servers *may* convert the encoding of the document
blindly. Most gateways and proxies understand MIME headers, but very
few know how to parse the entity and rewrite it to correct the
encoding label in the PI.
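
A small Python sketch of the failure, assuming a hypothetical gateway
that transcodes from ISO-8859-1 to UTF-8 and the same made-up
encoding="..." label syntax (blind_transcode is my name for it, not
any real gateway's API):

```python
def blind_transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode the bytes without looking at the content at all."""
    return data.decode(src).encode(dst)


# An entity labelled internally as ISO-8859-1 (hypothetical label syntax).
original = '<?XML encoding="ISO-8859-1"?><p>caf\u00e9</p>'.encode("iso-8859-1")

# The gateway converts the bytes; it would update a MIME charset
# parameter, but it never touches the label inside the entity.
relayed = blind_transcode(original, "iso-8859-1", "utf-8")

# A receiver that believes the stale internal label gets garbage;
# one that believes the transport-level label gets the right text.
trusting_label = relayed.decode("iso-8859-1")
trusting_transport = relayed.decode("utf-8")
```

After the hop, the internal label and the actual bytes disagree, and
nothing in the entity itself says which one to believe.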

