
Re: B.1 and B.2 results



>OK, I see now.  You are suggesting that we put a MIME header in the
>document in all cases.  I think this is an excellent suggestion.

.... this is *precisely* what my *.mim file format (suggested to
HTML-WG and also out in an expired RFC) *is*.

>Note that many existing web servers (including Apache) cope with
>files containing MIME headers, and may even emit those headers in
>response to an HTTP HEAD request.  Apache is said (independently) to
>represent over 30% of all running web servers.

Right, but the *.mim file format is different to Apache (or at least
the last version I looked at) in that Apache sends the file *verbatim*
and does not necessarily add missing headers... which means that the
author must understand the entire set of required headers. The
proposal I put forth only requires headers that will be overriding
those generated by the server.
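
To make the override idea concrete, here is a minimal sketch in Python.
It is an illustration only, not the text of the *.mim proposal: the
function names split_mim and effective_headers are hypothetical, the
header block is assumed to use CRLF line ends and to finish with an
empty line, and continuation lines are not handled.

```python
def split_mim(raw: bytes):
    """Split a hypothetical *.mim file into (override_headers, body).

    Assumption: the header block is US-ASCII, uses CRLF line ends and is
    terminated by an empty line. A file that starts with a bare CRLF has
    an empty header block.
    """
    if raw.startswith(b"\r\n"):
        # Minimal case: no overriding headers at all.
        return {}, raw[2:]
    head, sep, body = raw.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("no blank line terminating the header block")
    overrides = {}
    for line in head.split(b"\r\n"):
        name, colon, value = line.decode("ascii").partition(":")
        if not colon:
            raise ValueError("malformed header line: %r" % line)
        # Header names are case-insensitive, so normalise to lower case.
        overrides[name.strip().lower()] = value.strip()
    return overrides, body


def effective_headers(server_defaults: dict, overrides: dict) -> dict:
    """Start from the headers the server would generate anyway and let
    the headers found in the file override them."""
    merged = {name.lower(): value for name, value in server_defaults.items()}
    merged.update(overrides)
    return merged
```

With this, a file holding only a Content-type line replaces the
server's Content-type and leaves every other generated header alone,
so the author never has to write out the full required set.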

As I noted before on this list, and also in HTML-WG, most software
that will be dealing with the WWW will *already* have MIME header
parsers built into them.... probably as a message stream module, so
you can *reuse* that code for the local and distributed case.

Again, I seem to be talking to myself.

>It's always a little tricky to talk about mixing character sets within
>a single file.  However, since MIME headers are in US ASCII (or is
>Latin 1 allowed now?), the headers must be in the subset common to
>both.

The headers are in US-ASCII, which is a nuisance if your file is UCS-2
(your editor would need to have MIME parsing capabilities built in),
which is a boundary case, but an important one. This is one reason I
prefer catalog or FSI based solutions. In most practical situations,
this will not be an overly large concern though.
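
The boundary case can be sketched as follows, in Python. This is a
hedged illustration rather than any agreed format: decode_document is
a hypothetical name, the header block is assumed to be US-ASCII with
CRLF line ends, ISO-8859-1 is assumed as the default when no charset
parameter is given, and Python's "utf-16-be" codec stands in for UCS-2.

```python
def decode_document(raw: bytes, default_charset: str = "iso-8859-1") -> str:
    """Read a US-ASCII header block, then decode the body using the
    charset named in Content-type (hypothetical helper)."""
    if raw.startswith(b"\r\n"):
        # Empty header block: the body starts right after the first CRLF.
        head, body = b"", raw[2:]
    else:
        head, sep, body = raw.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("no blank line terminating the header block")
    charset = default_charset
    # The headers are decoded as ASCII regardless of the body's encoding.
    for line in head.decode("ascii").split("\r\n"):
        name, _, value = line.partition(":")
        if name.strip().lower() != "content-type":
            continue
        # Parameters follow the media type, separated by semicolons.
        for param in value.split(";")[1:]:
            key, _, val = param.partition("=")
            if key.strip().lower() == "charset":
                charset = val.strip().strip('"')
    return body.decode(charset)
```

The reader has to switch decoders at the blank line, and that switch
is exactly what an editor working on a UCS-2 file would need built in.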

>At a minimum, you would need
>    Mime-version: 1.0
>    Content-type: text/x-xml;version=1.0;charset=utf-8?

In the *.mim file format, the minimum you would need would be CRLF,
and for non-ISO-8859-1 documents

    Content-type: text/x-xml;charset=shift-jis

>Instead of requiring the full MIME CR-LF at the end of each line (which
>is a pain to maintain on some platforms, e.g. Mac and Unix), I would
>suggest documenting a format in which
...

I would just reference the HTTP specs (though HTTP 1.1 is becoming
more restrictive), though I could easily be convinced that strict MIME
compatibility be preserved.

>You then get a header format which can easily and reliably be edited
>on multiple platforms -- e.g. you can upload a file from your PC to
>a Unix, NT or Mac server, and make a quick change in Notepad, Sam, or
>whatever, without trashing the file header.  Of course, the editor
>has to be able to write out the body of the file correctly!

The point I've tried to make before!!!

The PI hack is a HACK. It is a header hiding under syntax that will
confuse everyone, or at least cause people to assume that you could do
something clever like:

<?XML-CHARSET SJIS>
....
<?XML-CHARSET BIG5>
....
<?XML-CHARSET UTF8>

and we all know *that* is totally bogus.

