Handling of MIME types for markup?

As you know, Liam and I have contributed patches to SP, including
improved HTTP support (virtual hosts and redirection).  In the
context of the W3C validator, this applies whenever the validator
has to fetch an external DTD.

One element of this was the use of an "Accept: text/*" header when
SP retrieves a document via HTTP.  The rationale for this is that SP
has no use for non-text documents, and so shouldn't try to fetch them.
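
To make the current behaviour concrete, here is a minimal sketch of the
kind of request involved.  It is written in Python for brevity and is not
SP's actual code; the function name and the example URL are made up for
illustration.  It only builds the request and does not send it.

```python
import urllib.request


def build_dtd_request(url, accept="text/*"):
    """Build (but do not send) a GET request carrying an Accept header.

    Hypothetical helper: it mirrors the "Accept: text/*" behaviour
    described above and is not taken from SP itself.
    """
    return urllib.request.Request(url, headers={"Accept": accept})


# Example URL is an assumption, used only to show the header being set.
req = build_dtd_request("http://www.example.org/dtds/sample.dtd")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Widening what we accept would then be a matter of changing the one
header value, e.g. to "text/*, application/sgml, application/xml".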

In recent correspondence, Liam has suggested that this may not in
fact be the best, or even correct, thing to do:
 (1) Not the best, because it can give rise to confusion when users
     understand markup but not HTTP.
 (2) Not correct, because MIME types are defined for 
     "application/[sg|x]ml" and various other markup.

With regard to (1), I would be reluctant to open the "let's ignore
HTTP because we know better" issue: we've seen just how damaging it
is when Microsoft does the same thing.  OTOH I cannot actually
envisage a case when fetching a non-text document will do any
more damage than a wasted transaction.  The most obvious effect of
changing it would be to forgive a misconfigured server that sends
DTDs as "application/octet-stream".

Regarding (2), I don't think I understand when it would be right
to use "application/*ml" in preference to "text/*ml".  Perhaps
it would be appropriate in cases where SGML is simply used as
storage for another format:
<!NOTATION JPEG SYSTEM "">
but that kind of thing is clearly not useful to a validator.

So the questions for discussion are:
  (1) What MIME types *should* we accept?
  (2) If we accept an incorrect MIME type, should we report it as
      an error?
  (3) If a server returns an HTTP 406 response (refuses to send a
      document because nothing it has matches our Accept header),
      how should we report the error?  See "My page doesn't validate"
      from about March 12th on this list for an example.
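
As a starting point for discussion, one possible set of answers to the
three questions could be sketched as below.  This is only an
illustration in Python: the function name, the three outcome labels and
the particular list of accepted types are my assumptions, not anything
SP or the validator currently implements.

```python
# Assumed policy, for illustration only: text/* and the two registered
# application types are fine, anything else is fetched but flagged.
ACCEPTED_APPLICATION_TYPES = ("application/sgml", "application/xml")


def classify_response(status, content_type):
    """Return 'ok', 'warn' or 'error' for a fetched DTD or entity."""
    if status == 406:
        # The server had nothing matching our Accept header.
        return "error"
    # Drop parameters such as "; charset=..." and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith("text/"):
        return "ok"
    if media_type in ACCEPTED_APPLICATION_TYPES:
        return "ok"
    # e.g. application/octet-stream from a misconfigured server
    return "warn"


print(classify_response(200, "text/xml; charset=utf-8"))
print(classify_response(200, "application/octet-stream"))
print(classify_response(406, ""))
```

Whether the "warn" case should instead be a hard error is exactly
question (2).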

-- 
Nick Kew

Desperate to escape the UK right now: can anyone use my skills?
<URL:http://www.webthing.com/~nick/cv.html>

Received on Friday, 6 April 2001 09:10:41 UTC