Content-Document-Type: was (Re: MIME types vs. DOCTYPE) from Jonathan Borden on 1999-02-26 (www-html-editor@w3.org from January to March 1999)

From: Jonathan Borden <jborden@mediaone.net>
Date: Thu, 25 Feb 1999 20:38:44 -0500
To: "Rick Jelliffe" <ricko@allette.com.au>, "xml mailing list" <xml-dev@ic.ac.uk>
Cc: <www-html-editor@w3.org>
Message-ID: <000101be6128$bf4af7a0$d3228018@jabr.ne.mediaone.net>

	I am proposing that XML continue to use one of the media types:

text/xml or
application/xml

	and that the header

Content-Document-Type:

be used to denote the particular document type. The value of
Content-Document-Type is proposed to be a URI, in the same way that a
URI identifies an XML namespace.
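
A minimal sketch of what such a message might look like and how a receiver
could read the two headers, using Python's standard email parser. The header
name is the one proposed above; the document-type URI
http://example.org/doctypes/invoice and the function name are hypothetical
and only for illustration.

```python
from email import message_from_string

# A hypothetical message: generic XML media type plus the proposed header.
RAW = (
    "Content-Type: text/xml\n"
    "Content-Document-Type: http://example.org/doctypes/invoice\n"
    "\n"
    "<invoice/>\n"
)

def media_and_document_type(raw):
    """Return (media type, document-type URI or None) from raw headers."""
    msg = message_from_string(raw)
    media_type = msg.get_content_type()          # from Content-Type
    doc_type = msg.get("Content-Document-Type")  # proposed header, may be absent
    return media_type, doc_type

print(media_and_document_type(RAW))
```

The media type stays generic (text/xml), so no new IANA registration is
needed per document type; the URI carries the specific type.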

Rick Jelliffe wrote:

>
>
> The format of MIME types is given in Freed and Borenstein,
> Multipurpose Internet Mail Extensions (MIME) Part Two:
> Media Types (ftp://ftp.isi.edu/in-notes/rfc2046.txt)
>
> That RFC allows anyone to go
>     text/X-???
>     application/X-???
> where ??? is any name you like. For example, text/X-xml-vml or whatever.

	Yes, so unless it is registered with the IANA, text/xhtml is not legal with
respect to RFC 2046. Official XML document types will need to be registered
with the IANA; the W3C does not control the MIME content-type domain.
Overloading Content-Type in this fashion is bad for several reasons:

1) official document types need to be registered via the IANA

2) the unofficial text/x-subtype scheme has the same name-collision problem
that XML namespaces would have if they were identified by the prefix and not
the URI. Use of text/x-xml-xxx is not a robust way to identify document
types in a situation where thousands or millions of distinct document types
may be defined.

and Dave Megginson wrote:

> But I do think that you should make an exception for every document
> type -- text/xml should just be a fallback, when all else has failed.

> Why should there be a single processor to handle everything that
> happens to be encoded in XML?  I don't have a single compiler for
> every programming language that happens to use ASCII, or a single
> application that processes any data that arrives in a zip file.

	True, but MIME types denote specific encodings, such as application/x-gzip.

>If I have a vector graphic format that happens to use XML, I want to
>pass it off to a vector-graphic processor; if I have a browsable
>document, I want to pass it off to a browser; if I have a 3D world, I
>want to pass it off to a 3D renderer; if I have an e-commerce
>transaction, I want to pass it off to my order-processing application;
>etc., etc.

	This is a reasonable request. One could argue that standards which happen
to be XML ought to be described by unique, registered MIME types. One problem
with that is the proliferation of type names. The other, and perhaps more
important, problem is that no MIME type denotes XML data as opposed to TEXT
data, so in the absence of specific knowledge of a particular media type,
text/xxx is to be treated as text/plain per RFC 2046.
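
The fallback rule can be sketched as follows. This is only an illustration of
the behaviour described above, not code from any real MIME implementation;
the set of known types and the function name are hypothetical.

```python
# Types this hypothetical receiver knows how to handle.
KNOWN_TYPES = {"text/plain", "text/html", "text/xml"}

def effective_type(content_type, known=KNOWN_TYPES):
    """Return the media type a receiver would actually act on."""
    # Drop parameters such as "; charset=..." and normalise case.
    media = content_type.split(";", 1)[0].strip().lower()
    if media in known:
        return media
    if media.startswith("text/"):
        # Unrecognised text subtype: treated as text/plain.
        return "text/plain"
    # Other top-level types are outside the scope of this sketch.
    return media

print(effective_type("text/x-xml-vml"))           # unknown text subtype
print(effective_type("text/xml; charset=utf-8"))  # known type
```

A receiver that does not know text/x-xml-vml ends up handling it as plain
text, so the fact that the data is XML is lost.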

	Perhaps XML ought to be a top-level type, but then ought SGML to be one as
well, along with other well-known text formats?

>I can imagine many circumstances where parsing the XML first to figure
>out what it is could be useful, but if it is already possible to know
>the type, then doing so is very wasteful.

	This is also reasonable, and Content-Document-Type solves this problem. I
see no good reason not to use a dedicated header for it.
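
As a rough sketch of how a receiver might route a document on the header
alone, without parsing the XML body first: the handler registry, the URIs
under http://example.org/ and all function names below are hypothetical.

```python
# Registry mapping document-type URIs to handlers (hypothetical).
HANDLERS = {}

def handles(uri):
    """Decorator that registers a handler for one document-type URI."""
    def register(func):
        HANDLERS[uri] = func
        return func
    return register

@handles("http://example.org/doctypes/vector-graphic")
def render_graphic(body):
    return "vector-graphic processor"

@handles("http://example.org/doctypes/order")
def process_order(body):
    return "order-processing application"

def generic_xml(body):
    # Fallback when the header is missing or the URI is unknown.
    return "generic XML processor"

def dispatch(headers, body):
    """Pick a handler from Content-Document-Type; the body is not parsed."""
    uri = headers.get("Content-Document-Type")
    handler = HANDLERS.get(uri, generic_xml)
    return handler(body)

print(dispatch({"Content-Type": "text/xml",
                "Content-Document-Type": "http://example.org/doctypes/order"},
               "<order/>"))
```

A document with no Content-Document-Type, or with an unrecognised URI, still
falls back to generic text/xml handling.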


Jonathan Borden

http://jabr.ne.mediaone.net

Received on Thursday, 25 February 1999 20:44:03 UTC