XHTML, XML, fancy text, and applications

Mikko Rantalainen wrote to <mailto:www-html@w3.org> on 9 April 2003 in "Re:
XHTML2 MIME type" (<mid:3E94185F.10003@cc.jyu.fi>):

> [This is getting a bit offtopic and I'm not really expecting any
> replies. I'll post some thoughts to help future archive diggers.]

I'm steering the discussion into the MIME media types list,
<mailto:ietf-types@alvestrand.no>. Please send public replies there.

> OK. This is the first time I actually viewed RFC 3023 and I want to say
> that I consider "+xml" extension as an ugly hack.

I consider it an elegant solution. Our difference leads me to conclude that
this is at least partly a matter of taste. De gustibus non est disputandum.

> I'm still wondering why they chose
> to use "+" as a separator if the meaning is "this file can be considered
> as something OR xml".

I would see it as "This resource is of such-and-such type AND it is XML in
terms of syntax."
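
To make the "AND" reading concrete, here is a minimal sketch in Python of how
a generic processor might apply it. The function names are my own invention
for illustration and come from no registration or RFC; the only thing taken
from RFC 3023 is the convention that a trailing "+xml" on the subtype signals
XML syntax.

```python
def split_media_type(media_type):
    """Split 'type/subtype+suffix' into (type, subtype, suffix or None).

    Hypothetical helper; parameters such as '; charset=utf-8' are dropped.
    """
    essence = media_type.split(";", 1)[0].strip().lower()
    top, _, subtype = essence.partition("/")
    # Split on the LAST plus sign so that only the final suffix counts.
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        return top, subtype, None
    return top, base, suffix


def is_xml_syntax(media_type):
    """True if the name says the resource is XML in terms of syntax."""
    top, subtype, suffix = split_media_type(media_type)
    if suffix == "xml":
        # Specific type AND XML syntax, e.g. application/xhtml+xml.
        return True
    # The two generic XML types carry no suffix.
    return top in ("text", "application") and subtype == "xml"
```

Under this reading "application/xhtml+xml" is both XHTML and XML at once, so
an XHTML-aware agent and a generic XML tool can each act on the same name.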

> When I first time saw application/xhtml+xml I
> immediately thought that it meant it's an xhtml file with possible
> additional namespaces.

While I affirm the principle that MIME media type names should be as clear
as is possible within the limited character repertoire and some reasonable
length, we have IANA registrations and Requests For Comments that define
each type. Those documents are meant for reading, not for ignoring.

> As I have some programming background I think
> application/xhtml|xml would have been much better and the pipe was
> available in addition to the plus sign.

As I know that non-programmer lay people encounter MIME media type names
with some frequency on the Web (by "Web" I'm including mail systems), I have
to say that the pipe character ("|", vertical line, U+007C) is too devoid of
well-known semantics. In my experience in the English-speaking United
States, if there is any character that commonly denotes alternatives, it is
the slash ("/", solidus, U+002F). (Perhaps, in that case, I should have
written slash/solidus.) The slash is reserved and will not find its way into
a MIME media subtype name.

> After reading the references you provided I still feel that we need a
> new top level mime type. We have various file types that are basically
> text but not plain text.

Right, and those should be subtypes of "text" if they can reasonably be
treated as "text/plain", or subtypes of "application" if they contain
non-textual markup.

> If the reason for not having another top level MIME type for xml/* is
> that we want to specify TYPE instead of SYNTAX then the text/* shouldn't
> be considered as plain text syntax either and it should be used for all
> file types that mostly contain text.

I think that the principle in effect is that treatment as "text/plain" has
to be a reasonable fallback for any "text" subtype. What constitutes
reasonable is a matter for debate. I suspect that most people could not make
sense of PostScript source code (hence "application/postscript"), even
though it is textual. HTML source code, on the other hand, is likely to
contain passages of readable text (hence "text/html"), even if it leaves the
reader wondering what all the symbols mean.
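
The fallback principle can be sketched as follows, again in Python. This is
an illustration under assumptions, not a description of any particular
agent: the set of "known" types and the function name are hypothetical. The
two defaults follow the RFC 2046 guidance that an unrecognized "text" subtype
be treated as "text/plain" and any other unrecognized type as
"application/octet-stream".

```python
# Hypothetical list of types that some imaginary agent handles natively.
KNOWN_TYPES = {"text/plain", "text/html", "application/postscript"}


def effective_type(media_type, known=KNOWN_TYPES):
    """Return the type an agent would actually process the resource as."""
    # Ignore parameters such as '; charset=utf-8' for the lookup.
    essence = media_type.split(";", 1)[0].strip().lower()
    if essence in known:
        return essence
    top = essence.partition("/")[0]
    if top == "text":
        # Any "text" subtype must survive being shown as plain text.
        return "text/plain"
    # Everything else unrecognized is opaque data.
    return "application/octet-stream"
```

A type registered under "text" therefore promises that the "text/plain"
branch yields something a person can read, which is the test that PostScript
fails and HTML arguably passes.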

> I think the application/* top level type shouldn't be used for XHTML 2
> just because one needs an application to easily read the content.

My understanding is that the MIME media type name "application" describes
the content itself, not the necessary processing software. It's like "This
resource is an application of some sort", not like "Start an application to
handle this resource".

> Following the same logic we should move all of image/*, video/* and
> audio/* types to application/* because you cannot view any of those
> without an application either.

Again, the term "application" describes the resource, not the handler.

> Perhaps application/* should be renamed to misc/* or other/*?

That sounds like a terrible idea, given the installed base of software that
understands and expects the "application" name.

-- 
Etan Wexler: stuffed but not satiated, damn it.
 <mailto:ewexler@stickdog.com>

Received on Friday, 11 April 2003 20:39:04 UTC