MIME types for packaged content (was re: tag: uri scheme)

I think it would be much better to allow content types to be
derived by the packager and included in the package on
a file-by-file basis. This was the finding during the
development of MHTML many years ago, and the situation
isn't different here.

Several operating systems in wide use today allow files
without extensions, and such files would otherwise have to
be "sniffed" on the client. Per-file content types would
also allow the inclusion of charset parameters in text types;
in general, charsets cannot reliably be "sniffed", even when
they are well known in the context of the packager.
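
As a rough illustration only, a packager could record a content type
per file, adding a charset parameter for text types where it knows one.
The function name, the tuple layout and the idea of a path-to-type
manifest are all assumptions made for this sketch; they are not taken
from MHTML or from any packaging spec.

```python
def build_manifest(files):
    """Map each packaged path to a content-type string.

    `files` is an iterable of (path, media_type, charset_or_None).
    This layout is hypothetical and exists only for illustration.
    """
    manifest = {}
    for path, media_type, charset in files:
        value = media_type
        # Only text types carry a charset parameter in this sketch.
        if charset and media_type.startswith("text/"):
            value += "; charset=" + charset
        manifest[path] = value
    return manifest


# Files without extensions are labelled by the packager, not sniffed.
manifest = build_manifest([
    ("index", "text/html", "iso-8859-1"),
    ("logo", "image/png", None),
])
```

The point of the sketch is that the charset is written down at the one
place where it is actually known.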

While I'm dubious about the arguments on "MIME type
sniffing" for browsers, I think it's completely
unnecessary for packaged content, because of the
explicit "package" step necessary to provide 
conformant content.

(I changed the subject line because the topic isn't about
the 'tag:' URI scheme.)

Larry


-----Original Message-----
From: Marcos Caceres [mailto:marcosscaceres@gmail.com] 
Sent: Friday, February 13, 2009 5:27 AM
To: Larry Masinter
Cc: Bjoern Hoehrmann; public-pkg-uri-scheme; WebApps WG
Subject: Re: tag: uri scheme

Hi Larry,

2009/1/22 Larry Masinter <LMM@acm.org>:
>
>>  https://issues.apache.org/bugzilla/show_bug.cgi?id=13986

>
> Astounding. Thanks for that pointer, hadn't seen that history.
>
> Still, communication of a package is different than communication
> of individual components, because there's an explicit processing
> step which is "create the package". Even if there might be
> some reasons why Apache hasn't fixed their configuration files,
> is there any reason to believe that "create a package" software
> couldn't be configured to always use well-known file extensions
> or (if allowed) well-known content-types?

The current model that we are thinking of putting into the widget
packaging spec is:

1. Match the file extension to a MIME type using the extension-to-MIME
type tables in the packaging spec.
2. If no match is made, attempt to sniff the MIME type.
3. If sniffing also fails, label the file 'unknown/unknown'.
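
A minimal sketch of those three steps, assuming a tiny extension table
and a handful of magic-number signatures. Both tables are placeholders
invented for this example; the real extension table would come from the
packaging spec, and the sniffing rules are not defined here.

```python
import os

# Placeholder extension table (assumption, not the spec's table).
EXTENSION_TABLE = {
    "html": "text/html",
    "css": "text/css",
    "js": "application/javascript",
    "png": "image/png",
}

# Placeholder magic-number signatures used for sniffing (assumption).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
]


def sniff(data):
    """Return a MIME type if the leading bytes match a signature."""
    for magic, mime in SIGNATURES:
        if data.startswith(magic):
            return mime
    return None


def resolve_mime(filename, data):
    # Step 1: look the file extension up in the table.
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    mime = EXTENSION_TABLE.get(ext)
    if mime is not None:
        return mime
    # Step 2: no extension match, so try sniffing the content.
    mime = sniff(data)
    if mime is not None:
        return mime
    # Step 3: nothing matched.
    return "unknown/unknown"
```

For instance, resolve_mime("index.html", b"") is resolved by the table
alone, while a file named "logo" with PNG leading bytes falls through
to the sniffing step.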

What we could do is add a step 0, where authors could use an
XML-based (or text-based) format to declare extension-to-MIME
mappings (e.g., php -> text/html) or to override the default
extension-to-MIME mappings.

For example,
<types xmlns="http://www.w3.org/ns/widgets">
  <type ext="php" mime="text/html"/>
  <type ext="jsp" mime="text/html"/>
  <type ext="htm" mime="application/xhtml+xml"/>
</types>

The above elements could either just be part of the configuration
document, or could be in a separate file.
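
To make the idea concrete, here is a rough sketch of how a step 0 could
read the declarations above and consult them before the default table.
The function names and the defaults argument are assumptions for
illustration; nothing here is specified anywhere yet.

```python
import xml.etree.ElementTree as ET

WIDGETS_NS = "http://www.w3.org/ns/widgets"

EXAMPLE = """<types xmlns="http://www.w3.org/ns/widgets">
  <type ext="php" mime="text/html"/>
  <type ext="jsp" mime="text/html"/>
  <type ext="htm" mime="application/xhtml+xml"/>
</types>"""


def load_overrides(xml_text):
    """Parse <type ext=... mime=.../> elements into a dict."""
    root = ET.fromstring(xml_text)
    overrides = {}
    for elem in root.iter("{%s}type" % WIDGETS_NS):
        ext = elem.get("ext")
        mime = elem.get("mime")
        # Skip incomplete declarations rather than guessing.
        if ext and mime:
            overrides[ext.lower()] = mime
    return overrides


def lookup(ext, overrides, defaults):
    """Step 0: author overrides win; otherwise use the defaults."""
    ext = ext.lower()
    if ext in overrides:
        return overrides[ext]
    return defaults.get(ext)


overrides = load_overrides(EXAMPLE)
```

With a default table that maps htm to text/html, lookup("htm",
overrides, defaults) would return the author's
application/xhtml+xml instead.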

> I'll still claim that the closer you are to the origin of the
> data, the more likely you are going to be able to guess the
> context of the data.

Probably true.

-- 
Marcos Caceres
http://datadriven.com.au

Received on Friday, 13 February 2009 22:28:02 UTC