Re: [widgets] Content-type sniffing and file extension to MIME mapping

Hi Jere,

On Tue, Dec 2, 2008 at 3:19 PM, Jere Kapyaho <jere.kapyaho@nokia.com> wrote:
<snip>
> Yes, it's .flac (or .fla in a pinch) for FLAC.
>

Uh oh! .fla clashes with Adobe Flash files :(

> I was going to suggest to add .aac and .mp4, but if patented formats are
> out, then I won't. However, isn't .swf equally patented, or has it been
> liberated recently? (Or GIF? PDF?) It may not be feasible to discriminate
> some file formats on that basis, especially if they do have a registered
> MIME type.
>

I'm not sure the decision should hinge on excluding patented formats. I
think it should be based on what we can show to be the core
technologies that make widgets usable and interoperable. If, for
instance, implementers are happy to foot the bill for technology foo
because it provides something that developers want and commonly use,
then we should probably include it.

> I will nevertheless suggest the audio formats .ogg (open) and also .aiff.
> And if patents shouldn't matter, also .mov, .wmv and .mp2.
>
> But... maybe relying on file extensions is not the way to do it after all.
> Extensions are not reliable, and not even mandatory on some systems. Since
> the widget package is in a sense "sealed", a metadata list that connects
> each filename inside a package to a MIME type would work with or without
> file extensions. Or has this been discussed and dismissed already?
>

We've talked about this informally for a while (mostly at F2Fs or in
IRC, but I don't think we have ever really discussed it via the public
list).

> Example metadata list (format is simply <filename> WSP <mimetype>):
>
> images/splashScreen.png image/png
> music/themesong.flac audio/flac
> favicon.ico image/vnd.microsoft.icon
> main application/xhtml+xml
>
> Note especially the last item, which has no file extension at all.

I personally don't think we should define a format that forces
developers to list every file just to cover the use case where a
file extension is missing. Also, I assume that some tool would be
needed to generate this metadata list, as I don't see any developer
ever doing this by hand because:

   1. a widget could contain hundreds of files.
   2. file names with spaces, and possibly other characters, would
need to be URL encoded.
   3. it would be tremendously error prone and hard to maintain.
   4. developers would wonder why this is not done automatically by
the widget engine, when they've never had to do it with any other
widget engine before.
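
To illustrate point 2, here is a rough Python sketch of what a
generator for Jere's proposed list would have to do for each entry.
The function name metadata_line is made up for illustration, and
percent-encoding everything except "/" is my assumption; nothing like
this is specified anywhere.

```python
from urllib.parse import quote


def metadata_line(filename, mime_type):
    """Build one "<filename> WSP <mimetype>" entry (hypothetical helper)."""
    # Percent-encode the path (keeping "/") so that a space inside a file
    # name cannot be mistaken for the separator before the MIME type.
    return f"{quote(filename, safe='/')} {mime_type}"


print(metadata_line("images/splash screen.png", "image/png"))
print(metadata_line("main", "application/xhtml+xml"))
```

Multiply that by hundreds of files and it is clear nobody would
maintain the list by hand.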

If software must be created to derive the MIME types and generate the
metadata file, either through sniffing or through looking at the file
extension, then I think such a tool should just be part of the widget
engine. Note that such software has been created (see Linux's "file"
util [1]), and, in some cases, Apache uses similar functionality to
derive MIME types [2].
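
As a minimal sketch of what "part of the widget engine" could mean,
the Python below tries the file extension first and falls back to
sniffing the leading bytes. The tables are tiny and purely
illustrative, the name derive_mime_type is hypothetical, and the
extension-before-sniffing order is an assumption of mine, not
something we have agreed on.

```python
import posixpath

# Illustrative extension table; a spec would define the full mapping.
EXTENSION_TYPES = {
    ".png": "image/png",
    ".gif": "image/gif",
    ".flac": "audio/flac",
    ".ico": "image/vnd.microsoft.icon",
    ".html": "text/html",
}

# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"fLaC", "audio/flac"),
]


def derive_mime_type(filename, head, default="application/octet-stream"):
    """Return a MIME type from the extension, else from the first bytes."""
    ext = posixpath.splitext(filename)[1].lower()
    if ext in EXTENSION_TYPES:
        return EXTENSION_TYPES[ext]
    # No usable extension: compare the start of the file to known signatures.
    for signature, mime_type in SIGNATURES:
        if head.startswith(signature):
            return mime_type
    return default
```

Here "head" is just the first few bytes of the file, so a file called
"main" that starts with the PNG signature would come out as image/png
without any metadata list.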

If we were going to add a mimetype override file, I would argue we
should only do it based on file extensions.

I still believe the spec should:

  1. define the mappings for file extension to MIME, which all engines
must use.
  2. if there is strong support from working group members for adding
a mimetype override format, include a default override file that all
widget engines are expected to use.

In the case of 2 above, I would _not_ want us to define yet another
XML format. I think we should just have a very simple text-based
format that simply looks like this (based loosely on Apache's
AddType directive):

text/html .php

The file could be called "mimetypes" or "mime.types" and sit at the
root of a widget package.
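
To show how little an engine would need to read such a file, here is
a rough Python sketch. The function name parse_mime_overrides is
hypothetical, and allowing "#" comments and several extensions per
line (as Apache's AddType does) are assumptions on my part.

```python
def parse_mime_overrides(text):
    """Parse "<mimetype> <.ext> [<.ext> ...]" lines into {".ext": mimetype}."""
    overrides = {}
    for line in text.splitlines():
        # Assumed convention: "#" starts a comment; blank lines are skipped.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            # Normalize so both "php" and ".php" are stored as ".php".
            overrides["." + ext.lstrip(".").lower()] = mime_type
    return overrides


print(parse_mime_overrides("text/html .php\n"))
```

An engine would consult the resulting table before falling back to
its built-in extension mappings.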

Kind regards,
Marcos

[1] http://httpd.apache.org/docs/1.3/mod/mod_mime.html
[2] http://httpd.apache.org/docs/1.3/mod/mod_mime_magic.html
-- 
Marcos Caceres
http://datadriven.com.au

Received on Tuesday, 2 December 2008 16:29:57 UTC