Re: [widgets] Content-type sniffing and file extension to MIME mapping

On 2.12.2008 18.29, "ext Marcos Caceres" <marcosscaceres@gmail.com> wrote:
> On Tue, Dec 2, 2008 at 3:19 PM, Jere Kapyaho <jere.kapyaho@nokia.com> wrote:
> <snip>
>> Yes, it's .flac (or .fla in a pinch) for FLAC.
> Oh oh! .fla is a clash with Adobe flash files:(

Well, that would have been for truly legacy systems only. :) However, when
multiple file extensions map to the same MIME type, there could be other
conflicts like this.
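
To make the kind of clash concrete, here is a rough sketch of how a tool
could flag extensions claimed by more than one MIME type. The table and the
function name are made up for illustration; the Flash entry in particular is
a placeholder type, not a registered one.

```python
from collections import defaultdict

# Hypothetical table of MIME type -> file extensions. The entries are
# illustrative only; the second type is a placeholder, not a real registration.
TABLE = {
    "audio/x-flac": ["flac", "fla"],
    "application/x-hypothetical-flash-source": ["fla"],
    "image/jpeg": ["jpg", "jpeg"],
}


def find_extension_clashes(table):
    """Return the extensions that are claimed by more than one MIME type."""
    claimed = defaultdict(set)
    for mime_type, extensions in table.items():
        for ext in extensions:
            claimed[ext.lower()].add(mime_type)
    # Keep only the extensions with two or more competing types.
    return {ext: sorted(types) for ext, types in claimed.items() if len(types) > 1}


if __name__ == "__main__":
    for ext, types in find_extension_clashes(TABLE).items():
        print(f".{ext} is claimed by: {', '.join(types)}")
```

Several extensions pointing at one type (.jpg and .jpeg) is harmless here;
only the reverse case, one extension with several types, gets reported.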

> I personally, don't think we should define a format that forces
> developers to include every file just to cover the use case where a
> file extension is missing.

If the extension is missing, it could be on purpose. Or it could be there,
but it could be just plain wrong, or ambiguous (think .jpg vs. .jpeg, or
.htm vs. .html). The concept I envisioned is somewhat similar to the index
of a JAR file [1].

> Also, I assume that some tool would be
> needed to generate this metadata list, as I don't see any developer
> ever doing this by hand because:
> 
>    1. a widget could contain hundreds of files.
>    2. file names with spaces, and possibly other characters, would
> need to be URL encoded.
>    3. it would be tremendously error prone and hard to maintain.
>    4. developers would wonder why this is not done automatically by
> the widget engine, when they've never had to do it with any other
> widget engine before.

If a widget has hundreds of files, nobody would try to do it by hand anyway
(point #1). If the file name is given as a relative URI inside the package,
it will have to be encoded as UTF-8 and then URL-encoded (point #2). I
guess I envisioned a tool doing the assembly anyway (point #3).
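
As a minimal sketch of the encoding step in point #2, assuming the tool is
written in Python: the standard library's urllib.parse.quote encodes the
string as UTF-8 by default and then percent-encodes it. The function name
and the sample paths are invented for the example.

```python
from urllib.parse import quote


def encode_package_path(path: str) -> str:
    """Percent-encode a package-relative file path for use as a relative URI.

    quote() encodes the text as UTF-8 by default; "/" is kept as-is so the
    directory structure of the path survives.
    """
    return quote(path, safe="/")


if __name__ == "__main__":
    # Hypothetical file names, one with a space and one with non-ASCII letters.
    print(encode_package_path("images/my icon.png"))
    print(encode_package_path("äänet/ding.flac"))
```

A packaging tool could run every file name through something like this
while assembling the list, so nobody has to escape names by hand.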

> If we were going to add a mimetype override file, I would argue we
> should only do it based on file extensions.

Note that I'm not pushing the method I described as *the* solution, but to
me only point #4 of those above is critical. File extensions are by nature
unreliable and ambiguous, but very commonly used as a way (or even the only
way) of recognizing content. A more immediate problem for the spec is that
an extension-based mapping needs all the 'important' file extensions defined
up front, and the list will need to be updated later, perhaps frequently,
depending on how exhaustive the initial list was.

But the Apache-style mapping of file extensions to MIME types probably works
adequately in this context as well.
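
For reference, a rough sketch of what such a mapping looks like in practice,
assuming the usual mime.types layout (one MIME type per line, followed by its
extensions, with "#" starting a comment). The sample entries and the function
names are illustrative, not a proposal for the actual list.

```python
import posixpath

# Sample in the style of an Apache mime.types file. Entries are illustrative.
SAMPLE = """\
# MIME type       extensions
audio/x-flac      flac
image/jpeg        jpg jpeg
text/html         htm html
"""


def parse_mime_types(text):
    """Build an extension -> MIME type dict from mime.types-style text."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            mapping[ext.lower()] = mime_type
    return mapping


def guess_type(filename, mapping, default=None):
    """Look up a file's MIME type by its extension, case-insensitively."""
    ext = posixpath.splitext(filename)[1].lstrip(".").lower()
    return mapping.get(ext, default)


if __name__ == "__main__":
    table = parse_mime_types(SAMPLE)
    for name in ("photo.JPEG", "index.htm", "README"):
        print(name, "->", guess_type(name, table))
```

A file with a missing or unknown extension simply falls through to the
default, which is where content sniffing or a per-file override would have
to take over.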

[1] http://java.sun.com/j2se/1.3/docs/guide/jar/jar.html#JAR%20Index

--Jere

Received on Wednesday, 3 December 2008 09:13:25 UTC