[whatwg] Video with MIME type application/octet-stream from Mikko Rantalainen on 2010-09-13 (public-whatwg-archive@w3.org from September 2010)

From: Mikko Rantalainen <mikko.rantalainen@peda.net>
Date: Mon, 13 Sep 2010 16:03:27 +0300
Message-ID: <4C8E211F.4050404@peda.net>
2010-09-11 01:51 EEST: Roger Hågensen:
>  On 2010-09-09 09:24, Philip Jägenstedt wrote:
>> For at least WAVE, Ogg and WebM it's not possible as they begin with
>> different magic bytes.
> 
> Then why not define a new "magic" that is universal, so that if a proper
> content type is not stated then a sniffing for a standardized universal
> magic is done?
> 
> Yep, I'm referring to my BINID proposal.
> If a content type is missing, sniff the first 265 bytes and see if it is
> a BINID, if it is a BINID check if it's a supported/expected one, and it
> is then play away, all is good.

From the "what could possibly go wrong" department of thought:

- a web server blindly prefixes files with BINID if it "knows" the file
suffix and as a result, a file ends up with a double BINID (server
assumes that no files contain BINID by default)
- a file has double BINID with contradicting content ids
- some internal API assumes that the caller wants BINID in the stream,
the caller assumes that the stream has no BINID - as a result, the caller
will pass content with BINIDs embedded in the middle of the stream.

Basically, this sounds like all the issues of BOM for all binary files.

And why do we need this? Because web servers are not behaving correctly
and are sending incorrect Content-Type headers? What makes you believe
that BINID will not be incorrectly used?

(If you really believe that you can force content authors to provide
correct BINIDs, why can't you force content authors to provide correct
Content-Types? Hopefully the goal is not to sniff whether BINIDs seem
okay and ignore "clearly incorrect" ones in the future...)


I'd like to specify that the only cases in which a UA is allowed to
sniff the content type are:

- Content-Type header is missing (because the server clearly does not
know the type), or
- Content-Type is the literal "text/plain", "text/plain;
charset=iso-8859-1", "text/plain; charset=ISO-8859-1" or "text/plain;
charset=UTF-8" (to deal with the historical mess caused by IIS and
Apache), or
- Content-Type is the literal "application/octet-stream"

(In all these cases, the server clearly has no real knowledge. If a file
is meant for downloading, the server should use Content-Disposition:
attachment header instead of hacks such as using
"application/x-download" for Content-Type.)
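
As a rough illustration of the three cases listed above, here is a minimal
Python sketch. The names may_sniff and SNIFFABLE_LITERALS are made up for
this example, and the exact, case-sensitive comparison (after trimming
surrounding whitespace) is only one reading of "literal"; a real UA may
well normalize header values differently.

```python
from typing import Optional

# Literal Content-Type values for which sniffing would be allowed.
# The list is taken from the message; the name is hypothetical.
SNIFFABLE_LITERALS = frozenset({
    "text/plain",
    "text/plain; charset=iso-8859-1",
    "text/plain; charset=ISO-8859-1",
    "text/plain; charset=UTF-8",
    "application/octet-stream",
})


def may_sniff(content_type: Optional[str]) -> bool:
    """Return True if the UA would be allowed to sniff the content type.

    content_type is the raw Content-Type header value, or None when the
    header is missing. Trimming whitespace is an assumption of this sketch.
    """
    if content_type is None:
        # Missing header: the server clearly does not know the type.
        return True
    # Exact match only; any other explicit type must be honored as sent.
    return content_type.strip() in SNIFFABLE_LITERALS
```

With this reading, may_sniff(None) and may_sniff("application/octet-stream")
allow sniffing, while may_sniff("video/ogg") does not, so an explicitly
specified type is never overridden.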

For any other value of Content-Type, honor the type specified at the
HTTP level, and provide no overrides of any kind at any level above
HTTP. Levels above HTTP may provide HINTS about the content that can be
used to aid or override *sniffing*, but nothing should override any
*explicitly specified Content-Type*. [This is a simplified version of
the logic that Mozilla/Firefox already applies:
http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/nsUnknownDecoder.cpp#684]

And for heaven's sake, do not specify any sniffing as "official".
Instead, explicitly specify all sniffing as UA-specific, and possibly
suggest that UAs should inform the user that the content is broken and
that the current rendering is best effort whenever sniffing is required.

-- 
Mikko

Received on Monday, 13 September 2010 06:03:27 UTC