- From: Mikko Rantalainen <mikko.rantalainen@peda.net>
- Date: Mon, 13 Sep 2010 16:03:27 +0300
2010-09-11 01:51 EEST: Roger Hågensen:
> On 2010-09-09 09:24, Philip Jägenstedt wrote:
>> For at least WAVE, Ogg and WebM it's not possible as they begin with
>> different magic bytes.
>
> Then why not define a new "magic" that is universal, so that if a proper
> content type is not stated then a sniffing for a standardized universal
> magic is done?
>
> Yep, I'm referring to my BINID proposal.
> If a content type is missing, sniff the first 265 bytes and see if it is
> a BINID, if it is a BINID check if it's a supported/expected one, and it
> is then play away, all is good.

From the "what could possibly go wrong" department of thought:

- a web server blindly prefixes files with BINID if it "knows" the file
  suffix and, as a result, a file ends up with a double BINID (the server
  assumes that no files contain BINID by default)
- a file has a double BINID with contradicting content ids
- some internal API assumes that the caller wants BINID in the stream, while
  the caller assumes that the stream has no BINID; as a result, the caller
  will pass content with BINIDs embedded in the middle of the stream.

Basically, this sounds like all the issues of BOM, for all binary files.

And why do we need this? Because web servers are not behaving correctly
and are sending incorrect Content-Type headers? What makes you believe
that BINID will not be incorrectly used?

(If you really believe that you can force content authors to provide
correct BINIDs, why can you not force content authors to provide correct
Content-Types? Hopefully the goal is not to sniff whether BINIDs seem okay
and ignore "clearly incorrect" ones in the future...)
I'd like to specify that the only cases where a UA is allowed to sniff the
content type are:

- Content-Type header is missing (because the server clearly does not
  know the type), or
- Content-Type is literal "text/plain", "text/plain;
  charset=iso-8859-1", "text/plain; charset=ISO-8859-1" or "text/plain;
  charset=UTF-8" (to deal with the historical mess caused by IIS and
  Apache), or
- Content-Type is literal "application/octet-stream"

(In all these cases, the server clearly has no real knowledge. If a file
is meant for downloading, the server should use the Content-Disposition:
attachment header instead of hacks such as using
"application/x-download" for Content-Type.)

For any other value of Content-Type, honor the type specified at the HTTP
level. And provide no overrides of any kind on any level above HTTP.
Levels above HTTP may provide HINTS about the content that can be used
to aid or override *sniffing*, but nothing should override any
*explicitly specified Content-Type*.

[This is a simplified version of the logic that Mozilla/Firefox
already applies:
http://mxr.mozilla.org/mozilla-central/source/netwerk/streamconv/converters/nsUnknownDecoder.cpp#684]

And for heaven's sake, do not specify any sniffing as "official". Instead,
explicitly specify all sniffing as UA specific and possibly suggest that
UAs should inform the user that the content is broken and that the current
rendering is best effort if any sniffing is required.

--
Mikko
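The rule proposed in this message (sniff only when the header is missing or
is one of a few literal values) could be sketched roughly as follows. This
is only an illustration in Python, not the Mozilla code linked above; the
names may_sniff and SNIFFABLE_CONTENT_TYPES are made up for the example, and
it assumes the caller passes the raw header value as a string, or None when
the header is absent. Comparison is deliberately exact, since the proposal
speaks of literal values.

```python
from typing import Optional

# Literal Content-Type values for which the message above allows sniffing.
# (Hypothetical name; the values are copied from the list in the message.)
SNIFFABLE_CONTENT_TYPES = frozenset({
    "text/plain",
    "text/plain; charset=iso-8859-1",
    "text/plain; charset=ISO-8859-1",
    "text/plain; charset=UTF-8",
    "application/octet-stream",
})


def may_sniff(content_type: Optional[str]) -> bool:
    """Return True if a UA following the proposed rule may sniff.

    content_type is the raw header value, or None if the server sent no
    Content-Type header at all (an assumption of this sketch).
    """
    if content_type is None:
        # Missing header: the server clearly does not know the type.
        return True
    # Exact, case-sensitive match only; no normalisation is attempted.
    return content_type in SNIFFABLE_CONTENT_TYPES
```

With this sketch, may_sniff("text/html") is False, so the declared type is
honored, while may_sniff(None) and may_sniff("application/octet-stream")
are True.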
Received on Monday, 13 September 2010 06:03:27 UTC