- From: Roger Hågensen <rescator@emsai.net>
- Date: Sat, 11 Sep 2010 00:51:01 +0200
On 2010-09-09 09:24, Philip Jägenstedt wrote:
> On Thu, 09 Sep 2010 02:15:27 +0200, David Singer <singer at apple.com>
> wrote:
>
>>> On Wed, Sep 8, 2010 at 3:13 PM, And Clover <and-py at doxdesk.com> wrote:
>>>> Perhaps I *meant* to serve a non-video
>>>> file with something that looks a fingerprint from a video format at
>>>> the top.
>>>
>>> Anything's possible, but it's vastly more likely that you just made
>>> a mistake.
>>
>> It may be possible to make one file that is valid under two formats.
>> Kinda like those old competitions "write a single file that when
>> compiled and run through as many languages as possible prints "hello,
>> world!" :-).
>
> For at least WAVE, Ogg and WebM it's not possible as they begin with
> different magic bytes.
>

Then why not define a new "magic" that is universal, so that if a proper content type is not stated, the data is sniffed for a standardized universal magic? Yep, I'm referring to my BINID proposal.

If a content type is missing, sniff the first 265 bytes and see if it is a BINID. If it is a BINID, check if it's a supported/expected one, and if it is then play away, all is good.

If a content type is given, then just in case sniff the first 265 bytes anyway and see if it is a BINID. If it is a BINID, check if it's a supported/expected one, and if it is then play away, all is good.

If a content type is missing, and sniffing the first 265 bytes shows it is not a BINID or not a supported one, then it can only be treated as unknown binary and would fail (though in the case of an unsupported BINID the user would be shown what the BINID is, so they won't be fully stuck if they are missing a particular codec or the browser doesn't support it).

If a content type is given, and sniffing the first 265 bytes shows it's not a BINID or not a supported one, then treat it as per the context (video or audio) and hope the video or audio codec layer is able to find out what it is (which is what "should" happen currently, right?).
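The four cases above can be sketched as a small decision function. This is only an illustration: the post does not spell out the BINID byte layout, so the 8-byte marker, the newline-terminated ASCII identifier, the set of supported identifiers, and the names `read_binid` and `decide` are all assumptions made for the example. Only the 265-byte sniff window and the four outcomes come from the text.

```python
from typing import Optional, Tuple

SNIFF_LIMIT = 265                       # sniff window named in the post
ASSUMED_MAGIC = b"BINID\x00\x00\x00"    # hypothetical 8-byte marker, not the real one
SUPPORTED_IDS = {"webm", "ogg", "mp3"}  # hypothetical identifiers


def read_binid(data: bytes) -> Optional[str]:
    """Return the identifier if the sniff window holds a header in the assumed layout."""
    window = data[:SNIFF_LIMIT]
    if not window.startswith(ASSUMED_MAGIC):
        return None
    end = window.find(b"\n", len(ASSUMED_MAGIC))
    if end == -1:
        return None  # no terminator inside the window: treat as "not a BINID"
    try:
        ident = window[len(ASSUMED_MAGIC):end].decode("ascii")
    except UnicodeDecodeError:
        return None
    return ident or None


def decide(content_type: Optional[str], data: bytes) -> Tuple[str, Optional[str]]:
    """Map (content type, sniffed bytes) to an action plus the identifier to display."""
    ident = read_binid(data)
    if ident is not None and ident.lower() in SUPPORTED_IDS:
        return ("play", ident)         # supported BINID, with or without a content type
    if content_type is None:
        return ("fail", ident)         # unknown binary; ident (if any) is shown to the user
    return ("defer_to_codec", ident)   # content type given: let the codec layer try
```

For example, `decide(None, ASSUMED_MAGIC + b"foo\n")` yields `("fail", "foo")`, so the user can still be told that the stream claims to be "foo" even though nothing can play it.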
It would be very easy to add support for something like BINID, as it can be output at the start of a file or stream as the server sends it. A script could even output it, or it could be at the start of the actual file itself, and in the case of live streaming a server could easily add it to the start of the stream even if it's mid-stream. Even a wrongly configured webserver wouldn't be able to mess up the handling of this.

The benefit is that the browser would see "Oh, this is a BINID and it's WebM, I'll pass this on to the video codec then." Or, in the case of <audio>, if the browser sees it is a BINID and it's MP3, it would pass it to the MP3 audio codec. In time something like BINID might even propagate elsewhere beyond just <video> and <audio>.

I'm not saying that BINID must be used, but at least something very close to it (as unknown formats can be shown to a human user and make sense and be searchable), and maybe the first 8 bytes should be constructed slightly differently?

Oh, and although I haven't tested this, I suspect that most current codecs would ignore the first 265 bytes when they sniff for the start of the data anyway, so a BINID would be partially backwards compatible, and in any case it would certainly be easy to patch in support. And the best part is that the browser could easily strip or skip past the BINID when passing the data to the OS or codecs (if they do not support BINID at all), or when saving the audio or video locally per user request.

Something like BINID (short for Binary Identification, actually) is needed, and there is nothing wrong with the HTML5 <video> and <audio> standard defining it. It wouldn't be the first time a web standard has been adopted elsewhere later, and it would surely see adoption outside of this; I certainly would use it elsewhere.
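As a rough sketch of the "strip or skip past" step, a browser-side helper could drop the header before handing the bytes to a codec that knows nothing about BINID. The layout used here (an 8-byte marker followed by a newline-terminated identifier, all inside the first 265 bytes) is an assumption for illustration only, as is the name `strip_binid`; the real proposal may be laid out differently.

```python
SNIFF_LIMIT = 265                     # sniff window named in the post
ASSUMED_MAGIC = b"BINID\x00\x00\x00"  # hypothetical 8-byte marker, not the real one


def strip_binid(data: bytes) -> bytes:
    """Return the payload without the assumed BINID header, or the data unchanged."""
    window = data[:SNIFF_LIMIT]
    if not window.startswith(ASSUMED_MAGIC):
        return data  # no header present: pass the bytes through untouched
    end = window.find(b"\n", len(ASSUMED_MAGIC))
    if end == -1:
        return data  # marker without a terminator in the window: do not guess
    return data[end + 1:]
```

Data without the marker passes through untouched, so the same helper could sit in front of every codec regardless of whether the server added a header.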
I invented BINID for a reason: .*** file extensions just aren't good enough, and sniffing binary files is a real pain, the same pain that the <video> and <audio> discussion here is pointing out right now.

So if sniffing is bad, but sniffing can't be avoided, then why not simply standardize the sniffing by defining a universal, simple and end-user-friendly identifier (the BINID can be displayed to the user, even if unknown/unsupported)? The sniffing would be limited to the first 265 bytes (in the case of the BINID proposal), and if this limited sniffing can't determine what something is, and the context and extra info (like content type) do not clarify what it is or what to do with it, then simply fail and inform the user. It doesn't have to be more complicated than that. As simple as possible, but no simpler. Isn't that the ideal mantra of all coders here?

Remember, I'm not saying you must use BINID (but hey, it's there and fleshed out already). If you must change the name, do so; if you must change the 8 byte sequence, do so. Just make sure it has a max length, and that the "ID" is humanly displayable if the format is unsupported. Just make it into an RFC or something, spec in the HTML standard that it must be supported, and spec how to behave if it's not present (like I pointed out further above), and it's solved as best as is possible. (Unless somebody has an even better idea here, that is?)

And yeah, this kinda stretches beyond the scope of the HTML5 specs, but you'd be swatting two flies at once: solving the sniffing issue with <video> and <audio>, but also the sniffing issue that every OS has had for the last couple of um... decades?! (Poke your OS/filesystem colleagues and ask them what they think of something like this.) Then again, HTML5 is kinda an OS in its own right, being an app platform (not to mention supporting local storage of databases and files even), so maybe it's not that far outside the scope anyway to define something like this?
-- Roger "Rescator" Hågensen. Freelancer - http://EmSai.net/
Received on Friday, 10 September 2010 15:51:01 UTC