Re: comments on draft-barth-mime-sniffing from Boris Zbarsky on 2009-06-19 (ietf-http-wg@w3.org from April to June 2009)

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Thu, 18 Jun 2009 20:21:37 -0400
To: Joe D Williams <joedwil@earthlink.net>
CC: ietf-http-wg@w3.org, public-html@w3.org
Message-ID: <4A3ADA11.9020901@mit.edu>
Joe D Williams wrote:
> Given that I am looking at it from an author side, how about if the 
> current browsing context or whatever higher order process does the 
> top-level sniffing before it goes to the renderer

To which renderer?

> is called the UA, 
> which contains a nested browsing context.<video> or <audio> element. So 
> in these cases I will call the browser the host UA and the <video> 
> element the home of the handler which actually does the rendering.

OK...  I'm not sure how you're separating these, exactly, but under the 
only reasonable split that comes to mind for me (the UA is what fetches 
the data while the handler is what actually renders it), you need to 
know what kind of data you have so you can select the right handler. 
Agreed, or not?  If not, is the problem one of disagreement on the 
definitions, or on the conclusion?

> I would like to say that data-sniffing is not needed. Can the target 
> files contain any security problems?

The target files (assuming you mean the <video> and <audio> data the 
server is sending) are always assumed to be malicious and trying to 
exploit the UA or the renderer to gain access to the user's system.

> I don't think you will find a MIME string in there unless specially 
> authored into the file.

Of course not.  Data sniffing doesn't look for MIME strings; it looks 
for initial bytes identifying the file type.  Many common file types 
must start with a particular byte sequence.
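
As a rough illustration of the point above (not taken from the draft), here is a minimal sketch of byte-signature sniffing. The table and the function name sniff_container are hypothetical; the two signatures shown are the well-known leading bytes of Ogg pages ("OggS") and RIFF/WAVE files ("RIFF", a 4-byte size, then "WAVE").

```python
# Hypothetical signature table: each entry lists (offset, bytes) pairs
# that must all match at the start of the resource.
SIGNATURES = {
    "Ogg container": [(0, b"OggS")],
    "RIFF/WAVE container": [(0, b"RIFF"), (8, b"WAVE")],
}


def sniff_container(head: bytes):
    """Return a label for the first matching signature, or None."""
    for label, parts in SIGNATURES.items():
        if all(head[off:off + len(sig)] == sig for off, sig in parts):
            return label
    return None
```

Note that nothing here searches for a MIME string inside the data; only the leading bytes are compared.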

> The author should include both the mime and the target url in the 
> element.

Actually, no.  The author can just include the target url in the 
<video>.  If using a <source> element, a type hint (non-authoritative) 
can be provided; this is to be used only for deciding whether to load 
the <source> at all, not for what to do with it after loading it.
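
A minimal sketch of that distinction, assuming a hypothetical set of supported types and a hypothetical helper name should_load_source: the hint only gates the fetch, and plays no part in deciding how the fetched data is handled.

```python
# Hypothetical set of types this UA believes it can play.
SUPPORTED = {"video/ogg", "audio/ogg", "audio/wav"}


def should_load_source(type_hint):
    """Decide whether to fetch a <source> at all.

    The hint is non-authoritative: it is consulted only here, never
    when choosing what to do with the data once it has been loaded.
    """
    if type_hint is None:
        return True  # no hint given, so the only option is to try the load
    # Drop parameters such as codecs="..." and compare the bare type.
    essence = type_hint.split(";", 1)[0].strip().lower()
    return essence in SUPPORTED
```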

> If the type and file extension do not match the content model 
> for the element then fallback.

Uh... What content model?  And how did extension come in?  Extension is 
not a reliable indicator of type; the mapping from types to extensions 
is many-to-many...

> At this time it is (may not) not reliable to rely upon the 
> served ContentType so this cannot be used to reject the file. 

On the contrary, we should in fact be relying on the served 
Content-Type.  Anything else means the server has no control over how we 
handle the content, which can be a huge security problem for the server...

> Data-sniffing of file internal structures by the UA could take place, 
> but I don't think this would give the "standard" handler any additional 
> info.

Sure it would.  It'd tell it which of a number of container formats or 
codecs is being used, if done correctly.

> Literally, the handler should be able to handle any .wav or .ogg 
> spec files.

That's not very reasonable for .wav (see the earlier mention in this 
thread of the number of different WAVE codecs out there).  And it's 
completely unreasonable for .ogg, since that's just a container 
indicator.  Anyone can make up a new codec and stick it in a .ogg.
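
To make the container-versus-codec point concrete, here is a rough sketch that reads the first Ogg page and checks the start of its payload against a few known codec identification headers. The table is deliberately incomplete and the function name first_ogg_codec is hypothetical; the page layout assumed is the standard 27-byte page header followed by the segment table.

```python
# A few known codec identification prefixes; deliberately incomplete,
# since new codecs can be placed in an Ogg container at any time.
CODEC_MAGIC = {
    b"\x01vorbis": "Vorbis",
    b"\x80theora": "Theora",
    b"Speex   ": "Speex",
}


def first_ogg_codec(data: bytes):
    """Name the codec of the first Ogg page, None if the data is not Ogg."""
    if len(data) < 27 or data[:4] != b"OggS":
        return None
    n_segments = data[26]            # byte 26 holds the segment count
    payload_start = 27 + n_segments  # payload follows the segment table
    payload = data[payload_start:payload_start + 8]
    for magic, name in CODEC_MAGIC.items():
        if payload.startswith(magic):
            return name
    return "unknown codec"
```

The "unknown codec" result is the interesting case: the data is a perfectly valid .ogg, and the handler still cannot play it.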

> The handler may need to do some data-sniffing to get set up. 
> but all it needs from the UA is the content.

I think you're drawing the line between "handler" and "UA" very 
arbitrarily here... and I can't tell exactly where you're doing it.  Can 
we please define _very_ clearly where the line is being drawn?

> Again, the load on the author should be no more than getting the type 
> and file name correct for the element and it should just work.

The load on the author should be pointing to the right URI; after that 
it should Just Work assuming the server and UA don't screw up.

>> 2) Sniffing the file extension instead of the content significantly 
>> complicates server-side solutions that, say, send different content 
>> based on query params in the URL.  Of course these should arguably 
>> send a Content-Type header anyway.
> 
> The file is there with the right extension but maybe I can't get control 
> of the server

Sorry, then you have a buggy server.

> and its defaults are not sufficient for the UA sniffer 
> (like maybe I can't use htaccess).

Again, sounds like you have a buggy server.

> Is my document success going to 
> depend on the fact that the server must be configured to send the 
> specified MIME? 

Yes.  It generally is already; why should this case be different?

> If it is served wrong or even absent a MIME am I going 
> to be saved by the UA data-sniffing to figure out that the misserved 
> file is really OK?

Under the current proposal, no.  If it's served wrong or with no MIME 
type, you will get no video.  Then you will realize that you need to fix 
your server setup.
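
Sketched as code, under the assumption of a hypothetical set of playable types and a hypothetical helper name media_decision, the behavior described above is roughly:

```python
# Hypothetical set of types the UA will hand to a media handler.
PLAYABLE = {"video/ogg", "audio/ogg", "audio/wav"}


def media_decision(content_type):
    """Decide what to do with a media response from its Content-Type alone."""
    if not content_type:
        return "no video: Content-Type missing"
    # Ignore parameters and case when comparing the served type.
    essence = content_type.split(";", 1)[0].strip().lower()
    if essence not in PLAYABLE:
        return "no video: served as " + essence
    return "play as " + essence
```

There is no fallback to sniffing in this sketch; a misserved file simply fails, which is what prompts the author to fix the server.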

>> say, send different content based on query params in the URL.
> 
> Are we going to see this with these elements?

I don't see why not.  We see it all the time now with objects, iframes, 
etc...

> We have already negotiated 
> the content model down to a simple set. Many times those query forms are 
> built from scripts that have determined or are determining some client 
> capabilities?

No, often enough they're just built from web pages that give the user a 
list of choices and ask him to pick one, then just have one script on 
the server that reads the data from somewhere (database, filesystem, 
whatever) and returns it.

> The application and content types of these elements seems 
> quite clear as defined by the keystrokes that actually appear in the 
> element code at run time.

I have no idea what you're trying to say here.

-Boris
Received on Friday, 19 June 2009 00:22:18 UTC