[whatwg] <object> behavior from Boris Zbarsky on 2009-09-15 (public-whatwg-archive@w3.org from September 2009)

From: Boris Zbarsky <bzbarsky@MIT.EDU>
Date: Tue, 15 Sep 2009 08:53:40 -0400
Message-ID: <4AAF8E54.7090909@mit.edu>
Ian Hickson wrote:
>> Since the whole point of text/plain sniffing is a workaround for a 
>> known issue where content is reliably mis-marked as text/plain, and 
>> since in this case there is a source of MIME information that's more 
>> reliable than that, it's not clear to me why we want to continue 
>> sniffing.
>>
>> Of course if there is no @type there is no problem; I'm specifically 
>> concerned about the @type="text/plain" case here.
> 
> What exactly are you proposing here?
> 
>  - Always honour type="" if it's a UA-supported type, ignoring server- 
>    provided content-type?
>  - Always honour type="" without sniffing if it matches the server- 
>    provided content-type, even if normally that type would be sniffed?
>  - Just honour type="text/plain" regardless of the server type, but for
>    other UA-supported type=""s, use the server type?

My suggestion is to perform the text/plain "is this text or binary" 
sniffing only where it belongs: at the HTTP level, since it's a workaround 
for a particular HTTP server bug.  It shouldn't affect other type metadata.

Perform the sniffing such that it detects as either text/plain or 
application/octet-stream.

Then if it's application/octet-stream we'll end up using the @type. 
Though see below on other sniffing issues.
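
To make the suggestion concrete, here is a rough Python sketch.  It is 
only an illustration: the function names are made up, the set of 
"binary" bytes is an assumption, and the real text-or-binary check has 
more cases than this (it is reduced here to a scan of the first 512 
bytes).

```python
# Bytes treated as "binary" when scanning the start of the body.
# ASSUMPTION: this set is for illustration; check the sniffing spec for
# the authoritative definition.
BINARY_BYTES = frozenset(
    [*range(0x00, 0x09), 0x0B, *range(0x0E, 0x1B), *range(0x1C, 0x20)]
)


def sniff_text_or_binary(body: bytes) -> str:
    """HTTP-level check: the result is only ever one of two types."""
    if any(b in BINARY_BYTES for b in body[:512]):
        return "application/octet-stream"
    return "text/plain"


def effective_type(http_type, type_attr, body):
    """Hypothetical helper: pick the type used for the <object> data."""
    # Sniff only what the server labeled text/plain; leave other types alone.
    if http_type == "text/plain":
        http_type = sniff_text_or_binary(body)
    # Fall back to @type only when HTTP gives no useful type information.
    if http_type == "application/octet-stream" and type_attr:
        return type_attr
    return http_type
```

With this shape, data served as text/plain and embedded with 
@type="text/plain" ends up as text/plain whether or not the body 
contains binary bytes, and @type is never consulted when the server 
sends a type other than text/plain or application/octet-stream.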

This does fail to sniff text/plain as the various "non-scriptable" 
types, but I question how desirable that is anyway, honestly.  If we 
want to preserve this property without clobbering @type="text/plain" 
then I need to think a bit more about how to specify the behavior here.

Maybe your option 2 is what would give that behavior... I can work 
through it if you'd like.

Your option 1 would be ok if that's what we want (but a change from 
HTML4 and what UAs at least _try_ to implement now; I'm not sure whether 
it's desirable on its own).  Your option 3 is a bit too magic for 
text/plain in @type; unnecessarily so unless we want to go the full 
option 1 route.  All in my opinion, of course.

>> My concern about text/plain data being sniffed as text/html by your 
>> current algorithm (even with the changes you've made) seems to remain 
>> unaddressed.
> 
> I thought I had. Can you walk me through how anything labeled text/plain 
> could get sniffed as text/html with the new text?

Hmm.  Assume the type attribute is not set and HTML data is sent as 
text/plain and contains a "binary byte" in the first 512 bytes (can just 
stick it in the <title> or something).  Also assume no plug-in claims to 
support the URI's file extension.

At step 3, the resource type is set to text/plain.

At step 4, the resource type is sniffed as application/octet-stream, 
since text/html is marked as scriptable in [MIMESNIFF].

At step 5, there is no @type, and the resource type is 
application/octet-stream, so the resource type is changed to unknown.

At step 6, nothing changes since there is no plug-in supporting the 
URI's file extension.

At step 7, the resource type is "unknown", so it is changed to the 
"sniffed type of the resource".

Maybe I simply misunderstood this last reference by contrasting it with 
what step 4 says, and you mean to apply the full sniffing algorithm, 
including the special cases for text/plain, and not just section 5 of 
[MIMESNIFF].  In that case there wouldn't be a problem (the data would 
get sniffed as application/octet-stream).  That wasn't quite clear, but 
I can see now that this is probably what you meant.
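
For what it's worth, the two readings of step 7 can be traced with a 
small Python sketch.  Everything here is a stand-in: the function names 
are hypothetical, the binary-byte set is an assumption, and 
sniff_unknown() is a toy replacement for the unknown-type sniffing that 
only looks for a few HTML markers.  It covers only the case described 
above (no @type, no plug-in for the file extension).

```python
# ASSUMPTION: illustrative set of "binary" bytes, not copied from the spec.
BINARY_BYTES = frozenset(
    [*range(0x00, 0x09), 0x0B, *range(0x0E, 0x1B), *range(0x1C, 0x20)]
)


def sniff_text_plain(body: bytes) -> str:
    """text/plain special case: the result is text or binary, nothing else."""
    if any(b in BINARY_BYTES for b in body[:512]):
        return "application/octet-stream"
    return "text/plain"


def sniff_unknown(body: bytes) -> str:
    """Toy stand-in for unknown-type sniffing; it only recognizes HTML."""
    head = body[:512].lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html", b"<title")):
        return "text/html"
    return "application/octet-stream"


def resolve_without_type(server_type: str, body: bytes, step7_full: bool) -> str:
    """Trace steps 3 to 7 with no @type and no plug-in for the extension."""
    resource_type = server_type                      # step 3
    if resource_type == "text/plain":                # step 4
        resource_type = sniff_text_plain(body)
    if resource_type == "application/octet-stream":  # step 5 (no @type)
        resource_type = "unknown"
    # step 6: no plug-in claims the extension, so nothing changes
    if resource_type == "unknown":                   # step 7
        if step7_full and server_type == "text/plain":
            # Full algorithm, text/plain special case included.
            resource_type = sniff_text_plain(body)
        else:
            # Unknown-type sniffing only.
            resource_type = sniff_unknown(body)
    return resource_type


html_with_binary_byte = b"<html><title>\x01</title></html>"
print(resolve_without_type("text/plain", html_with_binary_byte, False))
print(resolve_without_type("text/plain", html_with_binary_byte, True))
```

Under the first reading the data comes out as text/html, which is the 
case I was worried about; under the second it stays 
application/octet-stream.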

-Boris
Received on Tuesday, 15 September 2009 05:53:40 UTC