- From: Boris Zbarsky <bzbarsky@MIT.EDU>
- Date: Tue, 15 Sep 2009 08:53:40 -0400
Ian Hickson wrote:
>> Since the whole point of text/plain sniffing is a workaround around a
>> known issue where content is reliably mis-marked as text/plain, and
>> since in this case there is a source of MIME information that's more
>> reliable than that, it's not clear to me why we want to continue
>> sniffing.
>>
>> Of course if there is no @type there is no problem; I'm specifically
>> concerned about the @type="text/plain" case here.
>
> What exactly are you proposing here?
>
>  - Always honour type="" if it's a UA-supported type, ignoring server-
>    provided content-type?
>  - Always honour type="" without sniffing if it matches the server-
>    provided content-type, even if normally that type would be sniffed?
>  - Just honour type="text/plain" regardless of the server type, but for
>    other UA-supported type=""s, use the server type?

My suggestion is to only perform text/plain "is this text or binary" sniffing where it belongs: on the HTTP level, since it's a workaround for a particular HTTP server bug. It shouldn't affect other type metadata. Perform the sniffing such that it detects as either text/plain or application/octet-stream. Then if it's application/octet-stream we'll end up using the @type. Though see below on other sniffing issues.

This does fail to sniff text/plain as the various "non-scriptable" types, but I question how desirable that is anyway, honestly. If we want to preserve this property without clobbering @type="text/plain" then I need to think a bit more about how to specify the behavior here. Maybe your option 2 is what would give that behavior... I can work through it if you'd like.

Your option 1 would be ok if that's what we want (but a change from HTML4 and what UAs at least _try_ to implement now; I'm not sure whether it's desirable on its own).

Your option 3 is a bit too magic for text/plain in @type; unnecessarily so unless we want to go the full option 1 route.

All in my opinion, of course.
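For illustration, the suggestion above might be sketched roughly as follows. This is a minimal sketch, not spec text: the function names, the exact set of "binary" bytes, and the 512-byte window are assumptions made for the example, and a real UA would only sniff for a few specific Content-Type header values.

```python
from typing import Optional

# Assumption: byte values treated as "binary" for the text-or-binary check.
BINARY_BYTES = (
    frozenset(range(0x00, 0x09))
    | {0x0B}
    | frozenset(range(0x0E, 0x1B))
    | frozenset(range(0x1C, 0x20))
)


def sniff_text_or_binary(body: bytes) -> str:
    """HTTP-level workaround: the result is only ever one of two types."""
    if any(b in BINARY_BYTES for b in body[:512]):
        return "application/octet-stream"
    return "text/plain"


def effective_type(content_type: str, type_attr: Optional[str], body: bytes) -> str:
    """Hypothetical helper: pick the type used for an <object>-like element."""
    http_type = content_type
    if content_type == "text/plain":
        # Sniffing stays at the HTTP level and does not touch @type.
        http_type = sniff_text_or_binary(body)
    if http_type == "application/octet-stream" and type_attr:
        # The server type is uninformative, so fall back to @type.
        return type_attr
    return http_type
```

Under this sketch, a resource served as text/plain with type="text/plain" stays text/plain whether or not it contains binary bytes, which is the case the concern is about.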
>> My concern about text/plain data being sniffed as text/html by your
>> current algorithm (even with the changes you've made) seems to remain
>> unaddressed.
>
> I thought I had. Can you walk me through how anything labeled text/plain
> could get sniffed as text/html with the new text?

Hmm. Assume the type attribute is not set and HTML data is sent as text/plain and contains a "binary byte" in the first 512 bytes (can just stick it in the <title> or something). Also assume no plug-in claims to support the URI's file extension.

At step 3, the resource type is set to text/plain.

At step 4, the resource type is sniffed as application/octet-stream, since text/html is marked as scriptable in [MIMESNIFF].

At step 5, there is no @type, and the resource type is application/octet-stream, so the resource type is changed to unknown.

At step 6, nothing changes since there is no plug-in supporting the URI's file extension.

At step 7, the resource type is "unknown", so it is changed to the "sniffed type of the resource".

Maybe I simply misunderstood this last reference by contrasting it with what step 4 says, and you mean to apply the full sniffing algorithm, including the special cases for text/plain, and not just section 5 of [MIMESNIFF]. In that case there wouldn't be a problem (the data would get sniffed as application/octet-stream). That wasn't quite clear, but I can see now that this is probably what you meant.

-Boris
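The walk-through above can be traced with a small sketch. It models only the steps as described in this message, not the actual spec algorithm: `sniff_unknown` is a hypothetical stand-in for section 5 of [MIMESNIFF], the binary-byte set is an assumption, and the `step7_full` flag selects between the two readings of "sniffed type of the resource".

```python
from typing import Optional

# Assumption: byte values treated as "binary" by the text/plain check.
BINARY_BYTES = (
    frozenset(range(0x00, 0x09))
    | {0x0B}
    | frozenset(range(0x0E, 0x1B))
    | frozenset(range(0x1C, 0x20))
)


def sniff_text_plain(body: bytes) -> str:
    """text/plain special case: text or binary, never a scriptable type."""
    if any(b in BINARY_BYTES for b in body[:512]):
        return "application/octet-stream"
    return "text/plain"


def sniff_unknown(body: bytes) -> str:
    """Hypothetical stand-in for unknown-type sniffing (section 5 only)."""
    head = body[:512].lower()
    if b"<html" in head or b"<title" in head:
        return "text/html"
    return "application/octet-stream"


def resolve(content_type: str, type_attr: Optional[str], body: bytes,
            plugin_type: Optional[str], step7_full: bool) -> str:
    resource_type = content_type                       # step 3
    if resource_type == "text/plain":                  # step 4
        resource_type = sniff_text_plain(body)
    if not type_attr and resource_type == "application/octet-stream":
        resource_type = "unknown"                      # step 5
    if resource_type == "unknown" and plugin_type:     # step 6
        resource_type = plugin_type
    if resource_type == "unknown":                     # step 7
        if step7_full and content_type == "text/plain":
            resource_type = sniff_text_plain(body)     # full algorithm
        else:
            resource_type = sniff_unknown(body)        # section 5 only
    return resource_type
```

With HTML served as text/plain and a binary byte in the `<title>`, the section-5-only reading yields text/html, while the full-algorithm reading yields application/octet-stream, matching the two outcomes discussed above.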
Received on Tuesday, 15 September 2009 05:53:40 UTC