Re: [whatwg] How to determine content-type of file: protocol from Gordon P. Hemsley on 2014-07-27 (public-whatwg-archive@w3.org from July 2014)

From: Gordon P. Hemsley <me@gphemsley.org>
Date: Sun, 27 Jul 2014 18:34:48 -0400
To: duanyao <duanyao@ustc.edu>, whatwg@whatwg.org
Message-ID: <53D57E88.1030108@gphemsley.org>
Sorry for the delay in responding. Your message fell through the cracks 
in my e-mail filters.

On 07/17/2014 08:26 AM, duanyao wrote:
> Hi,
>
> My first question is about a rule in MIME Sniffing specification (http://mimesniff.spec.whatwg.org):
>
>     5.1 Interpreting the resource metadata
>     ...
>     If the resource is retrieved directly from the file system, set supplied-type to the MIME type
>     provided by the file system.
>
> As far as I know, no mainstream file system records a MIME type for files. Does the spec actually mean "provided by the operating system" or
> "provided by the file name extension"?

Yeah, you've hit a known (though apparently unrecorded) bug in the spec, 
originally pointed out to me by Boris Zbarsky via IRC many months ago. 
The intent here is basically just "whatever the computer says it 
is", whether that comes from the file system, the operating system, or 
something else, and whether it relies on magic bytes, file extensions, or 
some other mechanism.

In other words, feel free to read that as "the correct behavior is 
undefined/unknown" at this point.
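
To make "whatever the computer says it is" a bit more concrete, here is a 
minimal sketch of one way a supplied type could be derived for a local 
file. Everything in it is an assumption for illustration only: the function 
names, the tiny extension table, the two byte signatures, and the choice to 
try the extension before the magic bytes are all hypothetical and are not 
taken from the spec, which does not define this behavior.

```typescript
// Hypothetical sketch; none of these names or tables come from mimesniff.

// Tiny extension table. A real platform would consult its own registry.
const EXTENSION_TYPES: Array<[string, string]> = [
  ["htm", "text/html"],
  ["html", "text/html"],
  ["xhtml", "application/xhtml+xml"],
  ["xml", "application/xml"],
  ["png", "image/png"],
];

// A few leading-byte signatures ("magic bytes"), checked in order.
const SIGNATURES: Array<{ bytes: number[]; type: string }> = [
  // PNG file signature
  { bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a], type: "image/png" },
  // "%PDF-"
  { bytes: [0x25, 0x50, 0x44, 0x46, 0x2d], type: "application/pdf" },
];

// Look up the (case-insensitive) extension after the last dot, if any.
function typeFromExtension(fileName: string): string | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0 || dot === fileName.length - 1) {
    return null; // no extension, or the name ends with a dot
  }
  const wanted = fileName.slice(dot + 1).toLowerCase();
  for (const [ext, type] of EXTENSION_TYPES) {
    if (ext === wanted) {
      return type;
    }
  }
  return null;
}

// Compare the first bytes of the file against each known signature.
function typeFromMagicBytes(head: Uint8Array): string | null {
  for (const sig of SIGNATURES) {
    if (head.length >= sig.bytes.length && sig.bytes.every((b, i) => head[i] === b)) {
      return sig.type;
    }
  }
  return null;
}

// Assumed policy: extension first, magic bytes as a fallback.
// null means "the computer has no opinion".
function suppliedTypeForLocalFile(fileName: string, head: Uint8Array): string | null {
  return typeFromExtension(fileName) ?? typeFromMagicBytes(head);
}
```

With this sketch, suppliedTypeForLocalFile("page.xhtml", new Uint8Array(0)) 
gives "application/xhtml+xml", and a file with no extension and no 
recognized signature gives null. Reversing the order of the two lookups 
would be just as defensible.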

> My second question is: does the above rule apply equally to fetching static resources (top level, iframe, img, etc.) and to XMLHttpRequest?
>
> It seems all browsers try to figure out the actual type of local static resources, so that .htm and .xhtml files are rendered as HTML and XHTML respectively.
> So far, so good.
>
> But when it comes to XHR, things are different.
>
> Firefox (31) sets the Content-Type header to 'application/xml' for local files of any type; if xhr.responseType = 'document' is set, the response is parsed as XML;
> and if xhr.responseType = 'blob' is set, blob.type is always 'application/xml'. This differs significantly from the static fetching behavior.
>
> Chromium (34) sets the Content-Type header to null for local files of any type; but if xhr.responseType = 'document' is set, the response is parsed according to its actual type,
> i.e. .htm as HTML and .xhtml as XHTML; and if xhr.responseType = 'blob' is set, blob.type is the file's actual type, i.e. 'text/html' for .htm and 'application/xhtml+xml'
> for .xhtml. This is similar to the static fetching behavior, but the Content-Type header is missing.
>
> I think rule 5.1 should be applied consistently to both static fetching and XHR. Browsers should set the Content-Type header to a local file's actual type for XHR, and interpret
> it accordingly. But Firefox developers think this would break existing code that already relies on Firefox's behavior
> (see https://bugzilla.mozilla.org/show_bug.cgi?id=1037762).
>
> What do you think?
>
> Regards,
>      Duan Yao.
>
>

Anne's the person to ask about XHR first, I think. I don't want to make 
any judgements or claims until I hear his view on the situation.

That being said, I created the Contexts wiki article [1] and began 
splitting up the mimesniff spec according to contexts [2] in an effort 
to clarify this situation and make sure that all bases were covered. 
It's still a work in progress, awaiting feedback from implementers and 
other spec writers.

I agree that there's a hole in how mimesniff, XHR, and Contexts 
intersect, and I'll be happy to update mimesniff to fill it, if that's 
determined to be the best course of action.

HTH,
Gordon

[1] http://wiki.whatwg.org/wiki/Contexts
[2] http://mimesniff.spec.whatwg.org/#context-specific-sniffing

-- 
Gordon P. Hemsley
me@gphemsley.org
http://gphemsley.org/
Received on Sunday, 27 July 2014 22:35:15 UTC