Re: [w3c/manifest] Why does obtaining not check a MIME type? (#821)

@mgiuca and I took a look in HTTPArchive, specifically the `summary_requests.2020_03_01_mobile` table with the following queries:

```
SELECT resp_content_type, COUNT(resp_content_type) AS total
FROM [httparchive:summary_requests.2020_03_01_mobile]
WHERE url LIKE "%webmanifest"
GROUP BY resp_content_type ORDER BY total DESC
```

```
SELECT resp_content_type, COUNT(resp_content_type) AS total
FROM [httparchive:summary_requests.2020_03_01_mobile]
WHERE url LIKE "%manifest.json"
GROUP BY resp_content_type ORDER BY total DESC
```

Note that this is a bit deficient: it's only capturing manifests that are named `*manifest.json` or `*webmanifest`, but hopefully it's representative enough. We found the following:

- only 201,867 out of 281,507 total requests (71.7%) are served with a JSON MIME type (`application/json`, `text/json`, or a subtype ending in `+json`). The rest are served with either a non-JSON MIME type or none at all
- only 13,405 out of 281,507 total requests (4.8%) are served with the _correct_ MIME type of `application/manifest+json`
- only 11,356 out of 68,784 total requests (16.5%) for a `*webmanifest` file are served with the _correct_ MIME type
- 28,484 out of 68,784 total requests (41.4%) for a `*webmanifest` file are served with _no_ MIME type whatsoever
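
For anyone who wants to re-check the arithmetic, here is a small sketch that recomputes the percentages from the raw counts quoted above. The constant and function names are made up for illustration; only the numbers come from the query results.

```python
# Counts copied from the findings above (HTTP Archive, 2020_03_01_mobile).
TOTAL_REQUESTS = 281_507       # all matching requests
JSON_MIME = 201_867            # served with a JSON MIME type
CORRECT_MIME = 13_405          # served as application/manifest+json
WEBMANIFEST_TOTAL = 68_784     # requests for *webmanifest files
WEBMANIFEST_CORRECT = 11_356   # ...of which served with the correct type
WEBMANIFEST_NO_MIME = 28_484   # ...of which served with no MIME type


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("JSON MIME type:           ", share(JSON_MIME, TOTAL_REQUESTS))
print("not a JSON MIME type:     ", share(TOTAL_REQUESTS - JSON_MIME, TOTAL_REQUESTS))
print("correct MIME type:        ", share(CORRECT_MIME, TOTAL_REQUESTS))
print("webmanifest, correct type:", share(WEBMANIFEST_CORRECT, WEBMANIFEST_TOTAL))
print("webmanifest, no MIME type:", share(WEBMANIFEST_NO_MIME, WEBMANIFEST_TOTAL))
```

The second line printed is the share of requests that would fail a JSON MIME type check.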

As written, the proposal breaks 28.3% of manifests in the query, which is a lot. It's also interesting that over 80% of folks who are using the recommended `webmanifest` file extension do not have their web server configured to also send the recommended MIME type for that extension.
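
To make the breakage concrete, here is a rough sketch of the kind of check under discussion, assuming it would follow the MIME Sniffing Standard's definition of a JSON MIME type. The helper names are hypothetical and the header parsing is deliberately simplified (it only strips parameters and lowercases), so treat it as an illustration rather than what any browser actually implements.

```python
def essence(content_type: str) -> str:
    """Lowercased "type/subtype" with parameters stripped ('' if header is empty).

    Simplified: real MIME type parsing has more rules than this.
    """
    return content_type.split(";", 1)[0].strip().lower()


def is_json_mime_type(content_type: str) -> bool:
    """True if the essence is application/json or text/json,
    or if the subtype ends in "+json"."""
    e = essence(content_type)
    if e in ("application/json", "text/json"):
        return True
    type_, sep, subtype = e.partition("/")
    return bool(type_) and bool(sep) and subtype.endswith("+json")


# A manifest fetch would be rejected whenever this returns False.
for header in ("application/manifest+json",
               "application/json; charset=utf-8",
               "text/plain",
               "application/octet-stream",
               ""):
    print(repr(header), is_json_mime_type(header))
```

Under this sketch a manifest served as `text/plain`, as `application/octet-stream`, or with no `Content-Type` at all would stop loading, which is the 28.3% bucket above.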

@mikewest, achieving good MIME type hygiene is a great goal, and I agree with the aim of locking down as many file types here as possible. However, I think nearly 30% breakage is very high from a spec and implementation perspective, and I stated other reasons for doubting whether a warning / deprecation would be practical in https://github.com/w3c/manifest/issues/821#issuecomment-609539499

Based on these numbers, I think changing the spec is impractical at the current time. I'd welcome other people doing some analysis to make sure I didn't miss anything.

https://github.com/w3c/manifest/issues/821#issuecomment-618225986

Received on Thursday, 23 April 2020 07:18:29 UTC