Re: [svgwg] A separate MIME type for svgz files is needed (#701)

The issue isn't that Apache can't set content type and content encoding. The issue is that some web servers (I know for a fact that nginx is one such server) use the content type to determine whether a file should be dynamically compressed, and then serve it with the detected mime type plus the correct content encoding (this part works correctly), e.g.

```
 gzip_types text/html application/javascript application/json application/xml+rss image/bmp image/svg+xml text/css text/javascript text/plain text/xml;
```

This directive says "when a resource with a mime type from this list is requested, apply a gzip transform to it, and serve the compressed content with the `Content-Encoding: gzip` header."

A plain Jane `.svg` file has a content type `image/svg+xml` which appears in the example `gzip_types` list above, so it is correctly compressed (significantly bringing down its size, given the byte-level duplication in text -- and particularly in xml -- files).

The problem is that this directive cannot distinguish between a request for `example.com/file.svg` and `example.com/file.svgz` because both have the same content type, so both will be gzipped on the fly. That *would* be OK if there were a separate content type the latter could be served with, but as it stands the dynamically compressed `.svg` has the same `Content-Type` and `Content-Encoding: gzip` headers as the dynamically compressed `.svgz` file. The end result is that the client in both cases receives a response with

```
...
Content-Encoding: gzip
Content-Type: image/svg+xml
...
```

and so has no way of knowing that, for the `.svgz` request, the decoded body still needs to be decompressed a second time to actually be a valid SVG (and not SVGZ) file.
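To make the double compression concrete, here is a rough Python sketch of the exchange. It does not model nginx itself: `serve_with_dynamic_gzip` and `client_decode` are hypothetical stand-ins I made up for "a server that gzips every `image/svg+xml` body" and "a client that undoes exactly what the headers announce".

```python
import gzip

# A tiny SVG document and the same document as an editor would save it to file.svgz.
svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
svgz_on_disk = gzip.compress(svg)


def serve_with_dynamic_gzip(file_bytes):
    """Hypothetical server: gzips any image/svg+xml body on the fly."""
    headers = {"Content-Type": "image/svg+xml", "Content-Encoding": "gzip"}
    return headers, gzip.compress(file_bytes)


def client_decode(headers, body):
    """Hypothetical client: removes only the encodings named in the headers."""
    if headers.get("Content-Encoding") == "gzip":
        body = gzip.decompress(body)
    return body


svg_headers, svg_body = serve_with_dynamic_gzip(svg)
svgz_headers, svgz_body = serve_with_dynamic_gzip(svgz_on_disk)

decoded_svg = client_decode(svg_headers, svg_body)     # plain SVG text
decoded_svgz = client_decode(svgz_headers, svgz_body)  # still gzip bytes

print(svg_headers == svgz_headers)  # the two responses are indistinguishable
print(decoded_svg == svg)
print(decoded_svgz == svg)
```

The headers are identical for both requests, yet only the first decoded body is an SVG document; the second is the original `.svgz` bytes, one gzip layer short of being usable.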

The server should either serve a `.svgz` file as-is with `Content-Encoding: gzip` and `Content-Type: image/svg+xml`, or it may (pointlessly) recompress it on-the-fly and serve it with `Content-Encoding: gzip`, but then it needs to indicate that the decoded response is not a text document but rather still gzip-encoded.

It wouldn't matter if applications with svg support could dynamically distinguish between an svgz and an svg file without having the correct extension, either by having a shared header that indicates the actual encoding (but that would mandate changes to the file format, which is obviously never going to happen) or by simply falling back to gunzipping the data and then attempting once again to decode it as svg+xml if/when the initial decode-as-plain-svg step fails. But, for example (@longsonr), Firefox won't decode an at-rest svgz file as svg, as it doesn't attempt to decode it as gzip.
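The fallback I have in mind would look something like the following Python sketch. It is only an illustration of the idea, not how any browser actually behaves; `parse_svg_with_gzip_fallback` is a name invented for this example.

```python
import gzip
import xml.etree.ElementTree as ET


def parse_svg_with_gzip_fallback(data: bytes) -> ET.Element:
    """Try to parse as plain SVG; if that fails, gunzip once and retry."""
    try:
        return ET.fromstring(data)
    except ET.ParseError:
        # Not well-formed XML as-is, so assume it may be an svgz payload.
        # gzip.decompress raises if the data is not gzip either.
        return ET.fromstring(gzip.decompress(data))


svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
plain_root = parse_svg_with_gzip_fallback(svg)
svgz_root = parse_svg_with_gzip_fallback(gzip.compress(svg))
print(plain_root.tag == svgz_root.tag)
```

With a fallback like this, the extension and the content type stop mattering, but it relies on every consumer implementing the retry.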

Ultimately, the problem is that .svgz files do not have magic header bits to tell whatever client application is opening them that they are gzip-compressed svg files, meaning that without an outwardly visible indicator that is correctly preserved across transformations, the client has no idea that it should decompress the file first. On Windows, where there is no internal concept of mime types, the extension is used to make that distinction. In the web world, extensions have zero significance and the content-type header alone is used to make that decision, and unfortunately it fails in this case.

---
Note that I can configure nginx (via the `mime.types` file) to map requests to `.svgz` files to a different mime type than `image/svg+xml` which would stop it from dynamically compressing `.svgz` files but still have true `.svg` files compressed on-the-fly, but then the response will have `Content-Type: foo` instead of `Content-Type: image/svg+xml` because the same content type that is used to determine dynamic compression is also served to the client.

Personally, I don't *really* care as I'm fully in control of what types we serve. But please understand that this *isn't* a situation shared by any other file type and so comparisons with gzipped versions of other media types are not appropriate. `.svgz` isn't me (or whoever else falls prey to this) deciding on their own to gzip a regular svg file and then give it a `.svgz` extension rather than a `.svg.gz` extension, it's a regular person using a regular application option to save an SVG document into a format + extension that's been around for a long time, with no indication that this would cause problems in certain deployment scenarios.

It's also important to note that there are almost no drawbacks to adding a mime type here. Applications that ignore the mime type and use only the extension to determine how a file is opened will continue to do so. Applications that rely exclusively on the mime type will continue to fail to open the file in this particular case (as it has *never* been possible to decode an svgz file based purely off the mime type without the content-encoding as well).

-- 
GitHub Notification of comment by mqudsi
Please view or discuss this issue at https://github.com/w3c/svgwg/issues/701#issuecomment-505642671 using your GitHub account

Received on Tuesday, 25 June 2019 22:13:23 UTC