- From: Alex Deymo <deymo@google.com>
- Date: Fri, 21 Aug 2020 16:36:46 +0200
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
- Message-ID: <CAGd9gwjWriKCRNNDkjfx0ME0L8v5qT3mO=6X1U2tDNYyXLtZ9w@mail.gmail.com>
On Fri, 21 Aug 2020 at 14:27, Julian Reschke <julian.reschke@gmx.de> wrote:

> > However, on top of that, the lossless recompression of JPEG files allows
> > you to get this ~20% gain for existing files. When you deploy a new
> > lossy codec there is the question of what to do with the existing
> > images. If you have a website with photos and want to convert your
> > already lossy JPEG files to a new codec to save storage and bandwidth
> > and you decide to decode them to pixels and encode them back to the new
> > format you will end up with more artifacts or worse compression density
> > trying to accurately represent the JPEG artifacts in the new codec,
> > whatever the codec is. It's impractical to do this lossy transcoding to
> > a new codec at large scale on existing images, each application would
> > need to evaluate whether they want to do this for existing images. This
> > story is different if you start with a large and high quality image
> > (like a JPEG from a camera) and want to encode in a smaller form for the
> > web, since there you already have a high quality file.
>
> That makes it sound a bit as if a losslessly-re-encoded JPG file is not
> a valid JXL file. Is that the case?

A losslessly recompressed JPEG is a valid JXL file. There's value in keeping your JPEG files as losslessly recompressed versions outside the Content-Encoding world (for example, converting the existing library on your hard drive).

What I meant here is that if you started with a large, high-quality image (JPEG or RAW) and a long time ago encoded it to a lower-resolution or lower-quality JPEG for the web application, you introduced specific JPEG artifacts and discarded information about the original file. In some sense, the damage to the image is done.
If you already did this, your options for compressing the file further are limited: you don't know what the original file looked like, so a new codec may end up accurately reproducing JPEG artifacts instead of accurately reproducing the original image features. This is where lossless recompression is a good idea. If instead you still have the original file, you can produce a lower-quality or lower-resolution JXL that is visually similar to the original file (not to the low-resolution JPEG of the previous case). This gives you a better compression ratio for the visual quality, but it does not give you a JPEG file right away.

What's not true is the converse statement, and maybe that's where the confusion is. Not every JXL is a losslessly re-encoded JPEG. You can always decode any JXL to pixels and encode it back to JPEG, but the file you end up with depends largely on how you encode back to JPEG. The lossless recompression feature limits the options when encoding the JXL and adds extra information so that a specific JPEG file can be reproduced deterministically.

> ...
> > I think the only shocking thing about a content-encoding for JPEGs is
> > that it can't encode any arbitrary file only JPEGs, but if you look at
> > "general purpose" compressors like Brotli they still can't compress to a
> > smaller file every file; many binary files that are already compressed
> > like .zip or even a JPEG files (unless they have a large ICC) won't
> > compress to a smaller file so you just don't do it even if Brotli is
> > able to compress them to a ~similar size file.
> > ...
>
> That's indeed a concern. For the other currently registered encodings,
> you *can* apply them, but they do not necessarily help.
>
> This one can't be applied to any file type.
> One way to address this
> would be to tune the format that it *can* handle any file type (by just
> adding a tiny wrapper around it and preserving the actual octet stream
> within).

Yes, you could add a tiny frame around it to say whether the payload was losslessly recompressed or not (paying maybe ~1 more byte), but isn't this basically what the Content-Encoding header in the response is for anyway? I don't see an application where this frame would help: the server side is not forced to use the content-encoding, and sending a file wrapped in another format that adds no benefit would be a bit of a waste:

1. If we don't have this frame, you call the function that does the lossless encoding; if it returns an error (for example because the file is not a JPEG), you don't set Content-Encoding to jxl.
2. If we do have this frame, you always set the content encoding to jxl, and the function that does the encoding follows exactly the same logic but stores the "jxl or raw" bit of information in the first byte, depending on whether it was able to re-encode the file.

In either case there is very little difference in how much you can send to the client before the encoding is done, and in general you know very quickly whether the file can be encoded or not. Maybe all we need is a function that tells very quickly whether we *can* encode it. I think this is possible and relatively easy. I understand, though, that this limitation may require changes in how your server integrates a new content encoding, since it does not work the same way that Brotli, for example, was integrated; this can be addressed when implementing support for this content encoding in your server-side software.
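
To make the comparison between the two options concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `recompress_jpeg_lossless` is a hypothetical stand-in that only checks the JPEG signature and returns placeholder bytes (it does not call a real JPEG XL encoder), and the `0x01`/`0x00` flag values for the one-byte frame are invented here, not taken from any draft.

```python
from typing import Dict, Tuple

JPEG_SOI = b"\xff\xd8"  # JPEG files begin with the Start Of Image marker


def recompress_jpeg_lossless(data: bytes) -> bytes:
    """Hypothetical stand-in for a lossless JPEG -> JXL recompressor.

    It only checks the JPEG signature and returns a dummy payload; a real
    implementation would call into an actual JPEG XL encoder.
    """
    if not data.startswith(JPEG_SOI):
        raise ValueError("not a JPEG file")
    return b"<jxl>" + data[2:]  # placeholder bytes, NOT real JXL output


def respond_without_frame(data: bytes) -> Tuple[Dict[str, str], bytes]:
    """Option 1: set Content-Encoding only if recompression succeeds."""
    try:
        return {"Content-Encoding": "jxl"}, recompress_jpeg_lossless(data)
    except ValueError:
        return {}, data  # fall back to sending the body unencoded


def respond_with_frame(data: bytes) -> Tuple[Dict[str, str], bytes]:
    """Option 2: always set Content-Encoding; flag the outcome in byte 0."""
    try:
        body = b"\x01" + recompress_jpeg_lossless(data)  # assumed "jxl" flag
    except ValueError:
        body = b"\x00" + data  # assumed "raw" flag; body grows by one byte
    return {"Content-Encoding": "jxl"}, body
```

Both functions run the same try-then-fall-back logic; the only difference is whether the outcome is signalled in the response header or in an extra leading byte of the body.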
My idea of how this would be implemented is more along the lines of already having the losslessly recompressed jxl file for static content and either serving it as-is on request or decoding it before serving for clients that don't support it, given that you get a significant saving in storage size for static content (similar to the brotli_static setting in the Nginx brotli module).

That said, I should probably mention that according to the spec draft a valid JPEG-1 file is also a valid JXL file, so it is really only the non-JPEG files that you can't re-encode, and that you can tell by looking at the first few bytes. So we already have this frame information for old JPEG-1 vs JXL files.
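
As a rough illustration of telling the cases apart from the first few bytes, and of the static-serving idea, here is a small Python sketch. The function names are hypothetical. The signatures are the commonly documented ones (JPEG Start Of Image `FF D8`, bare JPEG XL codestream `FF 0A`, JPEG XL container box `00 00 00 0C 4A 58 4C 20 0D 0A 87 0A`), and whether the content encoding discussed in this thread would use exactly these is an assumption. A matching JPEG signature is only a fast pre-filter; it does not guarantee that recompression will succeed.

```python
JPEG_SOI = b"\xff\xd8"                                   # JPEG Start Of Image
JXL_CODESTREAM = b"\xff\x0a"                             # bare JXL codestream
JXL_CONTAINER = b"\x00\x00\x00\x0cJXL \x0d\x0a\x87\x0a"  # JXL container box


def sniff(head: bytes) -> str:
    """Classify a file as "jxl", "jpeg" or "other" from its leading bytes."""
    if head.startswith(JXL_CONTAINER) or head.startswith(JXL_CODESTREAM):
        return "jxl"
    if head.startswith(JPEG_SOI):
        return "jpeg"
    return "other"


def can_recompress(head: bytes) -> bool:
    """Quick check: only files that look like JPEG are candidates."""
    return sniff(head) == "jpeg"


def pick_variant(accept_encoding: str) -> str:
    """Pick the stored variant to serve for a given Accept-Encoding value.

    Simplified on purpose: q-values are ignored, whereas a real server
    should treat "jxl;q=0" as "not acceptable".
    """
    codings = [c.split(";")[0].strip().lower() for c in accept_encoding.split(",")]
    return "jxl" if "jxl" in codings else "jpeg"
```

With a precomputed `.jxl` next to each static JPEG, a server could call `pick_variant` on the request header and either send the stored file with `Content-Encoding: jxl` or decode it back to the original JPEG first.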
Received on Friday, 21 August 2020 14:38:03 UTC