Re: [dxwg] How to express distributions provided as compressed files

@nicholascar I actually think there are 2 separate issues and their combination, but even in #54 they are a bit mixed up.
1. Pure compression, e.g. I have an RDF TriG Distribution - one `.trig` file - and I compress it to save space on my web server, creating a `.trig.gz` file, and I want to be able to describe this distribution properly. This can be done in multiple ways combining web server techniques and DCAT. One way could be to serve the `.trig.gz` file directly; then I need to be able to say in DCAT that the distribution is RDF TriG with Media type `application/trig` and that it is compressed using gzip with Media type `application/gzip`. There is the Media type extension `+zip`, which is however not specific enough (zip and gzip are two different things). Another way of doing this is to leave compression to the HTTP server and client (which restricts us to HTTP): the server can use `gzip_static` to serve the `.trig.gz` file from its file system and the client decompresses it transparently in the HTTP layer. This means the Distribution still points to the `.trig` file, the media type is `application/trig`, and the compression is completely transparent to the user.
2. Packaging of files (like `tar` does). This is a separate use case where a set of files (homogeneous or heterogeneous) is packaged into one. The question here is whether we recommend this or not, and if we do, how we describe what is inside. Again, one way would be to say that we only recommend homogeneous packages (e.g. a set of `.xml` files valid against a single XSD), and provide properties for saying that the files inside are `application/xml` and the package is TAR (there is no official Media type for that, unofficially `application/x-tar`, and there is a [file type](http://publications.europa.eu/mdr/resource/authority/file-type/html/filetypes-eng.html) for it). I would disallow (not recommend) having a package of heterogeneous files as one distribution, and recommend splitting them into multiple distributions, so that each can be described properly.
3. Combination of these two. There should be guidance for this, e.g. for a `.tar.gz` file containing a set of XML files. There we need to be able to describe that they are XML files conformant to an XSD schema, packaged using TAR and compressed using gzip.
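
To make the three cases a bit more concrete, here is a minimal Python sketch of the layers a description would need to capture. Everything in it is an assumption for illustration: the `describe` function, the lookup tables and the keys `media_type`, `package_format` and `compress_format` are hypothetical names, not existing DCAT properties, and the tables only cover the file types mentioned above.

```python
import os

# Illustrative lookup tables, limited to the formats discussed above.
COMPRESSION = {".gz": "application/gzip"}
PACKAGING = {".tar": "application/x-tar"}  # unofficial media type
CONTENT = {".trig": "application/trig", ".xml": "application/xml"}


def describe(filename, inner_media_type=None):
    """Split a file name into compression, packaging and content layers.

    The returned keys are hypothetical names for this sketch only.
    """
    description = {
        "media_type": None,
        "package_format": None,
        "compress_format": None,
    }
    stem, ext = os.path.splitext(filename)

    # Outermost layer: compression (case 1).
    if ext in COMPRESSION:
        description["compress_format"] = COMPRESSION[ext]
        stem, ext = os.path.splitext(stem)

    # Next layer: packaging (case 2).
    if ext in PACKAGING:
        description["package_format"] = PACKAGING[ext]
        # The file name cannot tell us what is inside the package,
        # so the publisher has to state it explicitly.
        description["media_type"] = inner_media_type
    else:
        description["media_type"] = CONTENT.get(ext, inner_media_type)

    return description


# Case 1: a compressed single file.
print(describe("data.trig.gz"))
# Case 3: a homogeneous package of XML files, packaged and compressed.
print(describe("dump.tar.gz", inner_media_type="application/xml"))
```

The point of the sketch is that for a package the content type has to be supplied by the publisher, which only works cleanly if the package is homogeneous.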


-- 
GitHub Notification of comment by jakubklimek
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/259#issuecomment-399844226 using your GitHub account

Received on Monday, 25 June 2018 06:22:26 UTC