
RDF data archives

From: Michel Dumontier <michel.dumontier@gmail.com>
Date: Thu, 5 Dec 2013 16:05:38 -0800
Message-ID: <CALcEXf6djw_JkEd-ZYhKSpd42_+ABM9XAhAfS793fBB90jtPQA@mail.gmail.com>
To: w3c semweb hcls <public-semweb-lifesci@w3.org>, "public-lod@w3.org" <public-lod@w3.org>, SWIG Web <semantic-web@w3.org>, bio2rdf <bio2rdf@googlegroups.com>
Hi all,
As you may know, Bio2RDF publishes RDF dumps of its datasets [1,2]. For
each dataset, we generate a dataset description file (as per [3]; example
[4]) in N-Triples format, while the dataset itself consists of one or
more *gzipped* N-Triples files. I just noticed that LODStats did not
parse these files correctly [5] when generating the dataset statistics,
perhaps because "application/x-ntriples" is assigned as the format in the
relevant datahub.io resource metadata.

I'd like to know what MIME type we should specify for zipped or gzipped
RDF data.
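
To illustrate the failure mode described above, here is a minimal Python
sketch of how a consumer might cope with a gzipped file that is labelled
with a plain N-Triples media type: it checks for the gzip magic bytes
rather than trusting the declared type. This is only an assumption about
what a tolerant client could do, not a description of how LODStats works.
The function names and the example.org IRIs are hypothetical, and the
statement counting is deliberately naive (one statement per non-comment
line).

```python
import gzip
import io

# The first two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def open_ntriples(raw: bytes) -> io.TextIOBase:
    """Return a text stream over N-Triples data, decompressing if gzipped.

    Hypothetical helper: it ignores any declared media type and looks
    only at the leading bytes of the payload.
    """
    if raw[:2] == GZIP_MAGIC:
        compressed = gzip.GzipFile(fileobj=io.BytesIO(raw))
        return io.TextIOWrapper(compressed, encoding="utf-8")
    return io.StringIO(raw.decode("utf-8"))


def count_statements(raw: bytes) -> int:
    """Count non-empty, non-comment lines (naive: one statement per line)."""
    with open_ntriples(raw) as stream:
        return sum(
            1
            for line in stream
            if line.strip() and not line.lstrip().startswith("#")
        )


# Made-up sample data for illustration only.
sample = (
    b"<http://example.org/s> <http://example.org/p> <http://example.org/o> .\n"
    b"# a comment line\n"
    b'<http://example.org/s> <http://example.org/p> "literal" .\n'
)

plain_count = count_statements(sample)
gzipped_count = count_statements(gzip.compress(sample))
```

Both calls should report the same number of statements, whatever media
type the resource metadata declares.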

As we prepare for our next release, we're planning to generate N-Quads for
the datasets, thereby linking versioned datasets with their metadata. We
are wondering whether there will be sufficient support for this format.
We are also wondering whether it would be problematic to provide
single-file downloads in tar.gz format.
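
As a rough sketch of what consuming such a download could look like, the
Python below builds a small tar.gz archive in memory and then streams
through its N-Quads members, collecting the graph label of each statement.
Everything in it is an assumption for illustration: the member names, the
example.org IRIs, and the function names are hypothetical, and the parsing
is naive (it assumes every statement carries a graph label and simply
takes the last whitespace-separated term before the final dot).

```python
import io
import tarfile


def make_archive(files: dict[str, bytes]) -> bytes:
    """Pack name -> content pairs into an in-memory tar.gz archive."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def graphs_per_member(archive: bytes) -> dict[str, set[str]]:
    """Map each .nq member to the set of graph labels it mentions.

    Naive assumption: every statement has a graph label, so the last
    token before the terminating "." is that label.
    """
    result: dict[str, set[str]] = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r:gz") as tar:
        for member in tar:
            if not member.isfile() or not member.name.endswith(".nq"):
                continue
            graphs: set[str] = set()
            for raw_line in tar.extractfile(member):
                text = raw_line.decode("utf-8").strip()
                if not text or text.startswith("#"):
                    continue
                graphs.add(text.rstrip(".").split()[-1])
            result[member.name] = graphs
    return result


# Hypothetical content: one data graph, plus a provenance graph whose
# statement describes the data graph.
files = {
    "dataset.nq": (
        b'<http://example.org/s> <http://example.org/p> "v 1" '
        b"<http://example.org/graph/release-1> .\n"
    ),
    "provenance.nq": (
        b"<http://example.org/graph/release-1> "
        b'<http://purl.org/dc/terms/created> "2013-12-05" '
        b"<http://example.org/graph/provenance> .\n"
    ),
}

summary = graphs_per_member(make_archive(files))
```

The point of the design is that the graph label of the data statements
can reappear as the subject of the metadata statements, which is how a
single archive could tie a versioned dataset to its description.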

Comments and suggestions are most welcome,

m.


[1] http://bio2rdf.org/datasets
[2] http://download.bio2rdf.org/
[3] https://github.com/bio2rdf/bio2rdf-scripts/wiki/Bio2RDF-Dataset-Provenance
[4] http://download.bio2rdf.org/current/affymetrix/bio2rdf-affymetrix-20121004.nt
[5] http://stats.lod2.eu/rdfdocs?search=bio2rdf

-- 
Michel Dumontier
Associate Professor of Medicine (Biomedical Informatics), Stanford
University
Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
http://dumontierlab.com
Received on Friday, 6 December 2013 00:06:27 UTC
