RDF data archives from Michel Dumontier on 2013-12-06 (public-semweb-lifesci@w3.org from December 2013)

From: Michel Dumontier <michel.dumontier@gmail.com>
Date: Thu, 5 Dec 2013 16:05:38 -0800
To: w3c semweb hcls <public-semweb-lifesci@w3.org>, "public-lod@w3.org" <public-lod@w3.org>, SWIG Web <semantic-web@w3.org>, bio2rdf <bio2rdf@googlegroups.com>
Message-ID: <CALcEXf6djw_JkEd-ZYhKSpd42_+ABM9XAhAfS793fBB90jtPQA@mail.gmail.com>

Hi all,
As you may know, Bio2RDF produces RDF dumps of its datasets [1,2]. For
each dataset, we generate a dataset description file (as per [3]; example
[4]) in n-triples format, while the dataset itself consists of one or more
*gzipped* n-triples files. I just noticed that LODStats did not parse these
files correctly [5] when generating the dataset statistics, possibly
because the relevant datahub.io resource metadata assigns them the mime
type "application/x-ntriples".

I'd like to know what mime type we should specify for zipped or gzipped
RDF data.
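
To make the concern concrete, here is a minimal sketch of how a consumer
could avoid depending on the declared mime type at all, by checking for the
gzip magic number before parsing. This is only an illustration, not how
LODStats works; the function names (open_rdf_text, count_statements) are
hypothetical, and the line counting is a crude stand-in for a real
n-triples parser.

```python
import gzip
import io

# The first two bytes of every gzip stream (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def open_rdf_text(raw: bytes) -> io.TextIOWrapper:
    """Return a UTF-8 text stream over raw bytes, gunzipping when the
    gzip magic number is present, regardless of any declared mime type."""
    if raw[:2] == GZIP_MAGIC:
        return io.TextIOWrapper(
            gzip.GzipFile(fileobj=io.BytesIO(raw)), encoding="utf-8"
        )
    return io.TextIOWrapper(io.BytesIO(raw), encoding="utf-8")


def count_statements(raw: bytes) -> int:
    """Count non-blank, non-comment lines (a rough proxy for triples)."""
    with open_rdf_text(raw) as stream:
        return sum(
            1
            for line in stream
            if line.strip() and not line.lstrip().startswith("#")
        )
```

With this approach a plain .nt file and its gzipped counterpart give the
same count, whatever the resource metadata says.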

As we prepare for our next release, we're planning to generate n-quads for
the datasets, thereby linking versioned datasets with their metadata. We
are wondering whether there will be sufficient support for this format.
We are also wondering whether it would be problematic to provide
single-file downloads in tar.gz format.
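
As a rough sketch of what the n-quads idea amounts to at the line level,
each n-triples statement gains a fourth term naming the graph it belongs
to. The helper name triple_to_quad and the example.org graph IRI are
hypothetical, and the sketch assumes one statement per line with no
trailing comment after the final "."; it is not our actual conversion
script.

```python
def triple_to_quad(line: str, graph_iri: str) -> str:
    """Turn one n-triples line into an n-quads line by inserting the
    graph IRI before the terminating '.'.

    Blank lines and comment lines are returned stripped and otherwise
    unchanged. Assumes no trailing comment follows the final '.'.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return stripped
    if not stripped.endswith("."):
        raise ValueError(f"not a terminated n-triples statement: {line!r}")
    body = stripped[:-1].rstrip()
    return f"{body} <{graph_iri}> ."


# Hypothetical graph IRI standing in for a versioned dataset identifier.
GRAPH = "http://example.org/graph/versioned-dataset"
quad = triple_to_quad(
    '<http://example.org/s> <http://example.org/p> "o" .\n', GRAPH
)
```

The graph term is what would let the dataset description refer to a
specific versioned dataset.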

Comments and suggestions are most welcome,

m.


[1] http://bio2rdf.org/datasets
[2] http://download.bio2rdf.org/
[3] https://github.com/bio2rdf/bio2rdf-scripts/wiki/Bio2RDF-Dataset-Provenance
[4] http://download.bio2rdf.org/current/affymetrix/bio2rdf-affymetrix-20121004.nt
[5] http://stats.lod2.eu/rdfdocs?search=bio2rdf

-- 
Michel Dumontier
Associate Professor of Medicine (Biomedical Informatics), Stanford University
Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
http://dumontierlab.com

Received on Friday, 6 December 2013 00:06:27 UTC