Re: Graph metadata example [was: Fwd: RDF data archives] from Yves Raimond on 2013-12-07 (public-rdf-wg@w3.org from December 2013)

From: Yves Raimond <yves.raimond@bbc.co.uk>
Date: Sat, 7 Dec 2013 11:54:57 +0000
To: Pat Hayes <phayes@ihmc.us>
CC: Guus Schreiber <guus.schreiber@vu.nl>, RDF WG <public-rdf-wg@w3.org>
Message-ID: <1386417296.28774.51.camel@RD000190>

On Sat, 2013-12-07 at 01:05 -0600, Pat Hayes wrote:
> On Dec 5, 2013, at 7:13 PM, Guus Schreiber <guus.schreiber@vu.nl> wrote:
> 
> > Just to put the discussion about the one graph metadata triple in perspective:
> > 
> > * Check out reference 3 in the message below [1] (sent today by a W3C chair).
> 
> ? I don't see any references there, it doesn't look like a message (??)
> > 
> > * Also, check the Prov bundle example in Sec. 4.2.3 of the provenance book written by the Prov chairs [2]. BTW, our response to the Prov WG a year ago is here [3].
> 
> That is fine. What they say is "we adopt the convention that the identifier of a bundle ... can be dereferenced to obtain a representation of the bundle." That is, they are using Web machinery of dereferencing to link a 'graph name' to the graph it names, *not* the dataset name-graph pairing convention. That is perfectly compatible with our specs. We could describe this convention as: "bundle identifiers denote what they conventionally identify in HTTP."
> 

If that is fine (and indeed that is the use-case which our example
addresses), then could we add a couple of sentences that make it
explicit that dereferencing http://example.org/bob gives back the four
triples in the example? I thought section 1 made it explicit, but
perhaps we need to restate it alongside the multigraphs example.
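
As a rough illustration of the convention under discussion (a sketch only, not text proposed for the primer): the Python below models the Web as a plain dict, so `WEB`, `dereference` and `name_agrees_with_web` are hypothetical names, and the four triples are placeholders rather than the actual triples of the example.

```python
# Sketch only: the dict below stands in for HTTP dereferencing, and the
# triples are placeholders, not the actual triples of the primer example.
BOB = "http://example.org/bob"

# Hypothetical "Web": what a GET on each IRI would return, as a set of triples.
WEB = {
    BOB: frozenset({
        ("ex:s", "ex:p1", "ex:o1"),
        ("ex:s", "ex:p2", "ex:o2"),
        ("ex:s", "ex:p3", "ex:o3"),
        ("ex:s", "ex:p4", "ex:o4"),
    }),
}


def dereference(iri):
    """Return the graph obtained by dereferencing iri (None if nothing is served)."""
    return WEB.get(iri)


def name_agrees_with_web(dataset, name):
    """True if the graph paired with name in the dataset is the one served at name."""
    return name in dataset and dataset[name] == dereference(name)


# A dataset modelled as a mapping from graph names to graphs (name-graph pairs).
dataset = {BOB: frozenset(WEB[BOB])}

# The same name paired with a different graph: the pairing alone permits this,
# but it does not follow the dereferencing convention.
other = {BOB: frozenset({("ex:s", "ex:p1", "ex:o1")})}

print(name_agrees_with_web(dataset, BOB))
print(name_agrees_with_web(other, BOB))
```

The point of the second dataset is that the name-graph pairing by itself says nothing about what the name dereferences to; the agreement has to be stated as an extra convention.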

Yves

> > This is simply to point out that ignoring practice is not going to help.
> 
> The primer, of necessity, will ignore a large swathe of RDF practice simply because it is a short document. I see no reason why it is obliged to focus attention on this particular (and particularly troublesome) case. 
> 
> Pat
> 
> 
> > An example *with proper caveats* might help a little bit.  Proper references to the Dataset Note will also likely be helpful.
> > 
> > Guus
> > 
> > [1] https://github.com/bio2rdf/bio2rdf-scripts/wiki/Bio2RDF-Dataset-Provenance
> > [2] http://books.google.nl/books?id=8aBeAQAAQBAJ&pg=PT72&lpg=PT72&dq=provenance+trig&source=bl&ots=eardCUYmGt&sig=4EA4vZlWoXR-2o3CbEJwzDEPC4U&hl=en&sa=X&ei=CvqgUub1JOqf0QXSxYGACw&ved=0CFAQ6AEwBA#v=onepage&q=provenance%20trig&f=false
> > [3] http://lists.w3.org/Archives/Public/public-rdf-wg/2012Oct/0208.html
> > 
> > -------- Original Message --------
> > Subject:  RDF data archives
> > Resent-Date:  Fri, 6 Dec 2013 00:06:33 +0000
> > Resent-From:  <semantic-web@w3.org>
> > Date:  Thu, 5 Dec 2013 16:05:38 -0800
> > From:  Michel Dumontier <michel.dumontier@gmail.com>
> > To:  w3c semweb hcls <public-semweb-lifesci@w3.org>, "public-lod@w3.org"
> > <public-lod@w3.org>, SWIG Web <semantic-web@w3.org>, bio2rdf
> > <bio2rdf@googlegroups.com>
> > 
> > 
> > 
> > Hi all,
> >  As you may know, Bio2RDF produces RDF dumps of its RDF datasets [1,2].
> > For each dataset, we generate a dataset description file (as per [3];
> > example [4]) that is in n-triples format, while the dataset is comprised
> > of one or more *gzipped* n-triple files. I just noticed that LODStats
> > did not correctly parse [5] these files to generate the dataset
> > statistics, owing, perhaps, to the assignment of
> > "application/x-ntriples" in the relevant datahub.io <http://datahub.io>
> > resource metadata.
> > I'd like to know what mime type we should specify for zipped, gzipped
> > RDF data.
> > 
> > as we prepare for our next release, we're planning to generate n-quads
> > for the datasets, thereby linking versioned datasets with their
> > metadata. we are wondering whether there will be sufficient support for
> > this format. Also, we are wondering whether it would be problematic to
> > provide single file downloads that are tar.gz  formatted.
> > 
> > comments and suggestions most welcome,
> > 
> > m.
> > 
> > 
> > [1] http://bio2rdf.org/datasets
> > [2] http://download.bio2rdf.org/
> > [3] https://github.com/bio2rdf/bio2rdf-scripts/wiki/Bio2RDF-Dataset-Provenance
> > [4] http://download.bio2rdf.org/current/affymetrix/bio2rdf-affymetrix-20121004.nt
> > [5] http://stats.lod2.eu/rdfdocs?search=bio2rdf
> > 
> > -- 
> > Michel Dumontier
> > Associate Professor of Medicine (Biomedical Informatics), Stanford
> > University
> > Chair, W3C Semantic Web for Health Care and the Life Sciences Interest Group
> > http://dumontierlab.com
> > 
> > 
> > 
> > 
> 
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 home
> 40 South Alcaniz St.            (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile (preferred)
> phayes@ihmc.us       http://www.ihmc.us/users/phayes


Received on Saturday, 7 December 2013 11:55:30 UTC