[LLD] Notes from Metadata for data discovery and dataset description using SPARQL

I am forwarding the notes from last week's LLD metadata teleconference. I
will send the reminder for today's meeting at 11AM ET / 5PM CET from the
HCLS calendar shortly. Michel and Alasdair will present a model for
provenance, versioning, format and availability as the main agenda item.
Michel will send a fuzebox invitation to the lists for info on how to join.

Thanks to Alasdair for scribing last Tuesday!

Cheers,
Scott

Paste of notes (pdf version attached):

Michel and Alasdair have a model proposal for dataset versions and
formats; they are thrashing out the final details and will present it as
the first item at the next call.

Focus on provenance and origin:

Version: literal
Dates: Full date time with timezone
Modified by: URI for a person
Source: Needs to be present and specify the version/date of the source
Prior version: point to URI
Superseding version: Maintenance issue; can be inferred; provenance
vocabularies only point backwards
Subset/superset: only assert that you are a subset of another one
Frequency of change: estimate; use a URI for the value, e.g. Dublin Core
Latency of change: time it takes for changes in the raw data to appear in
the derived dataset. Very specialised and probably not to be included
Created with: realised we need to point to the tool that was used to
generate the dataset; for D2R or Bio2RDF in particular, point to the
versions of the scripts

Aggregation of datasets is covered by source/derivation: simply include
multiple sources
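
To make the list above concrete, here is a rough Python sketch of a record
carrying these provenance properties. It is only an illustration: the class
name DatasetVersion, the field names, the helper infer_superseding and the
example.org URIs are all assumptions, not the model that will be presented
on the call. It shows two points from the notes: aggregation is just a
record with several sources, and a superseding version can be inferred by
inverting the backward-pointing prior-version links.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass
class DatasetVersion:
    """Hypothetical record; field names are assumptions, not an agreed model."""
    uri: str                             # identifier of this dataset version
    version: str                         # Version: a literal
    modified: datetime                   # Dates: full date-time with timezone
    modified_by: str                     # Modified by: URI for a person
    sources: List[str]                   # Source(s), each naming a version/date
    prior_version: Optional[str] = None  # Prior version: URI, points backwards
    subset_of: Optional[str] = None      # only the subset direction is asserted
    created_with: Optional[str] = None   # tool or script version used

    def __post_init__(self) -> None:
        # The notes ask for a timezone, so reject naive timestamps.
        if self.modified.tzinfo is None or self.modified.utcoffset() is None:
            raise ValueError("modified must carry a timezone")
        # The notes say a source needs to be present.
        if not self.sources:
            raise ValueError("at least one source is required")


def infer_superseding(records: List[DatasetVersion]) -> Dict[str, List[str]]:
    """Invert prior_version links: maps a version URI to the URIs replacing it."""
    later: Dict[str, List[str]] = {}
    for record in records:
        if record.prior_version is not None:
            later.setdefault(record.prior_version, []).append(record.uri)
    return later


# Example with made-up URIs: v2 aggregates two sources and points back to v1.
v1 = DatasetVersion(
    uri="http://example.org/dataset/v1",
    version="1.0",
    modified=datetime(2013, 4, 1, 12, 0, tzinfo=timezone.utc),
    modified_by="http://example.org/people/alice",
    sources=["http://example.org/source-a/2013-03"],
)
v2 = DatasetVersion(
    uri="http://example.org/dataset/v2",
    version="1.1",
    modified=datetime(2013, 4, 22, 12, 0, tzinfo=timezone.utc),
    modified_by="http://example.org/people/alice",
    sources=[
        "http://example.org/source-a/2013-04",
        "http://example.org/source-b/2013-04",
    ],
    prior_version=v1.uri,
)
superseding = infer_superseding([v1, v2])
```

Only the backward link is stored on each record; the forward (superseding)
direction is computed when needed, so publishing a new version does not
require editing the old description.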

Availability

Availability: raises a maintenance issue; could capture "available until X"
if it is known that it is no longer going to be available. Good for registry
use case, but not necessarily for data publishing. It is a monitoring
property.
Publisher: need to decide a value set: literal/URI
Format: MIME type of the file, not the vocabularies used; EDAM, BioSharing
as candidates
Data item HTML template: to automate access
RDF dump: available in multiple formats
SPARQL endpoint
API: point to a top level page about the API rather than each individual
method
Catalog/registry: point to records in registries. Inverse relationship with
the registries
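
As a rough illustration of the availability properties above, the Python
sketch below groups them into one record. Everything named here is an
assumption for the sake of the example: the class Availability, its field
names, the "{id}" placeholder syntax for the data item template, and the
example.org URLs. The publisher is kept as a plain string because the notes
leave the literal/URI choice open.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Optional


@dataclass
class Availability:
    """Hypothetical record; field names are assumptions, not an agreed model."""
    publisher: str                                       # literal or URI, undecided
    dumps: Dict[str, str] = field(default_factory=dict)  # MIME type -> RDF dump URL
    item_template: Optional[str] = None                  # e.g. ".../item/{id}"
    sparql_endpoint: Optional[str] = None
    api_page: Optional[str] = None                       # top-level API page only
    registry_records: List[str] = field(default_factory=list)
    available_until: Optional[date] = None               # set only when known

    def item_url(self, item_id: str) -> str:
        """Fill the data item template so access can be automated."""
        if self.item_template is None:
            raise ValueError("no data item template recorded")
        return self.item_template.format(id=item_id)

    def is_available(self, on: date) -> bool:
        """Treat the dataset as available unless a known end date has passed."""
        return self.available_until is None or on <= self.available_until


# Example with made-up URLs: one dataset offered as two dump formats.
example = Availability(
    publisher="Example Lab",
    dumps={
        "text/turtle": "http://example.org/dump.ttl",
        "application/rdf+xml": "http://example.org/dump.rdf",
    },
    item_template="http://example.org/item/{id}",
    sparql_endpoint="http://example.org/sparql",
    api_page="http://example.org/api",
    available_until=date(2014, 1, 1),
)
```

Keying the dumps by MIME type follows the note that Format describes the
file type rather than the vocabularies used.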

---------- Forwarded message ----------
From: Alasdair J G Gray <Alasdair.Gray@manchester.ac.uk>
Date: Mon, Apr 29, 2013 at 1:39 PM
Subject: Re: Notes from last meeting?
To: "M. Scott Marshall" <mscottmarshall@gmail.com>
Cc: Michel Dumontier <michel.dumontier@gmail.com>


Hi Scott,

Here are my notes from the last call.

We will indeed start the call off today with a model for describing
datasets. Hopefully it will clear up the resources that we need to describe
and thus the scope of many of the properties that we are discussing.

Alasdair


On 29 Apr 2013, at 12:29, "M. Scott Marshall" <mscottmarshall@gmail.com>
wrote:

Hello Alasdair, Michel,

Would you please send your notes from last week's metadata meeting? I am
about to send out a reminder for today's meeting.

Michel - if you send me the fuzebox invite, I can add it into the reminder.

As I recall from last week, you guys will be presenting the data model
today as the main item on the agenda.

Cheers,
Scott

-- 
M. Scott Marshall, PhD
MAASTRO clinic, http://www.maastro.nl/en/1/
http://eurecaproject.eu/
https://plus.google.com/u/0/114642613065018821852/posts
http://www.linkedin.com/pub/m-scott-marshall/5/464/a22


Dr Alasdair J G Gray
Research Associate
Alasdair.Gray@manchester.ac.uk
+44 161 275 0145

http://www.cs.man.ac.uk/~graya/

Received on Monday, 29 April 2013 12:24:41 UTC