Invitation: Linked Life Data @ Weekly from 11:00 to 12:00 on Monday (HCLS)

You have been invited to the following event.

Title: Linked Life Data
[Cancelling May 6 due to Bank Holiday. Suggest moving to next week, May 13.  
Many biohackathon participants will be very busy during the next two days  
for a deadline.]

This meeting will be held using fuze: please join at **Michel will send  
fuzebox info**

Note: There have been various small hiccups with fuzebox so I recommend  
that you join ahead of time if you aren't sure about fuzebox working  
properly. I recently discovered that you will get *no message* if there is  
no Internet connectivity - it will simply do nothing when you start the  
local fuzebox client.

Duration: ~1 hour
(Variable) frequency: ~weekly
Convener: M. Scott Marshall

Session Theme: Metadata for data discovery and dataset description using  
SPARQL

* revisit data model for provenance, versioning, format and availability -  
Michel, Alasdair
   Relevant docs:
   the abstraction:
   - https://docs.google.com/drawings/d/1e6qsxPkc-qKecVTJGJePE1Nuy2sD8Puu-FsEtUtGC-o/edit?disco=AAAAAFWHcnw
   sample implementation using ChEMBL:
   - https://docs.google.com/file/d/0B4y0zfdRviKsS1l2NEttN3pfc1k/edit
* Remaining metadata attributes in working draft - All (time allowing)
    - Working draft:
      https://docs.google.com/spreadsheet/ccc?key=0Aoy0zfdRviKsdFJWTDFpblNXc3BtelhrdEpNYTdvbXc#gid=1
* AOB

**********************************************************************
Notes from Mon. Apr. 29 (Thanks Alasdair!):

Michel presented the dataset description model for unversioned ->  
versioned -> formatted -> data

Overview: https://docs.google.com/drawings/d/1e6qsxPkc-qKecVTJGJePE1Nuy2sD8Puu-FsEtUtGC-o/edit?disco=AAAAAFWHcnw
Detailed: https://docs.google.com/file/d/0B4y0zfdRviKsS1l2NEttN3pfc1k/edit?usp=sharing
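
For readers who have not opened the drawings, the unversioned -> versioned -> formatted chain can be pictured with a minimal Python sketch. This is illustration only, not the model agreed in the call: the class name, field names and example.org URIs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Description:
    """One level of a dataset description (hypothetical structure)."""
    uri: str                                 # placeholder identifier for this level
    level: str                               # "unversioned", "versioned" or "formatted"
    parent: Optional["Description"] = None   # the more general level, if any


def chain(desc):
    """Return the level names from the given description up to the root."""
    levels = []
    while desc is not None:
        levels.append(desc.level)
        desc = desc.parent
    return levels


# Hypothetical URIs; each level links to the more general one above it.
unversioned = Description("http://example.org/dataset/example", "unversioned")
versioned = Description("http://example.org/dataset/example/v1", "versioned", unversioned)
formatted = Description("http://example.org/dataset/example/v1/rdf", "formatted", versioned)

print(" -> ".join(reversed(chain(formatted))))
```

Each level having its own URI is what gives the several entry points mentioned in the discussion.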

Given what the data model allows, only an RDF data item would be able to  
point back to its description.

Discussion on whether to repeat metadata at all levels. If each description  
were complete, queries would be simpler, at the cost of redundancy in the  
metadata. Flexibility of the URIs for each level means you have several  
entry points. Repetition could lead to contradictions in the metadata at  
the different levels. The idea would be to limit the metadata in the  
unversioned part to minimal data that would not change over time.
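
The contradiction risk can be made concrete with a small Python sketch. It assumes, purely for illustration, that each level's metadata is a plain dictionary; the property names and values are made up and are not taken from the working draft.

```python
def find_contradictions(levels):
    """Return properties whose values differ between levels.

    levels: list of (level_name, metadata_dict) pairs.
    Result: {property: {level_name: value}} for conflicting properties only.
    """
    seen = {}
    for name, metadata in levels:
        for prop, value in metadata.items():
            seen.setdefault(prop, {})[name] = value
    return {prop: by_level for prop, by_level in seen.items()
            if len(set(by_level.values())) > 1}


# Made-up metadata repeated at every level, with one inconsistency in "title".
levels = [
    ("unversioned", {"title": "Example DB", "publisher": "Example Org"}),
    ("versioned", {"title": "Example DB", "publisher": "Example Org", "version": "2"}),
    ("formatted", {"title": "Example Database", "format": "RDF"}),
]

conflicts = find_contradictions(levels)
print(conflicts)
```

Keeping only stable properties at the unversioned level shrinks the set of properties that can disagree in this way.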

Abstract data formats, e.g. a triplestore. One option would be to model it  
as a separate versioned/formatted dataset description. However, there is an  
underlying formatted dataset that has been loaded into the underlying  
datastore, and this is what would be described. For a relational database  
accessible through D2R, the description would be a SQL versioned dataset  
with an accessibility protocol of SPARQL. Services point to the  
versioned/formatted dataset that they expose.
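
A rough Python sketch of the D2R case, assuming hypothetical class names, field names and example.org URIs: the description records the format of the underlying data (SQL), while the access protocol (SPARQL) belongs to the service that exposes it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FormattedDataset:
    uri: str          # placeholder identifier
    data_format: str  # format of the underlying data, e.g. "SQL"


@dataclass(frozen=True)
class Service:
    endpoint: str              # placeholder endpoint address
    protocol: str              # how the data is accessed, e.g. "SPARQL"
    exposes: FormattedDataset  # the versioned/formatted dataset behind the service


def describe(service):
    """Summarise what is described (the data) and how it is reached (the service)."""
    return f"{service.exposes.data_format} dataset exposed via {service.protocol}"


relational = FormattedDataset("http://example.org/dataset/example/v1/sql", "SQL")
d2r = Service("http://example.org/sparql", "SPARQL", relational)

print(describe(d2r))
```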

Different syntaxes provide different views of the data. In RDF you can  
capture the relationship between the data and the metadata. This is not  
generally possible in other syntaxes.
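
As a sketch of that point, triples can be written as plain Python tuples, with a data item linking to the description of the dataset it belongs to. The `inDataset` property and all URIs below are hypothetical placeholders, not vocabulary agreed by the group.

```python
EX = "http://example.org/"

# (subject, predicate, object) tuples standing in for RDF triples.
triples = [
    (EX + "record/1", EX + "label", "example record"),
    (EX + "record/1", EX + "inDataset", EX + "dataset/example/v1/rdf"),
    (EX + "dataset/example/v1/rdf", EX + "title", "Example dataset, RDF"),
]


def description_of(item, triples):
    """Follow the hypothetical inDataset link and return the description's triples."""
    datasets = [o for s, p, o in triples if s == item and p == EX + "inDataset"]
    return [(s, p, o) for s, p, o in triples if s in datasets]


print(description_of(EX + "record/1", triples))
```

A CSV or SQL dump of the same record has no comparable built-in way to carry that link alongside the data.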

Sources: points to the exact file that was used so that bugs can be tracked

MIRIAM is a catalog that describes datasets. Catalog was added to the model.

Format types would be captured with URIs.
EDAM is amenable to being extended to cover the file types that we require;  
Nick has been in contact with Jon Ison. http://edamontology.org/page

ToDo: Go through different scenarios, e.g. Bio2RDF, Open PHACTS, MIRIAM and  
see how these look in the model (not using any particular vocabularies). By  
hand generate a full description for a dataset.

ToDo: Revisit properties in the spreadsheet to ensure that they are all  
still required.

Do send questions and comments to the list!

When: Weekly from 11:00 to 12:00 on Monday Eastern Time
Where: #hcls
Calendar: HCLS
Who:
     * w3.hcls@gmail.com - organizer
     * public-semweb-lifesci@w3.org
     * dbcatalog

Event details:  
https://www.google.com/calendar/event?action=VIEW&eid=Z3Y5ODd0aGRidDRmZ2RmbzZucWpuN2F2MW8gcHVibGljLXNlbXdlYi1saWZlc2NpQHczLm9yZw&tok=MTcjdzMuaGNsc0BnbWFpbC5jb20xN2M2YWE2NWRkMmE4NDFkNDEwNWFlZWE2NjY0MTJhYzI0Y2RmZDQw&ctz=America/New_York&hl=en

Received on Monday, 6 May 2013 09:36:42 UTC