
[Linked Life Data] Announce: HCLS LLD task force W3C Note, Metadata policies, Monday 5PM CET

From: M. Scott Marshall <mscottmarshall@gmail.com>
Date: Fri, 7 Dec 2012 18:06:51 +0100
Message-ID: <CACHzV2Oea8bLY25bCEygcg5J=TDddqEk92zMaY2jhtNPJrimGg@mail.gmail.com>
To: "Pablo N. Mendes" <pablomendes@gmail.com>, Keith Alexander <keithalexander@keithalexander.co.uk>, "<public-lod@w3.org>" <public-lod@w3.org>, "biohackathon@googlegroups.com" <biohackathon@googlegroups.com>, CKAN discuss <ckan-discuss@lists.okfn.org>, "dbcatalog@googlegroups.com" <dbcatalog@googlegroups.com>, Michel Dumontier <michel.dumontier@gmail.com>, HCLS <public-semweb-lifesci@w3.org>, "Eric Prud'hommeaux" <eric@w3.org>, Jerven Bolleman <jerven.bolleman@isb-sib.ch>, Chris Bizer <chris@bizer.de>, Anja Jentzsch <anja@anjeve.de>
Metadata of Linked Open Drug Data = dbcatalog description as N3 =
namedgraph reflection

CKAN folks invited!

On Monday, Dec. 10, the HCLS IG Linked Life Data task force will hold a
teleconference at 11AM ET / 5PM CET. We will be ratifying
http://www.w3.org/2001/sw/hcls/notes/hcls-rdf-guide/ and discussing
computable data descriptions. Sorry for cross-posting, but this is relevant
to all of the above lists.

Computable data descriptions
We need a standardized approach to disclosing data descriptions for RDF
renderings of datasets (where the master copy is sometimes in another
form). All are welcome. Although our focus is on health care and life
science data, the initial topic will be generic data description
metadata, so it is relevant to the CKAN Datahub.

Our trigger issue is the ongoing migration of linked data from fu-berlin to
UMannheim (still hoping to hear precise status and expected status from
Chris Bizer) and how to better synchronize aggregated metadata such as that
maintained by CKAN or LODD (predecessor of this task force) on the HCLS
wiki.

Due to the potentially expansive nature of the discussion (first one should
be short) and limited teleconference bridge capacity, please let me know if
you plan to attend. Feel free to continue the discussion that we've been
having on the hcls list.

Some propositions:
* Create tools that enable simplified "publishing" of datasets
(automatically add metadata triples to graph itself, with graphURI as
subject)
* Write synchronizer that updates aggregate indexes such as CKAN with
metadata from the graph
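
As a rough illustration of the first proposition, a publishing tool could
emit the description as N-Quads whose subject is the named graph URI, and
store them in that same graph. This is only a sketch under assumptions: the
graph URI, the license URI and the http://example.org/freq/ namespace are
hypothetical placeholders, and using dct:modified, dct:accrualPeriodicity
and dct:license is one possible choice, not something agreed on this list.

```python
# Hypothetical sketch: describe a named graph with statements whose subject
# is the graph URI, stored in that same graph (N-Quads serialization).
DCT = "http://purl.org/dc/terms/"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
FREQ = "http://example.org/freq/"  # hypothetical namespace for frequency terms


def describe_graph(graph_uri, modified, frequency, license_uri):
    """Return N-Quads lines describing the graph, placed in the graph itself."""
    statements = [
        (DCT + "modified", f'"{modified}"^^<{XSD_DATE}>'),
        # The frequency is a term URI, not a string literal.
        (DCT + "accrualPeriodicity", f"<{FREQ}{frequency}>"),
        (DCT + "license", f"<{license_uri}>"),
    ]
    return [f"<{graph_uri}> <{p}> {o} <{graph_uri}> ." for p, o in statements]


quads = describe_graph(
    "http://example.org/graph/drugbank/2012-12",  # hypothetical graph URI
    "2012-12-07",
    "monthly",
    "http://example.org/license/cc-by",  # hypothetical license URI
)
print("\n".join(quads))
```

A synchronizer for an aggregate index could then read these statements back
from the graph instead of relying on manually maintained catalog entries.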

Test case:
How do I write a SPARQL query to retrieve all versions of DrugBank that
have an update frequency that is more frequent than yearly?

Note that there is no assumption of only one version of the RDF. There
could be tens of RDF versions of DrugBank, or more.
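
One way the test case might be answered, assuming each frequency is a term
URI with a known ordering: list the terms that are more frequent than
yearly and hand them to the query in a VALUES clause. The sketch below only
builds the query text. The http://example.org/ URIs are hypothetical, and
linking each RDF version to DrugBank via dct:source is an assumption made
for illustration.

```python
# Hypothetical frequency terms, ordered from most to least frequent.
FREQ = "http://example.org/freq/"  # hypothetical namespace
ORDER = ["hourly", "daily", "weekly", "monthly", "yearly"]


def more_frequent_than(term):
    """Return the term URIs that rank as more frequent than `term`."""
    return [FREQ + t for t in ORDER[: ORDER.index(term)]]


def build_query(source_uri, threshold="yearly"):
    """Build a SPARQL SELECT for graphs derived from `source_uri` whose
    update frequency is more frequent than `threshold`."""
    values = " ".join(f"<{u}>" for u in more_frequent_than(threshold))
    return f"""PREFIX dct: <http://purl.org/dc/terms/>
SELECT ?graph ?freq WHERE {{
  GRAPH ?graph {{
    ?graph dct:source <{source_uri}> ;
           dct:accrualPeriodicity ?freq .
  }}
  VALUES ?freq {{ {values} }}
}}"""


query = build_query("http://example.org/dataset/drugbank")  # hypothetical URI
print(query)
```

Because the pattern matches once per named graph, it returns every RDF
version that qualifies rather than assuming a single one. With an ontology
that encodes the ordering, the enumeration in VALUES could be replaced by
reasoning over the terms.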

Teleconference Information:
Dial-In #: +1.617.761.6200 (Cambridge, MA)
Participant Access Code: 4257 ("HCLS")
IRC Channel: http://irc.w3.org port 6665 channel #HCLS

Agenda:
* ratification of IG Note
http://www.w3.org/2001/sw/hcls/notes/hcls-rdf-guide/ (10 min?) - All
* status of fu-berlin LODD data, actions? (10 min?) - Any!
* approach to create more up-to-date metadata (5 min?) - All
* well-defined standardized metadata to provide information about data
update frequency (10 min?)

-Scott

-- 
M. Scott Marshall, PhD
MAASTRO clinic, http://www.maastro.nl/en/1/
http://eurecaproject.eu/
https://plus.google.com/u/0/114642613065018821852/posts
http://www.linkedin.com/pub/m-scott-marshall/5/464/a22

On Thu, Nov 22, 2012 at 11:51 AM, Pablo N. Mendes <pablomendes@gmail.com> wrote:

>
> FYI, it has been the practice to put the metadata on a CKAN instance
> hosted at TheDataHub.org. There has been work to share the metadata as RDF:
> http://lod2.eu/BlogPost/1095-ckan-rdf-output.html
>
> We have also talked about adding more quality-related metadata (including
> how current the data is, more fine grained descriptions of what is in
> there, what links where, etc.) [1,2,3]. If you are putting together a
> consortium, you may want to try to channel some of that support towards the
> CKAN folks, so that changes can be made on the source (i.e. the catalog
> that holds all of the LOD cloud metadata).
>
> I am cross posting this message to CKAN-discuss to see if anybody picks up
> from there.
>
> Cheers,
> Pablo
>
> [1]
> http://webcache.googleusercontent.com/search?q=cache:cfLMuj9qNbMJ:wiki.ckan.org/Linked_Data+&cd=2&hl=en&ct=clnk
> [2] http://wiki.planet-data.eu/uploads/c/c0/D4.1.pdf
> [3] http://wiki.planet-data.eu/uploads/d/d7/D2.1.pdf
>
>
>
> On Wed, Nov 21, 2012 at 4:35 PM, M. Scott Marshall <mscottmarshall@gmail.com> wrote:
>
>> Thanks Keith. I was aware of VoID and the lov engine. That's the right
>> direction but not the level of detail that we have in mind.
>>
>> In the scenario of a (de-centralized) data marketplace, the
>> metadata/description for a dataset stored in a quad store could be
>> expressed as triples with the named graph URI as the subject.
>>
>> A requirement for unambiguous representation of the dataset description
>> would be that values would come from a namespace, preferably one provided
>> by an ontology so that machine reasoning can be used on the constraints.
>>
>> One test case for update frequency could be:
>>
>> How do I write a SPARQL query to retrieve all versions of DrugBank that
>> have an update frequency that is more frequent than yearly?
>>
>> Note that there is no assumption of only one version of the RDF. There
>> could be tens of versions, or more.
>>
>> where the update frequency could be more specific than the current
>> possibilities in http://purl.org/NET/dady#UpdateFrequency. Presumably,
>> you would have a term URI for each of hourly, daily, weekly, monthly,
>> yearly, etc. making the value machine consumable and 'machine reasonable'.
>> So, the values should not be string literals.
>>
>> BTW, there are other types of information, such as license type that we
>> also would like to encode with precise URIs for the values. These types of
>> information are of great importance to some consumers of the data such as
>> pharmaceutical companies.
>>
>> Cheers,
>> Scott
>>
>>
>> On Wed, Nov 21, 2012 at 3:12 PM, Keith Alexander <keithalexander@keithalexander.co.uk> wrote:
>>
>>> Hi
>>>
>>> On Wed, Nov 21, 2012 at 1:58 PM, M. Scott Marshall <mscottmarshall@gmail.com> wrote:
>>>
>>>> In discussions at the Biohackathon 2011 (Kyoto), we agreed that a
>>>> standard data set description would make it easier to consume distributed
>>>> data such as LOD. We created a wishlist of metadata that we would like to
>>>> be able to consume via SPARQL, including date of last update of RDF
>>>> rendering and date of last update of source data (if the RDF is an
>>>> additional representation of that data source). We also discussed update
>>>> frequency as something that we would like to represent in RDF.
>>>
>>>
>>> See
>>> http://rdfs.org/ns/void
>>> http://www.w3.org/TR/void/
>>>
>>>
>>>
>>>> Does anybody know of a good way of representing periodicity in a
>>>> generic fashion (appropriate ontology/namespace)? Of course, just being
>>>> able to represent hourly, daily, weekly, monthly, annually and provide it
>>>> to software agents via SPARQL would be an improvement on having to ask
>>>> around. :)
>>>>
>>>> http://vocab.deri.ie/dady# ?
>>>
>>> there is also the RSS 1 module
>>> http://web.resource.org/rss/1.0/modules/syndication/
>>>
>>> sy:updatePeriod
>>> "Describes the period over which the channel format is updated.
>>> Acceptable values are: hourly, daily, weekly, monthly, yearly. If omitted,
>>> daily is assumed."
>>>
>>> btw, if you don't know it, http://lov.okfn.org/dataset/lov/ is a really
>>> handy vocabulary search engine.
>>>
>>> Best
>>>
>>> Keith
>>>
>>>
>>>> Cheers,
>>>> Scott
>>>>
>>>> --
>>>> M. Scott Marshall, PhD
>>>> MAASTRO clinic, http://www.maastro.nl/en/1/
>>>> http://eurecaproject.eu/
>>>> https://plus.google.com/u/0/114642613065018821852/posts
>>>> http://www.linkedin.com/pub/m-scott-marshall/5/464/a22
>>>>
>>>>
>>>> On Tue, Nov 20, 2012 at 4:49 PM, Sands Alden Fish <sands@mit.edu> wrote:
>>>>
>>>>> Yes, I'd be curious to know the update frequency as well.  This being
>>>>> from September, 2011, we'd be anticipating a new cut right now.
>>>>>
>>>>>
>>>>>
>>>>> On Nov 20, 2012, at 8:52 AM, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>>>
>>>>> >
>>>>> >> What's the update frequency of this effort?
>>>>> >
>>>>> > AFAIK roughly once per year up to now but Richard would be the more
>>>>> > competent person to provide you with an answer ;)
>>>>> >
>>>>> > Cheers,
>>>>> >          Michael
>>>>> >
>>>>> > --
>>>>> > Dr. Michael Hausenblas, Research Fellow
>>>>> > DERI - Digital Enterprise Research Institute
>>>>> > NUIG - National University of Ireland, Galway
>>>>> > Ireland, Europe
>>>>> > Tel.: +353 91 495730
>>>>> > http://mhausenblas.info/
>>>>> >
>>>>> > On 20 Nov 2012, at 13:48, Kingsley Idehen wrote:
>>>>> >
>>>>> >> On 11/20/12 7:59 AM, Michael Hausenblas wrote:
>>>>> >>>> I would like to ask you if you can give me the information, in
>>>>> >>>> linked open data project, which data sets makes reference to which data
>>>>> >>>> sets and how many links there are between them.
>>>>> >>> http://lod-cloud.net/state/
>>>>> >>
>>>>> >> Michael,
>>>>> >>
>>>>> >> What's the update frequency of this effort?
>>>>> >>
>>>>> >> Kingsley
>>>>> >>>
>>>>> >>>
>>>>> >>> Cheers,
>>>>> >>>        Michael
>>>>> >>>
>>>>> >>> --
>>>>> >>> Dr. Michael Hausenblas, Research Fellow
>>>>> >>> DERI - Digital Enterprise Research Institute
>>>>> >>> NUIG - National University of Ireland, Galway
>>>>> >>> Ireland, Europe
>>>>> >>> Tel.: +353 91 495730
>>>>> >>> http://mhausenblas.info/
>>>>> >>>
>>>>> >>> On 19 Nov 2012, at 15:42, Mary Koutraki wrote:
>>>>> >>>
>>>>> >>>> Dear all,
>>>>> >>>>
>>>>> >>>> I would like to ask you if you can give me the information, in
>>>>> >>>> linked open data project, which data sets makes reference to which data
>>>>> >>>> sets and how many links there are between them.
>>>>> >>>>
>>>>> >>>> Thank you in advance.
>>>>> >>>>
>>>>> >>>> --
>>>>> >>>> Mary Koutraki
>>>>> >>>> PhD Student on Semantic Web
>>>>> >>>> UVSQ - ETIS Lab
>>>>> >>>>
>>>>> >>>>
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >>
>>>>> >> Regards,
>>>>> >>
>>>>> >> Kingsley Idehen
>>>>> >> Founder & CEO
>>>>> >> OpenLink Software
>>>>> >> Company Web: http://www.openlinksw.com
>>>>> >> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>>>>> >> Twitter/Identi.ca handle: @kidehen
>>>>> >> Google+ Profile: https://plus.google.com/112399767740508618350/about
>>>>> >> LinkedIn Profile: http://www.linkedin.com/in/kidehen
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >
>>>>> >
>>>>>
>>>>
>>>
>>
>>
>> --
>> M. Scott Marshall, PhD
>> MAASTRO clinic, http://www.maastro.nl/en/1/
>> http://eurecaproject.eu/
>> https://plus.google.com/u/0/114642613065018821852/posts
>> http://www.linkedin.com/pub/m-scott-marshall/5/464/a22
>>
>
>
>
> --
>
> Pablo N. Mendes
> http://pablomendes.com
>
Received on Friday, 7 December 2012 17:07:27 UTC
