Re: Data sets of LOD

FYI, the practice so far has been to keep the LOD cloud metadata on a CKAN
instance hosted at TheDataHub.org. There has also been work on exposing that
metadata as RDF:
http://lod2.eu/BlogPost/1095-ckan-rdf-output.html

We have also talked about adding more quality-related metadata (including
how current the data is, more fine-grained descriptions of what is in each
dataset, what links where, etc.) [1,2,3]. If you are putting together a
consortium, you may want to channel some of that support towards the
CKAN folks, so that changes can be made at the source (i.e. the catalog
that holds all of the LOD cloud metadata).
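
To make the test case in Scott's message (quoted below) a bit more concrete,
here is a minimal sketch in plain Python of what "update frequency as a term
URI rather than a string literal" could buy you. Everything in it is an
assumption for illustration only: the example.org URIs, the updateFrequency
and isVersionOf property names, and the updates-per-year ranking are made up
and do not come from dady, VoID or CKAN. In a real setup this would be a
SPARQL query over the named graphs in a quad store; the list of tuples below
just stands in for that.

```python
# Hypothetical namespaces and properties (not taken from any real vocabulary).
FREQ = "http://example.org/freq#"
DS = "http://example.org/dataset/"
UPDATE_FREQUENCY = "http://example.org/meta#updateFrequency"
VERSION_OF = "http://example.org/meta#isVersionOf"

# Assumed ordering: approximate number of updates per year for each term URI.
UPDATES_PER_YEAR = {
    FREQ + "hourly": 8760,
    FREQ + "daily": 365,
    FREQ + "weekly": 52,
    FREQ + "monthly": 12,
    FREQ + "yearly": 1,
}

# Toy dataset descriptions: (named graph URI, predicate, object).
TRIPLES = [
    (DS + "drugbank/v1", VERSION_OF, DS + "drugbank"),
    (DS + "drugbank/v1", UPDATE_FREQUENCY, FREQ + "yearly"),
    (DS + "drugbank/v2", VERSION_OF, DS + "drugbank"),
    (DS + "drugbank/v2", UPDATE_FREQUENCY, FREQ + "monthly"),
    (DS + "drugbank/v3", VERSION_OF, DS + "drugbank"),
    (DS + "drugbank/v3", UPDATE_FREQUENCY, FREQ + "weekly"),
    (DS + "other/v1", VERSION_OF, DS + "other"),
    (DS + "other/v1", UPDATE_FREQUENCY, FREQ + "daily"),
]


def versions_updated_more_often_than(triples, dataset, threshold):
    """Return the versions of `dataset` that update more often than `threshold`."""
    # Collect every named graph declared to be a version of the dataset.
    versions = {s for s, p, o in triples if p == VERSION_OF and o == dataset}
    limit = UPDATES_PER_YEAR[threshold]
    # Keep versions whose frequency term ranks above the threshold term;
    # unknown frequency terms are treated as never matching.
    return sorted(
        s
        for s, p, o in triples
        if s in versions
        and p == UPDATE_FREQUENCY
        and UPDATES_PER_YEAR.get(o, 0) > limit
    )


result = versions_updated_more_often_than(TRIPLES, DS + "drugbank", FREQ + "yearly")
print(result)
```

The comparison only works because each frequency is a distinct term that can
be ranked; with free-text literals you would first have to guess what each
string means.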

I am cross-posting this message to CKAN-discuss to see if anybody there
picks this up.

Cheers,
Pablo

[1] http://webcache.googleusercontent.com/search?q=cache:cfLMuj9qNbMJ:wiki.ckan.org/Linked_Data+&cd=2&hl=en&ct=clnk
[2] http://wiki.planet-data.eu/uploads/c/c0/D4.1.pdf
[3] http://wiki.planet-data.eu/uploads/d/d7/D2.1.pdf



On Wed, Nov 21, 2012 at 4:35 PM, M. Scott Marshall <mscottmarshall@gmail.com> wrote:

> Thanks Keith. I was aware of VoID and the lov engine. That's the right
> direction but not the level of detail that we have in mind.
>
> In the scenario of a (de-centralized) data marketplace, the
> metadata/description for a dataset stored in a quad store could be
> expressed as triples with the named graph URI as the subject.
>
> A requirement for unambiguous representation of the dataset description
> would be that values come from a namespace, preferably one provided
> by an ontology, so that machine reasoning can be applied to the constraints.
>
> One test case for update frequency could be:
>
> How do I write a SPARQL query to retrieve all versions of DrugBank that
> have an update frequency that is more frequent than yearly?
>
> Note that there is no assumption of only one version of the RDF. There
> could be tens of versions, or more.
>
> Here the update frequency could be more specific than the current
> possibilities in http://purl.org/NET/dady#UpdateFrequency. Presumably,
> you would have a term URI for each of hourly, daily, weekly, monthly,
> yearly, etc., making the value machine consumable and 'machine reasonable'.
> So, the values should not be string literals.
>
> BTW, there are other types of information, such as license type, that we
> would also like to encode with precise URIs for the values. These types of
> information are of great importance to some consumers of the data, such as
> pharmaceutical companies.
>
> Cheers,
> Scott
>
>
> On Wed, Nov 21, 2012 at 3:12 PM, Keith Alexander <
> keithalexander@keithalexander.co.uk> wrote:
>
>> Hi
>>
>> On Wed, Nov 21, 2012 at 1:58 PM, M. Scott Marshall <
>> mscottmarshall@gmail.com> wrote:
>>
>>> In discussions at the Biohackathon 2011 (Kyoto), we agreed that a
>>> standard data set description would make it easier to consume distributed
>>> data such as LOD. We created a wishlist of metadata that we would like to
>>> be able to consume via SPARQL, including date of last update of RDF
>>> rendering and date of last update of source data (if the RDF is an
>>> additional representation of that data source). We also discussed update
>>> frequency as something that we would like to represent in RDF.
>>
>>
>> See
>> http://rdfs.org/ns/void
>> http://www.w3.org/TR/void/
>>
>>
>>
>>> Does anybody know of a good way of representing periodicity in a generic
>>> fashion (appropriate ontology/namespace)? Of course, just being able to
>>> represent hourly, daily, weekly, monthly, annually and provide it to
>>> software agents via SPARQL would be an improvement on having to ask around.
>>> :)
>>>
>>> http://vocab.deri.ie/dady# ?
>>
>> there is also the RSS 1 module
>> http://web.resource.org/rss/1.0/modules/syndication/
>>
>> sy:updatePeriod
>> "Describes the period over which the channel format is updated.
>> Acceptable values are: hourly, daily, weekly, monthly, yearly. If omitted,
>> daily is assumed."
>>
>> btw, if you don't know it, http://lov.okfn.org/dataset/lov/ is a really
>> handy vocabulary search engine.
>>
>> Best
>>
>> Keith
>>
>>
>>> Cheers,
>>> Scott
>>>
>>> --
>>> M. Scott Marshall, PhD
>>> MAASTRO clinic, http://www.maastro.nl/en/1/
>>> http://eurecaproject.eu/
>>> https://plus.google.com/u/0/114642613065018821852/posts
>>> http://www.linkedin.com/pub/m-scott-marshall/5/464/a22
>>>
>>>
>>> On Tue, Nov 20, 2012 at 4:49 PM, Sands Alden Fish <sands@mit.edu> wrote:
>>>
>>>> Yes, I'd be curious to know the update frequency as well.  This being
>>>> from September, 2011, we'd be anticipating a new cut right now.
>>>>
>>>>
>>>>
>>>> On Nov 20, 2012, at 8:52 AM, Michael Hausenblas <michael.hausenblas@deri.org> wrote:
>>>>
>>>> >
>>>> >> What's the update frequency of this effort?
>>>> >
>>>> > AFAIK roughly once per year up to now but Richard would be the more
>>>> > competent person to provide you with an answer ;)
>>>> >
>>>> > Cheers,
>>>> >          Michael
>>>> >
>>>> > --
>>>> > Dr. Michael Hausenblas, Research Fellow
>>>> > DERI - Digital Enterprise Research Institute
>>>> > NUIG - National University of Ireland, Galway
>>>> > Ireland, Europe
>>>> > Tel.: +353 91 495730
>>>> > http://mhausenblas.info/
>>>> >
>>>> > On 20 Nov 2012, at 13:48, Kingsley Idehen wrote:
>>>> >
>>>> >> On 11/20/12 7:59 AM, Michael Hausenblas wrote:
>>>> >>>> I would like to ask whether you can tell me which data sets in the
>>>> >>>> Linked Open Data project reference which other data sets, and how
>>>> >>>> many links there are between them.
>>>> >>> http://lod-cloud.net/state/
>>>> >>
>>>> >> Michael,
>>>> >>
>>>> >> What's the update frequency of this effort?
>>>> >>
>>>> >> Kingsley
>>>> >>>
>>>> >>>
>>>> >>> Cheers,
>>>> >>>        Michael
>>>> >>>
>>>> >>> On 19 Nov 2012, at 15:42, Mary Koutraki wrote:
>>>> >>>
>>>> >>>> Dear all,
>>>> >>>>
>>>> >>>> I would like to ask whether you can tell me which data sets in the
>>>> >>>> Linked Open Data project reference which other data sets, and how
>>>> >>>> many links there are between them.
>>>> >>>>
>>>> >>>> Thank you in advance.
>>>> >>>>
>>>> >>>> --
>>>> >>>> Mary Koutraki
>>>> >>>> PhD Student on Semantic Web
>>>> >>>> UVSQ - ETIS Lab
>>>> >>>>
>>>> >>>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>
>>>> >>
>>>> >> --
>>>> >>
>>>> >> Regards,
>>>> >>
>>>> >> Kingsley Idehen
>>>> >> Founder & CEO
>>>> >> OpenLink Software
>>>> >> Company Web: http://www.openlinksw.com
>>>> >> Personal Weblog: http://www.openlinksw.com/blog/~kidehen
>>>> >> Twitter/Identi.ca handle: @kidehen
>>>> >> Google+ Profile: https://plus.google.com/112399767740508618350/about
>>>> >> LinkedIn Profile: http://www.linkedin.com/in/kidehen



-- 

Pablo N. Mendes
http://pablomendes.com

Received on Thursday, 22 November 2012 10:52:28 UTC