Re: Best practice for a loosely-structured catalog?

From: Jakub Klímek <jakub@jakubklimek.com>
Date: Fri, 8 Jun 2018 10:11:14 +0200
Message-ID: <CAEOz=_uQurXz9dQdi8U+zE=m2cVCW_ihvvpWyb2OMkW92AcQRw@mail.gmail.com>
To: Simon.Cox@csiro.au
Cc: public-dxwg-wg@w3.org, Jonathan.Yu@csiro.au
Dear Simon,

could this be made an issue on GitHub so that a proper discussion can be
had?

In Czechia we deal with this situation on a daily basis, and so far we
have managed to keep the metadata more or less "clean": each "bag of
files" is split into properly described datasets with single
distributions. The only problem is the missing support for a dataset
series to group them together, which is being addressed in this DCAT
revision.

From your proposal, especially the <file*> parts, it seems that you would
have a dcat:Dataset with no dcat:Distribution instances, linking directly
to downloadable files with dct:relation and carrying no metadata about
them. That seems very messy to me, hence the request to discuss this on
GitHub.

Best regards,

Jakub Klímek

On Fri, Jun 8, 2018 at 3:44 AM <Simon.Cox@csiro.au> wrote:

> Catalogueers:
>
>
>
> I’ve been doing some investigations of some local repositories and
> catalogues, and have uncovered that in many cases ‘datasets’ are ‘just a
> bag of files’. There is no distinction made between part/whole,
> distribution (representation), and other kinds of relationship (e.g.
> documentation, schema, supporting documents). So while the precision we are
> aiming for in DCAT is clearly valuable in terms of semantics, it is
> difficult to implement on these legacy systems. Mostly I see people using
> the Dataset-distribution-> relationship for everything … which is clearly
> incorrect in many cases. But I doubt if we are unusual in this.
>
>
>
> I’m thinking about how to advise on this, while not actually breaking
> DCAT.
>
>
>
> If we made dcat:distribution a sub-property of dct:relation
>
>
>
> dcat:distribution rdfs:subPropertyOf dct:relation .
>
>
>
> then I think we can have a reasonable recommendation to the simple
> repositories.
>
> We could tell repositories that use the ‘just a bag of files’ approach to
> say
>
>
>
>     :Dataset987 a dcat:Dataset ;
>         dct:relation <file1> , <file2> , <file3> , <file4> ,
>                      <file5> , <file6> , <file7> … .
>
>
>
> which would not be inconsistent with a later reclassification to
>
>
>
>     :Dataset987 a dcat:Dataset ;
>         dct:hasPart <file1> , <file2> ;
>         dcat:distribution <file3> , <file4> ;
>         dct:conformsTo <file5> ;
>         dct:requires <file6> ;
>         dct:references <file7> .
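
The claim that the later reclassification is "not inconsistent" with the
bag-of-files description can be checked mechanically: under RDFS
sub-property entailment, every refined statement should still imply the
original dct:relation statement. The following is a minimal sketch in plain
Python, not a real RDF reasoner. It assumes that dct:hasPart,
dct:conformsTo, dct:requires and dct:references are already declared as
sub-properties of dct:relation in DCMI Terms (my understanding, worth
verifying against the vocabulary), and the urn:example: IRIs and the helper
name `entail` are hypothetical, introduced only for illustration.

```python
# Sketch of RDFS sub-property entailment (rule rdfs7) over plain tuples.
DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"

# Assumed to be declared already in DCMI Terms.
DCMI_AXIOMS = {
    DCT + "hasPart": DCT + "relation",
    DCT + "conformsTo": DCT + "relation",
    DCT + "requires": DCT + "relation",
    DCT + "references": DCT + "relation",
}

# The enhancement proposed in this thread, added on top of the above.
PROPOSED_AXIOMS = {**DCMI_AXIOMS, DCAT + "distribution": DCT + "relation"}


def entail(triples, sub_property_of):
    """Return the triples plus everything implied by the sub-property map."""
    closed = set(triples)
    frontier = list(closed)
    while frontier:
        s, p, o = frontier.pop()
        parent = sub_property_of.get(p)
        if parent is not None:
            inferred = (s, parent, o)
            if inferred not in closed:
                closed.add(inferred)
                frontier.append(inferred)
    return closed


# Hypothetical IRIs standing in for :Dataset987 and <file1> .. <file7>.
DATASET = "urn:example:Dataset987"
FILES = {n: f"urn:example:file{n}" for n in range(1, 8)}

# The "just a bag of files" description.
bag_of_files = {(DATASET, DCT + "relation", FILES[n]) for n in range(1, 8)}

# The later, more precise reclassification.
reclassified = {
    (DATASET, DCT + "hasPart", FILES[1]),
    (DATASET, DCT + "hasPart", FILES[2]),
    (DATASET, DCAT + "distribution", FILES[3]),
    (DATASET, DCAT + "distribution", FILES[4]),
    (DATASET, DCT + "conformsTo", FILES[5]),
    (DATASET, DCT + "requires", FILES[6]),
    (DATASET, DCT + "references", FILES[7]),
}

# With the proposed axiom, the refined graph entails the whole bag of files.
with_proposal = bag_of_files <= entail(reclassified, PROPOSED_AXIOMS)

# Without it, the statements about the two distributions are not entailed.
missing = bag_of_files - entail(reclassified, DCMI_AXIOMS)

print(with_proposal)
print(sorted(o for _, _, o in missing))
```

Under these assumptions, the only dct:relation statements that depend on
the proposed axiom are the ones for the two files reclassified as
distributions, which is why the change can be described as tiny.
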
>
>
>
> If this is not all mad, I will add a new use-case - something like
> ‘Mapping from simple repository model’ – as justification, and propose this
> tiny enhancement.
>
>
>
> Simon
>
>
>
> *Simon J D Cox *
>
> Research Scientist - Environmental Informatics
>
> Team Leader – Environmental Information Infrastructure
>
> CSIRO Land and Water <http://www.csiro.au/Research/LWF>
Received on Friday, 8 June 2018 08:27:07 UTC
