Re: Collection class for DCAT

Hi Vasily,

We have already addressed this need in Open Assets, an extension of DCAT in
which a Catalog is just a special case of Dataset. This lets you build a
hierarchy of catalogs with as many levels, and as fine a granularity, as
needed, without defining an intermediate class such as Collection.

We did, however, define a more general class, Container, with two
subclasses. A Catalog is a Container whose elements are all of the same
kind: either other catalogs or elemental datasets of the same type. A
Repository is a Container that holds catalogs and other repositories
without distinguishing between the types of their contents.

Besides this, you can also federate catalogs through their semantics. The
Open Assets approach also addresses the current deficiencies of DCAT in
dealing with IPR issues. I will talk about it at the next W3C GLD meeting,
but you can already find some information about Open Assets in the minutes
of the last W3C eGov meeting.

Maybe you could look at the Open Assets approach to containers (catalogs
and repositories) as a more general solution for issues like the one you
describe, and even for more complex ones.

Best regards,


Serafin Olcoz




On 07/12/12 15:47, "vasily.bunakov@stfc.ac.uk" <vasily.bunakov@stfc.ac.uk>
wrote:

> This is a suggestion for a new class and properties in DCAT, along with an
> example of how we might apply them; the example also illustrates why a new
> class and properties seem to be required.
>  
> We may need a Collection class in DCAT, as Catalog seems too high level and
> Dataset too granular for modeling actual data catalogues in some
> situations. You can of course model a collection with what is currently in
> DCAT (a Catalog or a Dataset), but this may not be what the DCAT designers
> wanted: a Catalog seems intended for modeling the top-level resource of a
> "data publisher", and a Dataset for modeling a set of data records without
> an identity for each record. A Collection would allow modeling data
> aggregations in situations where a Collection member should have a clear
> identity.
>  
> We may also need to give Collections some structure, as just introducing a
> new Collection class sitting between Catalog and Dataset may not be enough
> for modeling the actual data organization. We may need a few properties for
> Collection then:
> * "related": to link to another Collection or anything else,
> * "reference": to designate a homepage or other stable reference for the
> Collection; it may be an inverse functional property similar to "homepage" for
> the Catalog,
> * "contains": to designate parts of the Collection - other Collections or
> perhaps standalone datasets. (An alternative to having this property may be
> Collection inheritance from another Collection, but this may contradict the
> spirit of DCAT, which seems to avoid inheritance.)
>  
> A possible use of a new Collection class and properties is illustrated by the
> following example with 4 parts in it:
>  
> 1) http://www.esds.ac.uk/government/frs/ (Family Resources Survey homepage on
> the UK Data Archive portal)
> This can be a top level data "Collection" - registered in the UK Data Archive
> "Catalog".
>  
> 2)
> http://farm.ccsr.ac.uk/cgi-bin/esds/nsproxy/nsproxy2011.cgi?http://www.statistics.gov.uk/StatBase/Product.asp?vlnk=9267
> (This link is recommended by the UK Data Archive for detailed information
> about the Family Resources Survey; it leads to the Office for National
> Statistics)
> So 2 can be described as "related" to 1.
> DCAT as a whole is focussed much more on data than on intellectual entities;
> the latter, however, seem to need some representation in DCAT, so a "related"
> property could be a very lightweight and generic means to refer to
> intellectual entities (research programs, continuous surveys etc.), e.g. to
> the rationale for having the data in the Collection.
> 
> 
> 3) http://www.esds.ac.uk/findingData/frsTitles.asp  (Family Resources Survey
> sub-series driven by different licensing)
> The top Collection in 1 can then be thought of as one that "contains" the
> Collections from 3, with the rationale for having sub-Collections described
> via the "related" property, i.e. links to the specific licensing descriptions.
> 
> 
> 4) http://www.esds.ac.uk/findingData/snDescription.asp?sn=7085 (Example of a
> collection of datasets and associated materials in Excel, PDF, and HTML
> formats for a particular annual survey)
> Each Collection listed in 3 then "contains" a number of Collections like 4. In
> this example, the homepage is obviously prone to change if the server
> technology changes, so something like "sn7085" can be a better Collection
> "reference".
>  
> Without a Collection class and a few such properties, it seems hard to
> properly model a Catalog structure in cases like the example above.
>  
> With kind regards,
> Vasily Bunakov
> STFC Scientific Computing
> E-mail: vasily.bunakov@stfc.ac.uk
>  
>
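
For readers who prefer code, the Collection proposal in the quoted message
can be sketched roughly as follows. This is only an illustration in Python,
not DCAT or RDF: the three field names come from the proposal, while the
walk() helper and the choice to represent the sub-series page in 3 as a
single Collection are simplifying assumptions.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Collection:
    # "reference": homepage or other stable reference for the Collection
    reference: str
    # "related": links to other Collections or to anything else
    related: List[str] = field(default_factory=list)
    # "contains": parts of this Collection (sub-Collections)
    contains: List["Collection"] = field(default_factory=list)


def walk(collection: Collection, level: int = 0) -> Iterator[Tuple[int, str]]:
    """Yield (nesting level, reference) for a Collection and all its parts."""
    yield level, collection.reference
    for part in collection.contains:
        yield from walk(part, level + 1)


# Part 4: one annual survey, using the stable "sn7085" as its reference.
sn7085 = Collection(reference="sn7085")

# Part 3: sub-series page, simplified here to one Collection (assumption).
frs_titles = Collection(
    reference="http://www.esds.ac.uk/findingData/frsTitles.asp",
    contains=[sn7085],
)

# Part 1: top-level Collection, "related" to the link given in part 2.
frs = Collection(
    reference="http://www.esds.ac.uk/government/frs/",
    related=[
        "http://farm.ccsr.ac.uk/cgi-bin/esds/nsproxy/nsproxy2011.cgi?"
        "http://www.statistics.gov.uk/StatBase/Product.asp?vlnk=9267"
    ],
    contains=[frs_titles],
)

for level, reference in walk(frs):
    print("  " * level + reference)
```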

Received on Friday, 7 December 2012 15:07:30 UTC