Re: [BP - MET] - Best Practices - Guidance on the Provision of Metadata

Hi Laufer,

I think that data semantics should be provided by a domain vocabulary
(ontology). DCAT shouldn't provide the means to describe data semantics;
instead, it should provide the means to associate datasets with the
domain vocabularies that describe the semantics of the data. Does that
make sense to you?
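
A minimal sketch of such an association, in the same Turtle style as the
DCAT example quoted below. The dataset and the vocabulary URI are made up,
and dct:conformsTo is only one possible property for the link; DCAT itself
does not prescribe it for this purpose.

```turtle
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix :     <http://example.org/> .

# Hypothetical dataset that points at a (hypothetical) domain vocabulary.
# The vocabulary, not DCAT, would carry the semantics of the data.
:dataset-002
    a dcat:Dataset ;
    dct:title "Imaginary dataset 002" ;
    dct:conformsTo <http://example.org/vocab/weather#> .
```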

kind regards,
Bernadette


2014-05-15 16:12 GMT-03:00 Laufer <laufer@globo.com>:

> Makx,
>
> Yes, we could start from DCAT. I think that as guidance on the provision
> of metadata we should give examples and suggestions of the use of reference
> vocabularies. If we have an RDF linked dataset, for example, we can think
> about VoID.
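>
> As a rough sketch of what a VoID description could look like (the
> dataset and its URLs are made up; the void: properties are from the VoID
> vocabulary):
>
> ```turtle
> @prefix void: <http://rdfs.org/ns/void#> .
> @prefix dct:  <http://purl.org/dc/terms/> .
> @prefix :     <http://example.org/> .
>
> # Hypothetical linked dataset with its access points and one
> # vocabulary it uses.
> :dataset-003
>     a void:Dataset ;
>     dct:title "Imaginary linked dataset 003" ;
>     void:sparqlEndpoint <http://example.org/sparql> ;
>     void:dataDump <http://example.org/dumps/dataset-003.ttl> ;
>     void:vocabulary <http://xmlns.com/foaf/0.1/> .
> ```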
>
> DCAT application profile for data portals (correct me if I am wrong) is a
> way of concentrating metadata from data portals that adhere to DCAT AP. In
> my first set of roles of the ecosystem, maybe we could consider the
> Metadata Broker of the basic use case of DCAT AP as a kind of Broker of
> Brokers.
>
> One of the tasks of the DWBP WG is to extend DCAT. I think that,
> currently, metadata about specific formats is not included in (or pointed
> to from) DCAT.
>
> For example (a dataset description entry in DCAT):
>
> :dataset-002-csv
>        a dcat:Distribution ;
>        dcat:accessURL <http://example.org/dataset-002.html> ;
>        dcat:mediaType "text/csv" .
>
> indicates that you can access a CSV at
> <http://example.org/dataset-002.html>.
>
> But what about the metadata that could explain the semantics of the CSV
> itself (things that are being discussed by the CSV on the Web WG)?
>
> DCAT does not address the semantics of the datasets.
>
> What I would like to discuss in more detail are the roles of the
> participants in the Data on the Web ecosystem (as in DCAT AP), and then,
> from the ecosystem metamodel, extract the metadata that is suitable for
> each one of them.
>
>
> So, I think we could suggest to the data publishers (providers?) what metadata to publish for each task.
>
> Laufer
>
>
>
>
> 2014-05-15 15:02 GMT-03:00 Makx Dekkers <mail@makxdekkers.com>:
>
>> Laufer,
>>
>>
>>
>> Could we maybe start from DCAT http://www.w3.org/TR/vocab-dcat/? That
>> W3C Recommendation was specifically designed to describe data on the Web.
>> It defines a metadata language that includes some of the metadata types you
>> list in your message. It also distinguishes between the conceptual
>> characteristics of the data (things you would use for search) and the
>> actual, downloadable distribution of the data.
>>
>>
>>
>> There is also a DCAT application profile for data portals in Europe (
>> https://joinup.ec.europa.eu/asset/dcat_application_profile/asset_release/dcat-application-profile-data-portals-europe-final#download-links)
>> that gives additional rules and constraints for the use of DCAT in a
>> network of data portals in Europe. One thing that the DCAT-AP defines is
>> the minimum set of metadata elements to be provided, actually only a name
>> (dct:title) and a description (dct:description) of the data set, and a URL
>> for its distribution (dcat:accessURL). A small number of elements are
>> recommended if available.
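>>
>> As a rough sketch, a description carrying only those three elements
>> could look like this (the dataset, its text and its URL are made up):
>>
>> ```turtle
>> @prefix dcat: <http://www.w3.org/ns/dcat#> .
>> @prefix dct:  <http://purl.org/dc/terms/> .
>> @prefix :     <http://example.org/> .
>>
>> # Hypothetical dataset with just a title and a description.
>> :dataset-001
>>     a dcat:Dataset ;
>>     dct:title "Imaginary dataset 001" ;
>>     dct:description "Made-up example data, for illustration only." ;
>>     dcat:distribution :dataset-001-csv .
>>
>> # Its distribution, with just an access URL.
>> :dataset-001-csv
>>     a dcat:Distribution ;
>>     dcat:accessURL <http://example.org/dataset-001.html> .
>> ```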
>>
>>
>>
>> Could we do something similar?
>>
>>
>>
>> Makx.
>>
>>
>>
>>
>>
>> *From:* Laufer [mailto:laufer@globo.com]
>> *Sent:* Thursday, May 15, 2014 4:36 PM
>> *To:* Bernadette Farias Loscio; Carlos Iglesias; Makx Dekkers; DWBP
>> Public List
>> *Subject:* [BP - MET] - Best Practices - Guidance on the Provision of
>> Metadata
>>
>>
>>
>> Hi Bernadette, Carlos, Makx, all DWBP members,
>>
>>
>>
>> I created a page on the wiki, "Best Practices – Guidance on the Provision
>> of Metadata", where we can put the information about this topic. I took the
>> liberty of defining a prefix for the subject of the e-mails related to these
>> discussions: [BP - MET].
>>
>>
>>
>> I would like to share some thoughts that I think are related to the data
>> on the web ecosystem. I see a kind of data architecture that has three big
>> roles: a data Publisher, a data Consumer and a data Broker. The Broker is
>> the one that has information that can be used by the Consumer to find data
>> published by the Publisher.
>>
>>
>>
>> As an example of Brokers we can think about implementations of CKAN, used
>> by data.gov, dados.gov.br, etc. CKAN has metadata (provided by
>> Publishers) that are useful for Consumers to find data. CKAN is a registry
>> and can also be a repository for the data to be consumed. Almost all use
>> cases of DWBP WG are examples of Brokers.
>>
>>
>>
>> At the same time, data published in CKAN implementations can come in
>> multiple formats, such as CSV. Once a Consumer chooses some data to
>> use from a Publisher, she needs another kind of metadata to understand how
>> to access the data and its semantics.
>>
>>
>>
>> I propose to create categories and types of metadata. I see two
>> categories: metadata for search and metadata for use. Each of these
>> categories would have types of metadata. For example:
>>
>>
>>
>> Metadata Types for Search
>>
>> Human Content Description (free text)
>>
>> Machine Content Description (vocabularies)
>>
>> Provenance
>>
>> License
>>
>> Revenue
>>
>> Credentials
>>
>> Quality / Metrics
>>
>> Release Schedule
>>
>> Data Format
>>
>> Data Access
>>
>>
>>
>> Metadata Types for Use
>>
>> URI Design Principles
>>
>> Machine Access to Data
>>
>> API specification
>>
>> Format Specification
>>
>>
>>
>> The Brokers themselves have another kind of metadata, about their own
>> information.
>>
>>
>>
>> Maybe in the future a Consumer will no longer search for data in these
>> Brokers (with their catalogues) but will use search engines that could
>> obtain the metadata (both for search and for use) using their crawlers. But
>> for now, we have this heterogeneous world of data, which has been one of
>> the characteristics of the web since its beginning.
>>
>>
>>
>> Contributions of all members of the DWBP WG will be appreciated.
>>
>>
>>
>> Best Regards,
>>
>> Laufer
>>
>>
>> --
>> .  .  .  .. .  .
>> .        .   . ..
>> .     ..       .
>>



-- 
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------

Received on Thursday, 15 May 2014 19:38:07 UTC