Re: [BP - MET] - Best Practices - Guidance on the Provision of Metadata

Hi, Ghislain,

> I could consider also metadata "computed" based on some provenance data +
> metrics. For e.g.: If a dataset is published by a "certified organization"
> and it is reused by many users/applications, then it has higher quality.

Interesting. I think we would need vocabularies to describe this "computed"
metadata.
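
To make the rule above concrete, here is a minimal sketch in Python. It is only an illustration: the function name, the "higher"/"unrated" labels and the reuse threshold of 10 are my own assumptions, not something we have agreed on.

```python
def computed_quality(certified_publisher: bool, reuse_count: int,
                     reuse_threshold: int = 10) -> str:
    """Derive a quality label from provenance (certified publisher) and a
    usage metric (reuse count). The threshold is an arbitrary assumption."""
    if certified_publisher and reuse_count >= reuse_threshold:
        return "higher"
    return "unrated"


# A certified publisher whose dataset is reused by 25 applications.
label = computed_quality(certified_publisher=True, reuse_count=25)
```

A vocabulary would then be needed to say what "higher" means and which inputs produced it.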

>> Metadata Types for Use
>> URI Design Principles
>> Machine Access to Data
>> API specification
>
> I am not sure to understand the above types. Could you give us an example
> why "vocabularies" are not in this list, but "URI design principles" is
> here? One may think that there is no principles in designing URIs for
> vocabs.

This is a starting list, and we still have to discuss it a lot. The name
"URI Design Principles" may not convey what I had in mind. I was thinking of
a rule that a dataset could publish so that a Consumer can build the
appropriate URI for accessing a specific resource (using "resource" as it is
defined in RDF).
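
As a rough illustration of such a rule, the sketch below (Python) fills a URI pattern that a Publisher might advertise. The template, the example.org host and the parameter names are hypothetical; this is not a proposal for the actual syntax.

```python
from urllib.parse import quote

# Hypothetical pattern a Publisher could expose as "use" metadata.
URI_TEMPLATE = "http://example.org/dataset/{dataset}/resource/{resource}"


def build_resource_uri(template: str, **parts: str) -> str:
    """Fill a {name}-style template, percent-encoding each supplied value."""
    encoded = {name: quote(value, safe="") for name, value in parts.items()}
    return template.format(**encoded)


uri = build_resource_uri(URI_TEMPLATE, dataset="bus-stops", resource="stop 42")
```

With a rule like this in the metadata, the Consumer does not need to guess how the resource URIs of a dataset are formed.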

> What's the difference between "format spec" and "data format"?

I think the terminology has to be well defined; some terms are difficult to
understand from their names alone. Some information related to the "Use"
task could also interest the Consumer when choosing a dataset during the
"Search" task. The two types are related: "Data Format" could be, for
example, "CSV", and "Format Specification" could be the semantics of the CSV
file (defined by the CSV on the Web WG). But, again, the names of these
metadata types have to be defined and described.
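
A small sketch of the distinction, in Python. The column names, types and descriptions are invented for illustration, and the dictionary layout is my own assumption, not the CSV on the Web WG format.

```python
import csv
import io

# "Data Format": only tells the Consumer how the bytes are serialized.
DATA_FORMAT = "CSV"

# Hypothetical "Format Specification": what each column means and its type.
FORMAT_SPEC = {
    "stop_id": {"type": int, "description": "Identifier of the bus stop"},
    "lat": {"type": float, "description": "Latitude in decimal degrees"},
    "lon": {"type": float, "description": "Longitude in decimal degrees"},
}


def read_typed_rows(text: str, spec: dict) -> list:
    """Parse CSV text and convert each value using the type given in spec."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {column: spec[column]["type"](value) for column, value in row.items()}
        for row in reader
    ]


sample = "stop_id,lat,lon\n1,-22.9,-43.2\n"
rows = read_typed_rows(sample, FORMAT_SPEC)
```

Knowing only "CSV" the Consumer gets strings; with the specification the same file yields typed, documented values.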

> As others pointed out, we could define a small set of mandatory field
> when providing the metadata.

+1. DCAT-AP also did this kind of thing.
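
A minimal sketch of how a Broker could check such a mandatory set, in Python. The three field names are placeholders I picked for the example; they are not the DCAT-AP list and not a proposal for ours.

```python
# Placeholder mandatory set; the real list is still to be discussed.
MANDATORY_FIELDS = ("title", "description", "license")


def missing_mandatory_fields(record: dict, mandatory=MANDATORY_FIELDS) -> list:
    """Return the mandatory field names that are absent or blank in record."""
    missing = []
    for name in mandatory:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


record = {"title": "Bus stops", "description": "  "}
problems = missing_mandatory_fields(record)
```

Here `problems` lists "description" (blank) and "license" (absent), so the record would be rejected or flagged.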

Cheers,
Laufer

2014-05-16 4:26 GMT-03:00 Ghislain Atemezing <auguste.atemezing@eurecom.fr>:

> Hi Laufer, all,
> Thanks for this great starting discussion. Find below my 2 cents ...
>
>> I created a page on the wiki, "Best Practices – Guidance on the
>> Provision of Metadata", where we can put the information about this
>> topic. I took the liberty to define a prefix in the subject of the
>> e-mails related to these discussions: [BP- MET].
>>
>> I would like to expose some thoughts that I think are related to the
>> data on the web ecosystem. I see a kind of data architecture that has
>> three big roles: a data Publisher, a data Consumer and a data Broker.
>> The Broker is the one that has information that can be used by the
>> Consumer to find data published by the Publisher.
>>
>> As an example of Brokers we can think about implementations of CKAN,
>> used by data.gov <http://data.gov>, dados.gov.br <http://dados.gov.br>,
>> etc. CKAN has metadata (provided by Publishers) that are useful for
>> Consumers to find data. CKAN is a registry and can also be a repository
>> for the data to be consumed. Almost all use cases of DWBP WG are
>> examples of Brokers.
>>
>> At the same time, data published in CKAN implementations can have
>> multiple formats, as CSV, for example. Once a Consumer chooses some data
>> to use from a Publisher, she needs another kind of metadata to
>> understand how to access the data and its semantics.
>>
>> I propose to create categories and types of metadata. I see two
>> categories: metadata for search and metadata for use. Each of these
>> categories would have types of metadata. For example:
>>
> +1. I could consider also metadata "computed" based on some provenance
> data + metrics. For e.g.: If a dataset is published by a "certified
> organization" and it is reused by many users/applications, then it has
> higher quality.



>> Metadata Types for Search
>>
>> Human Content Description (free text)
>>
> ..and categories/themes
>
>
>> Machine Content Description (vocabularies)
>>
>> Provenance
>>
>> License
>>
>> Revenue
>>
>> Credentials
>>
>> Quality / Metrics
>>
>> Release Schedule
>>
>> Data Format
>>
>> Data Access
>>
> +1 for all this first metadata types
>
>
>> Metadata Types for Use
>>
>> URI Design Principles
>>
>> Machine Access to Data
>>
>> API specification
>>
> I am not sure to understand the above types. Could you give us an
> example why "vocabularies" are not in this list, but "URI design
> principles" is here? One may think that there is no principles in designing
> URIs for vocabs.
>
>> Format Specification
>>
> What's the difference between "format spec" and "data format"?
>
> As others pointed out, we could define a small set of mandatory field when
> providing the metadata.
>
> Thanks again for taking care of this section.
>
> Cheers,
> Ghislain
>
> --
> Ghislain Atemezing
> EURECOM, Multimedia Communications Department
> Campus SophiaTech
> 450, route des Chappes, 06410 Biot, France.
> e-mail: auguste.atemezing@eurecom.fr & ghislain.atemezing@gmail.com
> Tel: +33 (0)4 - 9300 8178
> Fax: +33 (0)4 - 9000 8200
> Web: http://www.eurecom.fr/~atemezin
> Google+:http://google.com/+GhislainATEMEZING
> Twitter:@gatemezing
>




Received on Monday, 19 May 2014 15:04:20 UTC