
Re: Contribution to the consultation on data on the Web Best Practices

From: Phil Archer <phila@w3.org>
Date: Fri, 25 Nov 2016 22:14:01 +0000
To: "CARRARA, Wendy" <wendy.carrara@capgemini.com>, "public-dwbp-comments@w3.org" <public-dwbp-comments@w3.org>
Message-ID: <632bfb74-e197-1e64-43fa-7639d3b27fcb@w3.org>

Dear Wendy,

Thanks very much for these comments. Please see inline below for our 
responses. We'd be grateful if you could signal whether you are 
satisfied with our actions.

On 17/11/2016 15:17, CARRARA, Wendy wrote:
> Dear Phil, dear contributors to the data on the Web Best Practices.
> We are grateful to see that initiatives are being taken to promote data on the web and, more importantly, to see the promotion of best practices and standards. As more and more data is made available on the web, it is, in our view, paramount for organisations publishing data to have systematic recourse to standards and best practices. Too often we see data publishers simply park their data online and consider it done. Data is not published on the web for it to die there. Data is published so that it is discoverable, re-usable and available in a sustainable fashion.
> As we understand it, the vocation of the best practices put forward by W3C is international; we nonetheless wish to encourage W3C to further reference work done at the European level, inter alia within section 8.9 Data Vocabularies. The work performed at the European level is largely open and can save many both time and effort in moving forward with their data initiatives. The two points below are the ones we believe are the most valuable to cross-reference.
> -        Multilingualism: the web is an international space and the digital world knows different barriers than those imposed by sovereign states and their respective languages. More and more data portal owners and companies are investigating making their metadata and data available in different languages. The European Data Portal is one of the very first portals to translate metadata into 18 languages. This is a grassroots initiative which has had the benefit of shedding light on the need and demand for further quality metadata on the one hand and multilingual metadata on the other. Core vocabularies, as underlined in the best practices, already have a key role to play in ensuring common labels are applied. The multilingual thesaurus Eurovoc<http://eurovoc.europa.eu/> maps, at a European level, a number of key labels in 24 languages; languages that are moreover used beyond the strict borders of the European Union. This is a resource that can be valuable globally. Moreover, much work has been done in this field with respect to Controlled Vocabularies<http://joinup.ec.europa.eu/site/core_vocabularies/registry/adms-skos/> and Core Vocabularies<http://joinup.ec.europa.eu/site/core_vocabularies/registry/corevoc/>.

The provision of multilingual labels for vocabularies is certainly a 
best practice, but we feel this is more related to the development of 
vocabularies than to publishing datasets. Nevertheless, please note that 
BP 13 https://www.w3.org/TR/dwbp/#LocaleParametersMetadata encourages 
the use of locale-neutral formats.

But we have gone a little further. In the editor's draft of BP 15 
(http://w3c.github.io/dwbp/bp.html#ReuseVocabularies), I have appended 
the sentence "In the context of the Web, using unambiguous, Web-based 
identifiers (URIs) for standardized vocabulary resources is an efficient 
way to do this" with ", noting that the same URI may have multilingual 
labels attached for greater cross-border interoperability."
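
To illustrate the point about one URI carrying labels in several languages, 
here is a minimal sketch in Python. The concept URI and the three labels 
are hypothetical and chosen only for illustration; the skos:prefLabel 
property is the usual way to attach such labels, but nothing here is taken 
from the BP document itself.

```python
# Sketch: one concept URI, several language-tagged labels.
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Hypothetical concept URI and labels, for illustration only.
CONCEPT_URI = "http://example.org/vocab/transport"
LABELS = {"en": "Transport", "fr": "Transports", "de": "Verkehr"}


def to_ntriples(uri, labels):
    """Return one skos:prefLabel N-Triples line per language, sorted by tag.

    Assumes the label text needs no escaping (no quotes or backslashes).
    """
    return [
        f'<{uri}> <{SKOS_PREF_LABEL}> "{text}"@{lang} .'
        for lang, text in sorted(labels.items())
    ]


def pick_label(labels, preferred, fallback="en"):
    """Return the label in the preferred language, else the fallback one."""
    return labels.get(preferred, labels.get(fallback))


if __name__ == "__main__":
    for line in to_ntriples(CONCEPT_URI, LABELS):
        print(line)
    print(pick_label(LABELS, "de"))
```

A consumer in any country keeps the same identifier and only changes the 
language it asks for, which is where the cross-border benefit comes from.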

> -        The DCAT Application profile<http://joinup.ec.europa.eu/site/core_vocabularies/registry/dcat-ap/> for data portals in Europe (DCAT-AP) is a specification based on the Data Catalog Vocabulary (DCAT) for describing public sector datasets in Europe. Its basic use case is to enable cross-portal search for datasets and to make public sector data easier to find across borders and sectors. This can be achieved by the exchange of descriptions of datasets among data portals. The European Data Portal has built a mapping between a number of catalogue solutions and the DCAT-AP, for which the source code is open and available on GitLab<https://gitlab.com/groups/european-data-portal>. Moreover, the DCAT-AP builds on an existing W3C standard, which would only give further credit to and underline the relevance of the work conducted by W3C in this field for quite some time now.

I admit I thought we had already included a mention of DCAT-AP. I have 
added such a mention to BP 1 
http://w3c.github.io/dwbp/bp.html#ProvideMetadata where it now says

"when defining machine-readable metadata, reusing existing standard 
terms and popular vocabularies are strongly recommended. For example, 
Dublin Core Metadata (DCMI) terms [DCTERMS] and Data Catalog Vocabulary 
[VOCAB-DCAT] can be used to provide descriptive metadata. Such 
vocabularies are designed to be very flexible so it is often helpful to 
use a specific profile of a vocabulary such as the European Commission's 

> Finally, there is an aspect that could be further explored which is around monitoring the re-use of data on the web. What is W3C's view on this topic?

We have a whole new vocabulary for this https://www.w3.org/TR/vocab-duv/ 
a final version of which will be published very shortly (with a very 
small delta from what's there now). I will be promoting this and our 
data quality vocabulary as DCAT extensions at the SDSVoc workshop next week.
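
As a rough sketch of what recording a re-use might look like, the Python 
below builds a small JSON-LD description of a usage that points back at a 
dataset. The dataset and usage URIs and the title are hypothetical. The 
terms duv:Usage and duv:refersTo are used as I understand them from the 
vocabulary, so please check them against the published document before 
relying on this.

```python
import json

# Namespace prefixes used in the JSON-LD context.
CONTEXT = {
    "duv": "http://www.w3.org/ns/duv#",
    "dct": "http://purl.org/dc/terms/",
}


def usage_record(dataset_uri, usage_uri, title):
    """Describe one re-use of a dataset as a JSON-LD-style dictionary.

    Assumption: duv:refersTo links the usage to the dataset it re-uses.
    """
    return {
        "@context": CONTEXT,
        "@id": usage_uri,
        "@type": "duv:Usage",
        "dct:title": title,
        "duv:refersTo": {"@id": dataset_uri},
    }


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    record = usage_record(
        "http://example.org/dataset/bus-stops",
        "http://example.org/usage/journey-planner",
        "Journey planner built on the bus stops dataset",
    )
    print(json.dumps(record, indent=2))
```

A portal that collects such records per dataset has the raw material for 
monitoring re-use, which is the question raised above.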

Thanks again for your time and attention,


> As a disclaimer, please note that this contribution does not represent an endorsement by the European Institutions, nor the European Data Portal.
> Best regards,
> Wendy
> Wendy Carrara
> Project Manager European Data Portal
> Tel.: +33 (0) 1 49 67 31 68 - Mob.: +33 (0) 671 097 397
> wendy.carrara@capgemini.com - www.fr.capgemini.com


Phil Archer
Data Strategist, W3C


+44 (0)7887 767755
Received on Friday, 25 November 2016 22:14:16 UTC
