Re: comments on Data on the Web Best Practices

Dear Bernadette, sorry for my late reply. I read the new draft and, to me, it is a significant improvement over the first version.

With best regards,
Andrea Maurino


On 17 Jun 2015, at 23:57, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:

Dear Andrea,

As mentioned in my last message [1], we're planning to publish the 2nd draft of the DWBP document, and it is really important to know whether you agree with our comments about your feedback on the FPWD of the DWBP document.

If possible, please let us know if you agree with our comments no later than next Friday.

Thank you!
Bernadette

[1] https://lists.w3.org/Archives/Public/public-dwbp-wg/2015Jun/0022.html


2015-06-11 11:48 GMT-03:00 Bernadette Farias Lóscio <bfl@cin.ufpe.br>:
Dear Andrea Maurino,

Thanks a lot for your comments on the FPWD of the DWBP document! After gathering some feedback from the community, some changes were made and we're planning to publish a 2nd draft [1].

In the following, you can find some comments about your feedback on the FPWD.

Best Practice 3: Use standard terms to define metadata

Issue 6: IMHO, at least a well-defined subset of metadata terms MUST be described by means of standard terms and consequently expressed with a well-known RDF vocabulary. Such a mandatory list of metadata terms could include, for example, the owner, the type of license associated with the data, and the date of last modification.

Changes were made to the metadata section, and specific vocabularies are mentioned in the Possible Approach to Implementation section [2].

Best Practice 6: Provide data license information

According to the experience of the Comsode project, a license is a mandatory requirement for publishing data on the web: without a license there is no clear indication of the limits (if any) on the usability of the data, and this lack significantly reduces the possibility of having a real web of data. One could suggest that when someone publishes data without a license, this implies that the data can be consumed for free by both humans and machines, but cannot be modified, reused, and so on without the explicit consent of the data owner.

I am not sure we can make such a suggestion, because this may depend on the policies of the organization. I think we can only suggest that data license information should be available.

Best Practice 8: Provide data quality information

Issue 7: I suggest outlining some strategies for how to attach quality information. In some cases such information is defined inside the data (for example, when the time of last modification of an item is part of the dataset itself); in other situations there is a need to express quality dimensions related to the schema description only (e.g. conciseness of the schema), or related to the dataset. I also suggest (though it is clear that I'm a little biased on this topic :) ) better describing how to express the quality information (including quality dimensions, the adopted quality metric, and the quality value; see for example [1] as a starting point).

Thanks a lot for your suggestions, but I suggest keeping this discussion for the Data Quality Vocabulary document [3].

Best Practice 9: Provide versioning information

This is a crucial problem, in particular for linked data, because of the possible impact on existing interlinked resources. Some good practices could be discussed.

In the current version of the document there is a section on Data Versioning [4], and two BPs (Provide versioning information and Provide version history) are proposed.


Best Practice 20: Preserve people's right to privacy

This is a big issue: while it is right to protect people's right to privacy, there is also the "right to know" about the activities of public administrations (for example, legal sentences). In Italy, just as an example, personal information, including salary, about people working at senior levels in public administration or consultants paid with public money must be released as open data for 5 years under the Italian transparency decree (after that period there is the "right to be forgotten", which many of you know from the Google vs. European Union case).

Thus I suggest changing the best practice to "Data publishers should preserve the privacy of individuals according to the law of the country of the data owner".

Some actions were taken to change the BP for Sensitive Data [5]. Changes will be made in the next version.


Best Practice 25: Provide data up to date

Please consider that this BP is closely related to the data quality BP because of the way temporal quality dimensions are calculated; the two BPs must be correlated and coherent.

This BP concerns how to keep data up to date, rather than providing information on whether data is being updated as expected. I think that the discussion about data quality assessment is in the scope of the Data Quality Vocabulary [3].

Kind regards,
Bernadette

[1] http://w3c.github.io/dwbp/bp.html
[2] http://w3c.github.io/dwbp/bp.html#metadata
[3] http://w3c.github.io/dwbp/vocab-dqg.html
[4] http://w3c.github.io/dwbp/bp.html#dataVersioning
[5] http://w3c.github.io/dwbp/bp.html#sensitive

Best regards
Andrea Maurino

[1] http://ceur-ws.org/Vol-1184/ldow2014_paper_09.pdf

--
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------

Received on Friday, 19 June 2015 10:39:12 UTC