Comments on Best Practice document

All,

First of all, apologies that I haven't been able to give proper
attention to this work over the past couple of weeks due to heavy
workload and family matters.

I have now had a chance to read through the current version and I'd like
to congratulate the people who worked so hard on it. It is coming
together quite nicely!

However, I do have some comments. They are not show-stoppers, but
perhaps things we can discuss for future versions.

My general point is a question that has been lingering in my mind for a
while now. The question is: how do we determine what is "best" practice?
Reading the document, I can sort of understand why the things mentioned
in the titles and summary statements of the BPs are reasonable things to
do, but for things to be declared "best" practice, I would maybe expect
a brief analysis of the options and relative merits of possible
alternatives. Currently, the best practices described are what some of
us, or all of us in this small group, think is the best way to meet a
requirement. But is that sufficient justification?

An example is Best Practice 1, "Provide metadata", which says that data
on the Web MUST be described by metadata. The "Why" section makes a
general statement that without metadata, you can't find anything. A
reader might ask: "What about Google? They've been working fine without
metadata!" Some people might create a landing page for their dataset and
do some smart SEO on it. What would our response be?

Further down it says that DCAT should be used to describe datasets as a
whole. Now someone might ask: "So is publishing in the CKAN metadata
format <http://ckan.org/features-1/metadata/> or as a schema.org Dataset
<http://schema.org/Dataset> somehow worse practice?" To be clear, I do
strongly agree that people should provide metadata and I do agree that
DCAT has certain advantages, but shouldn't the BP say why the proposed
approach is better than others?

Reading on, I could come up with similar questions for many of the other
Best Practices. In many cases, the "Why" section explains why we think
a best practice is needed for a particular aspect, but it does not
always clearly justify the specific best practice proposed. The possible
approaches sometimes contain statements that we might consider
self-evident, universal truths, e.g. "Metadata is best provided using
RDF vocabularies". Some people might agree, some might not. In this
case, someone who works in a JSON or XML environment might not agree for
practical reasons, and happily ignore the best practice. Is that what we
want?

I don't want to delay this version but maybe we can have some of these
issues on the agenda of the group for further discussion.

Makx.

Received on Thursday, 22 January 2015 18:18:35 UTC