- From: Andrea Perego <andrea.perego@jrc.ec.europa.eu>
- Date: Tue, 31 May 2016 11:18:37 +0200
- To: W3C DWBP WG - Comments <public-dwbp-comments@w3.org>
Dear DWBP WG,
Congratulations on the great work done!
I would like to contribute a couple of comments on the BP document,
concerning data versioning and access. I don't know if they can be
addressed at this stage, but I thought they may be worth mentioning
- at least to know the position of the WG.
Thanks!
Andrea
----
1. Data versioning
The issue is about a specific metadata field, namely the date of last
modification of a dataset (dct:modified in DCAT).
This field conveys useful information for end users - e.g., they can
check whether the data are actually recent enough for their purposes -
and it is sometimes considered more important than the dataset issue date.
Going through the BP doc, I realised dct:modified occurs in just one of
the examples (#4), and it is not included in BP2 in the list of
recommended fields for datasets and distributions. There's actually
another field (dct:accrualPeriodicity) that is referred to from the data
versioning section, as a way to inform end users about the data update
frequency. Nonetheless, the two fields are not mutually exclusive, and
dct:accrualPeriodicity cannot replace dct:modified when the update
frequency is "irregular" or "unknown".
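To make the point concrete, here is a minimal sketch in Python of the
kind of check an end user might run. The metadata record, its values,
and the is_recent_enough helper are hypothetical and only meant as an
illustration; they are not taken from the BP document. It shows that the
check depends on dct:modified, and that dct:accrualPeriodicity alone
gives no answer when its value is "irregular".

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical metadata record; keys mirror the DCAT / DCTERMS property names.
dataset = {
    "dct:issued": date(2014, 3, 1),
    "dct:modified": date(2016, 5, 20),
    "dct:accrualPeriodicity": "irregular",
}


def is_recent_enough(metadata: dict, max_age: timedelta, today: date) -> Optional[bool]:
    """Check dataset freshness from dct:modified.

    Returns None when dct:modified is missing, because the update
    frequency alone does not say when the data last changed.
    """
    modified = metadata.get("dct:modified")
    if modified is None:
        return None
    return today - modified <= max_age


# With dct:modified the question can be answered directly.
fresh = is_recent_enough(dataset, timedelta(days=30), date(2016, 5, 31))

# Without it, only the (here uninformative) update frequency is left.
without_modified = {k: v for k, v in dataset.items() if k != "dct:modified"}
unknown = is_recent_enough(without_modified, timedelta(days=30), date(2016, 5, 31))
```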
May I ask what the position of the WG is on this issue?
2. Data access
There's a scenario that I'm not sure is addressed, at least
explicitly. This concerns data that, to be accessed, require users to
register. This is different from data that can be accessed only by
authorised users. It's basically just about preventing data from being
anonymously accessed, because, for some data providers, it is important
to know who downloads / uses the data.
This is quite common for research data, but there are also quite a few
examples from the public sector.
A first issue here is that, usually, this compulsory registration does
not result in clear benefits for end users, who may reasonably be
reluctant to provide personal information - which, in many cases, is
not limited to their email address: they are also asked to give their
real name, the organisation they work for, etc.
To address this, a recommendation to data providers could be: if you
require users to register / authenticate to get to the data, you should
explain (a) why, (b) how their personal information will be used, and
(c) what benefits (if any) they get (e.g., they will be allowed to
submit feedback, they will be notified of updates to the data they're
interested in).
I think this could be addressed by extending BP22 accordingly ("Provide
an explanation for data that is not available").
Another issue is that, although these data are open to everyone, the
need to authenticate creates a barrier to machine-based data access.
This can be addressed by supporting Web-based authentication /
authorisation protocols, but such support is usually lacking.
Of course, this applies as well to data subject to access control.
Maybe BP23 ("Make data available through an API") could be extended to
mention that, whenever direct data access is prevented, data providers
should support standard authentication / authorisation APIs.
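As a rough sketch of what machine-based access could look like in that
case, the Python snippet below builds an HTTP request carrying an OAuth
2.0 bearer token in the Authorization header. This assumes the provider
issues such a token to registered users, which is only one possible
standard mechanism. The URL, the token value, and the
build_authenticated_request helper are hypothetical, and the request is
only constructed here, not sent.

```python
import urllib.request


def build_authenticated_request(url: str, access_token: str) -> urllib.request.Request:
    """Build a GET request that identifies the registered user via a bearer token."""
    request = urllib.request.Request(url)
    # Standard HTTP Authorization header with the Bearer scheme.
    request.add_header("Authorization", f"Bearer {access_token}")
    # Ask for a specific distribution format (illustrative choice).
    request.add_header("Accept", "text/csv")
    return request


# Hypothetical distribution URL and token, for illustration only.
request = build_authenticated_request(
    "https://data.example.org/datasets/42/distribution.csv",
    "hypothetical-token",
)
```

A client would then pass the request to urllib.request.urlopen; with
this in place, a script can fetch the data without going through an
interactive login form.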
----
Received on Tuesday, 31 May 2016 09:19:34 UTC