- From: Leigh Dodds <leigh@ldodds.com>
- Date: Thu, 4 Sep 2014 15:23:34 +0100
- To: Manuel CARRASCO-BENITEZ <Manuel.CARRASCO-BENITEZ@ec.europa.eu>
- Cc: public-dwbp-wg <public-dwbp-wg@w3.org>
Hi,
On Thu, Sep 4, 2014 at 2:52 PM, <Manuel.CARRASCO-BENITEZ@ec.europa.eu> wrote:
> Dear Leigh,
>
> As a WG for Data on the Web, there should be some guidelines on what data should be published "official data server", though the same contact might be available in other servers.
>
> The domain "europa.eu" states that it is an official European Union web site. So, it is related to reliability, timeliness and provenance.
>
> The question of collection vs. third level domain is *not* academic: it is very much a real and current issue.
There are issues around domain name management that relate to data
publication, but I don't think a "data server" vs "service server"
split helps. Ideally I would suggest that all data and content would
be available under a single domain. In the UK that might be gov.uk.
But for historical reasons there is already data available from, e.g.
data.gov.uk and also a number of {sector}.data.gov.uk domains. So in
practice information may be split across different domains for any
number of reasons, both practical and historical.
For the purposes of data publishing and re-use, the domain at which a
dataset or a distribution can be downloaded, or at which Linked Data
URIs are minted, is largely irrelevant. The URIs should be opaque. I
agree that using an authoritative domain helps to identify the source
of that information, but it's only a relatively weak indicator compared
to, say, explicit metadata about a dataset that indicates it contains
provisional or otherwise inaccurate data. Or data acquired from
another source (e.g. data downloaded from a web archive).
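To illustrate why I'd weight explicit metadata above the domain, here is
a rough Python sketch. The record fields ("download_url", "status",
"source") and the example URL are invented for illustration; they are
not taken from any real catalogue or vocabulary.

```python
from urllib.parse import urlparse

def trust_signals(record, official_suffix="gov.uk"):
    """Collect provenance hints for a dataset record (hypothetical fields)."""
    host = urlparse(record["download_url"]).hostname or ""
    # The domain is only a weak hint: it says who hosts the file,
    # not whether the data is final, provisional or copied from elsewhere.
    on_official_domain = (
        host == official_suffix or host.endswith("." + official_suffix)
    )
    return {
        "on_official_domain": on_official_domain,
        # Explicit metadata carries the stronger signal.
        "status": record.get("status", "unknown"),
        "source": record.get("source", "unknown"),
    }

# Illustrative record: hosted on an official domain, yet still provisional.
record = {
    "download_url": "http://data.gov.uk/dataset/example.csv",
    "status": "provisional",
    "source": "web archive copy",
}
signals = trust_signals(record)
```

Here the domain check passes, but the metadata still tells a re-user
that the data is provisional and second-hand.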
So I don't think attempting to define two types of server and then
recommend that certain types of information appear in one location or
another really helps. In my experience, if machine-readable data and
the equivalent human-readable content & functionality are available
from the same location, then things are often much easier for both
audiences.
The issue I've seen around domain names, which I think the WG should
address, relates to how a government (but potentially any large,
distributed organisation) handles dividing up a domain name in order
to delegate responsibility for portions of that URI space to parts of
that organisation.
The UK public sector URI guidelines encourage a {sector}.data.gov.uk
approach where one organisation manages the space, but any number of
other organisations or service providers may be granted rights to
publish services and data within that sub-domain. This avoids having
organisation-specific sub-domains which are more prone to change, but
still allows organisations to take responsibility for certain datasets
or URI patterns. The Australian government has adopted a similar
convention, as it's proven to work.
However, an organisation like the BBC has preferred to adopt a
path-based approach (bbc.co.uk/{sector}) as that better fits its goals to
manage its URI space and internal project management & deployment. So
there isn't a one-size-fits-all approach.
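A small Python sketch of the two layouts, just to make the difference
concrete. The function names and the example URIs are my own invention;
only the {sector}.data.gov.uk and bbc.co.uk/{sector} patterns come from
the discussion above.

```python
from urllib.parse import urlparse

def sector_from_subdomain(uri, base="data.gov.uk"):
    """Sub-domain layout: {sector}.data.gov.uk/..."""
    host = urlparse(uri).hostname or ""
    suffix = "." + base
    if host.endswith(suffix):
        return host[: -len(suffix)]
    return None  # bare base domain or an unrelated host

def sector_from_path(uri, base="bbc.co.uk"):
    """Path layout: bbc.co.uk/{sector}/..."""
    parts = urlparse(uri)
    host = parts.hostname or ""
    if host == base or host.endswith("." + base):
        segments = [s for s in parts.path.split("/") if s]
        return segments[0] if segments else None
    return None

# Hypothetical URIs, for illustration only.
a = sector_from_subdomain("http://transport.data.gov.uk/id/station/123")
b = sector_from_path("http://www.bbc.co.uk/programmes/example")
```

Either way the "sector" is recoverable by whoever manages the space;
the choice is about how responsibility is delegated (DNS vs. routing),
not about what a consumer of the URIs should infer from them.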
That's the kind of best practice that I think needs to be generalised
and documented. It's also my issue with the COMURI specification as it
chooses (somewhat arbitrarily I feel) a specific set of practices for
URI construction, rather than acknowledging that there are several
approaches which may have their own merits.
Cheers,
L.
--
Leigh Dodds
Freelance Technologist
Open Data, Linked Data Geek
t: @ldodds
w: ldodds.com
e: leigh@ldodds.com
Received on Thursday, 4 September 2014 14:24:05 UTC