Re: ISSUE: Governmental data servers

Hi,

On Thu, Sep 4, 2014 at 2:52 PM,  <Manuel.CARRASCO-BENITEZ@ec.europa.eu> wrote:
> Dear Leigh,
>
> As a WG for Data on the Web, there should be some guidelines on what data should be published "official data server", though the same contact might be available in other servers.
>
> The domain "europa.eu" states that it is an official European Union web site. So,  it is related to reliability, timeliness and provenance.
>
> The question of collection vs. third level domain is *not* academic: it is very much a real and current issue.

There are issues around domain name management that relate to data
publication, but I don't think a "data server" vs "service server"
split helps. Ideally I would suggest that all data and content would
be available under a single domain. In the UK that might be gov.uk.
But for historical reasons there is already data available from, e.g.
data.gov.uk and also a number of {sector}.data.gov.uk domains. So in
practice information may be split across different domains for any
number of reasons, both practical and historical.

For the purposes of data publishing and re-use, the domain at which a
dataset or a distribution can be downloaded, or at which Linked Data
URIs are minted, is largely irrelevant. The URIs should be opaque. I
agree that using an authoritative domain helps to identify the source
of that information, but it's only a relatively weak indicator compared
to, say, explicit metadata about a dataset that indicates it contains
provisional or otherwise inaccurate data, or data acquired from
another source (e.g. data downloaded from a web archive).
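
To make the "weak vs. explicit" point concrete, here is a minimal
sketch in Python. The metadata keys ("status", "acquired_from"), the
suffix list and the example URL are my own assumptions for
illustration, not taken from any vocabulary or specification.

```python
from urllib.parse import urlsplit

# Hypothetical list of "official" domain suffixes (illustration only).
OFFICIAL_SUFFIXES = ("gov.uk", "europa.eu")


def provenance_hints(download_url, metadata):
    """Collect provenance hints, labelling each as weak or strong.

    The domain only ever yields a weak hint; explicit metadata
    (hypothetical keys "status" and "acquired_from") yields strong ones.
    """
    host = urlsplit(download_url).hostname or ""
    hints = []
    if any(host == s or host.endswith("." + s) for s in OFFICIAL_SUFFIXES):
        hints.append("weak: served from an official-looking domain")
    if metadata.get("status") == "provisional":
        hints.append("strong: metadata marks the data as provisional")
    if metadata.get("acquired_from"):
        hints.append(
            "strong: metadata says the data was acquired from "
            + metadata["acquired_from"]
        )
    return hints


hints = provenance_hints(
    "http://data.gov.uk/example.csv",  # hypothetical download URL
    {"status": "provisional"},
)
```

The point is that a consumer who only looks at the hostname would
miss the "provisional" flag entirely.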

So I don't think attempting to define two types of server and then
recommend that certain types of information appear in one location or
another really helps. In my experience, if machine-readable data and
the equivalent human-readable content & functionality are available
from the same location, then things are often much easier for both
audiences.

The issue I've seen around domain names, which I think the WG should
address, relates to how a government (but potentially any large,
distributed organisation) handles dividing up a domain name in order
to delegate responsibility for portions of that URI space to parts of
that organisation.

The UK public sector URI guidelines encourage a {sector}.data.gov.uk
approach where one organisation manages the space, but any number of
other organisations or service providers may be granted rights to
publish services and data within that sub-domain. This avoids having
organisation-specific sub-domains, which are more prone to change, but
still allows organisations to take responsibility for certain datasets
or URI patterns. The Australian government has adopted a similar
convention, as it's proven to work.

However, an organisation like the BBC has preferred to adopt a
path-based approach (bbc.co.uk/{sector}) as that better fits its goals to
manage its URI space and internal project management & deployment. So
there isn't a one-size-fits-all approach.
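
For illustration, the two conventions can be sketched side by side in
Python. This is only a rough sketch: the function names, the sector
values ("transport", "nature") and the paths are hypothetical examples
of mine, not URIs taken from either organisation's guidelines.

```python
from urllib.parse import urlsplit


def subdomain_uri(sector, path, base="data.gov.uk"):
    """Sub-domain convention: http://{sector}.data.gov.uk/{path}"""
    return f"http://{sector}.{base}/{path.lstrip('/')}"


def path_uri(sector, path, base="bbc.co.uk"):
    """Path convention: http://bbc.co.uk/{sector}/{path}"""
    return f"http://{base}/{sector}/{path.lstrip('/')}"


def sector_of(uri, style):
    """Recover the sector, i.e. the part of the URI space that has
    been delegated, under either convention."""
    parts = urlsplit(uri)
    if style == "subdomain":
        return parts.hostname.split(".")[0]
    if style == "path":
        return parts.path.lstrip("/").split("/")[0]
    raise ValueError(f"unknown style: {style}")


gov = subdomain_uri("transport", "id/station/1")  # hypothetical sector/path
bbc = path_uri("nature", "species/example")       # hypothetical sector/path
```

Either way the delegated unit is the sector; the only difference is
whether it lives in the hostname or in the first path segment.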

That's the kind of best practice that I think needs to be generalised
and documented. It's also my issue with the COMURI specification, as it
chooses (somewhat arbitrarily, I feel) a specific set of practices for
URI construction, rather than acknowledging that there are several
approaches which may have their own merits.

Cheers,

L.

-- 
Leigh Dodds
Freelance Technologist
Open Data, Linked Data Geek
t: @ldodds
w: ldodds.com
e: leigh@ldodds.com

Received on Thursday, 4 September 2014 14:24:05 UTC