- From: Leigh Dodds <leigh@ldodds.com>
- Date: Thu, 4 Sep 2014 15:23:34 +0100
- To: Manuel CARRASCO-BENITEZ <Manuel.CARRASCO-BENITEZ@ec.europa.eu>
- Cc: public-dwbp-wg <public-dwbp-wg@w3.org>
Hi,

On Thu, Sep 4, 2014 at 2:52 PM, <Manuel.CARRASCO-BENITEZ@ec.europa.eu> wrote:
> Dear Leigh,
>
> As a WG for Data on the Web, there should be some guidelines on what data should be published "official data server", though the same contact might be available in other servers.
>
> The domain "europa.eu" states that it is an official European Union web site. So, it is related to reliability, timeliness and provenance.
>
> The question of collection vs. third level domain is *not* academic: it is very much a real and current issue.

There are issues around domain name management that relate to data publication, but I don't think a "data server" vs "service server" split helps.

Ideally I would suggest that all data and content would be available under a single domain. In the UK that might be gov.uk. But for historical reasons there is already data available from, e.g. data.gov.uk and also a number of {sector}.data.gov.uk domains. So in practice information may be split across different domains for any number of reasons, both practical and historical.

For the purposes of data publishing and re-use, the domain at which a dataset or a distribution can be downloaded, or at which Linked Data URIs are minted, is largely irrelevant. The URIs should be opaque.

I agree that using an authoritative domain helps to identify the source of that information, but it's only a relatively weak indicator compared to, say, explicit metadata about a dataset that indicates it contains provisional or otherwise inaccurate data, or that it was acquired from another source (e.g. data downloaded from a web archive).

So I don't think attempting to define two types of server and then recommend that certain types of information appear in one location or another really helps. In my experience, if machine-readable data and the equivalent human-readable content & functionality are available from the same location, then things are often much easier for both audiences.
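As a rough illustration of the point about opaque URIs and explicit metadata, here is a minimal Python sketch of how a data consumer might weigh the two signals. Everything named in it is an assumption made for this example: the metadata keys (`status`, `source`), the function name and the list of authoritative suffixes are hypothetical and not taken from any published vocabulary or guideline.

```python
from urllib.parse import urlparse

# Hypothetical list of domain suffixes a consumer chooses to treat as authoritative.
AUTHORITATIVE_SUFFIXES = ("gov.uk", "europa.eu")


def provenance_signal(dataset_uri, metadata=None):
    """Return (strength, reason) for what is known about a dataset's provenance.

    Explicit metadata is checked first; the domain is only a fallback hint.
    """
    metadata = metadata or {}

    # Hypothetical metadata keys: "status" and "source".
    if metadata.get("status") == "provisional":
        return ("strong", "metadata marks the data as provisional")
    if "source" in metadata:
        return ("strong", "metadata names the source: " + metadata["source"])

    # Fall back to the host name, which is at best a weak indicator.
    host = urlparse(dataset_uri).hostname or ""
    if any(host == s or host.endswith("." + s) for s in AUTHORITATIVE_SUFFIXES):
        return ("weak", "hosted on an authoritative domain: " + host)

    return ("none", "no provenance information available")
```

The ordering is the design choice being illustrated: the consumer never parses the path of the URI, and only looks at the host when the publisher has supplied no metadata at all.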
The issue I've seen around domain names, which I think the WG should address, relates to how a government (but potentially any large, distributed organisation) handles dividing up a domain name in order to delegate responsibility for portions of that URI space to parts of that organisation.

The UK public sector URI guidelines encourage a {sector}.data.gov.uk approach where one organisation manages the space, but any number of other organisations or service providers may be granted rights to publish services and data within that sub-domain. This avoids having organisation-specific sub-domains, which are more prone to change, but still allows organisations to take responsibility for certain datasets or URI patterns. The Australian government has adopted a similar convention, as it's proven to work.

However, an organisation like the BBC has preferred to adopt a path-based approach (bbc.co.uk/{sector}) as that better fits its goals to manage its URI space and internal project management & deployment.

So there isn't a one-size-fits-all approach. That's the kind of best practice that I think needs to be generalised and documented. It's also my issue with the COMURI specification, as it chooses (somewhat arbitrarily, I feel) a specific set of practices for URI construction, rather than acknowledging that there are several approaches which may have their own merits.

Cheers,

L.

-- 
Leigh Dodds
Freelance Technologist
Open Data, Linked Data Geek
t: @ldodds
w: ldodds.com
e: leigh@ldodds.com
Received on Thursday, 4 September 2014 14:24:05 UTC