Re: Comments on "Designing URI Sets for the UK Public Sector"

On Thu, Nov 11, 2010 at 05:23:08PM -0500, Young,Jeff (OR) wrote:
> One of my action items from the F2F was to comment on the URI patterns
> and functionality described in the "Designing URI Sets for the UK Public
> Sector" document:

(Copying some of the relevant people at DGU. This message is in
reply to this one:
http://lists.w3.org/Archives/Public/public-lld/2010Nov/0111.html
Also note that Jeff's nicely formatted HTML table was stripped out
by both my mailreader and the W3C list archives, so it is probably
best not to send HTML in email in general...)

Hi Jeff, I am replying to signal to the data.gov.uk folks that this
discussion is happening, and also because I have run into the
issues you describe in practice. A good example is the CKAN -> RDF
converter, which has to follow the CO guidelines when running
against the http://data.gov.uk site, and conventions that are more
like what you are describing when running against http://ckan.net/

The CO guidelines are a year old now, and lots of people have 
done quite a lot of thinking on the problem in the past year.
I think your comments are well-timed; a year on might be a good
time to have a brief review of the guidelines.

Comments below are only on the differences between your suggestion
and OKF's practice.

> (Real world Document)
> 
> http://education.data.gov.uk/school/78
> 
> (Generic) Document
> 
> http://education.data.gov.uk/school/78/

I don't quite understand what a "Generic Document" is, and the
difference between presence and absence of a slash is very slight
and likely to lead to confusion and bugs for people using the
data.
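
As a hypothetical illustration of the kind of bug I mean: it is
not unusual for client code to tidy up URIs by stripping a trailing
slash before using them as keys. A minimal sketch in Python, where
normalise() is an invented helper and not taken from any particular
library:

```python
def normalise(uri):
    """Strip trailing slashes, as ad hoc client code sometimes does."""
    return uri.rstrip("/")


real_world = "http://education.data.gov.uk/school/78"
generic_doc = "http://education.data.gov.uk/school/78/"

# The two URIs are distinct strings, meant to identify distinct things...
distinct_before = real_world != generic_doc

# ...but after this kind of normalisation they collapse into one key.
collapsed_after = normalise(real_world) == normalise(generic_doc)
```

Any consumer doing something like this would silently merge the
two resources.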

> (Web Document) Representation
> 
> http://education.data.gov.uk/school/78/doc.rdf

Why not just 78.rdf, 78.html, etc?
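
To make the suggestion concrete, here is a rough sketch in Python of
mapping between a resource URI and its 78.rdf / 78.html style
representations. The function names and the extension table are my
own assumptions for this example, not something from the CO
guidelines:

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

# Hypothetical extension table, for illustration only.
MEDIA_TYPES = {"rdf": "application/rdf+xml", "html": "text/html"}


def representation_uri(resource_uri, ext):
    """Append ".ext" to the last path segment of resource_uri."""
    if ext not in MEDIA_TYPES:
        raise ValueError("unknown extension: " + ext)
    parts = urlsplit(resource_uri)
    return urlunsplit(parts._replace(path=parts.path + "." + ext))


def split_representation(uri):
    """Return (resource_uri, ext), or (uri, None) if no known extension."""
    parts = urlsplit(uri)
    base, dot_ext = posixpath.splitext(parts.path)
    ext = dot_ext[1:]
    if ext in MEDIA_TYPES:
        return urlunsplit(parts._replace(path=base)), ext
    return uri, None
```

No reserved "doc." prefix is needed to recognise a representation;
the extension alone is enough.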

> Definition of the scheme concept
> 
> http://education.data.gov.uk/ontology/education/#School

The URI looks very strange. Obviously it is valid to have a
# immediately following a / but it still looks very strange.
And I don't see why ontology/education wouldn't be the
name of the ontology, with education.html being the human
readable documentation, and education.rdf being the machine
readable, and education#School being an identified fragment
in those docs.
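
A small sketch of that layout in Python, using urldefrag from the
standard library. The helper name ontology_documents() is made up
for this example:

```python
from urllib.parse import urldefrag


def ontology_documents(term_uri):
    """Split a hash URI into the ontology URI, its docs, and the fragment.

    Returns (ontology_uri, html_doc, rdf_doc, fragment).
    """
    ontology_uri, fragment = urldefrag(term_uri)
    return (
        ontology_uri,
        ontology_uri + ".html",  # human readable documentation
        ontology_uri + ".rdf",   # machine readable description
        fragment,                # identified fragment within those docs
    )


result = ontology_documents(
    "http://education.data.gov.uk/ontology/education#School"
)
```

Applying the same concatenation to the education/#School form
would give .../ontology/education/.rdf instead.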

> List of scheme identifiers
> 
> http://education.data.gov.uk/school/
> 
> Set
> 
> http://education.data.gov.uk/school

Again, this is a very big semantic difference for the presence
or absence of a / to signal. The way most people would
understand a trailing / is that it implies the string "index".
I realise this isn't RDF semantics but it is the behaviour
that everyone who has ever done any web development will 
expect. So why not school/schemes and school/all or something?
(along with schemes.html, schemes.rdf, all.html, all.rdf etc)?

> Because the resources aren't scattered over different top-level path
> segments, hackability is inherently improved. 

Agree++

> Also note that their URI pattern recognition for "(Web Document)
> Representation" depends on the trailing path segment starting with the
> letters "doc.". This is a serious limitation, IMO, caused by their
> willingness to stack concept/reference pairs in their URI. This
> limitation could be avoided by coining a formulaic or opaque token for
> the individual instead. (Roads and junctions have a nasty habit of
> changing "names" over time, so maybe opaque tokens would be better in
> these cases.)

This is clearly not a problem that is unique to DGU. The
problem with opaque identifiers is they don't make sense to
humans.

	http://ckan.net/package/statistics-data-gov-uk

is a lot better than

	http://ckan.net/package/b37a8465-e94f-4c84-95b9-dc3c2b2e1066

but the former as you rightly point out may change.

> Their stacked (Real world) Identifier:
> http://education.data.gov.uk/id/road/M5/junction/24
> 
> Formulaic alternative: http://education.data.gov.uk/junction/M5-24 

(s/education/transport/)

I agree your alternative here is more succinct and better for that
reason, but I'm not sure it solves the opaque and unreadable vs.
plastic and memorable problem.
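
For comparison, a sketch of both patterns in Python. The function
names are invented for this example, and the base URI assumes the
s/education/transport/ correction above:

```python
# Assumed base, following the s/education/transport/ correction.
BASE = "http://transport.data.gov.uk"


def stacked_uri(road, junction):
    """Stacked concept/reference pairs, as in the quoted DGU identifier."""
    return f"{BASE}/id/road/{road}/junction/{junction}"


def formulaic_uri(road, junction):
    """Formulaic alternative: one concept segment plus a composed token."""
    return f"{BASE}/junction/{road}-{junction}"


stacked = stacked_uri("M5", 24)
formulaic = formulaic_uri("M5", 24)

# Both forms still embed the road name, so both would change if the
# road were renamed.
both_embed_name = "M5" in stacked and "M5" in formulaic
```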

As I said, I think our approaches are very similar (modulo the
bit about the trailing /).

Cheers,
-w

Received on Friday, 12 November 2010 17:13:37 UTC