Re: URI persistence Was: Use Case: BetaNYC 3/5

Hello Makx,

Sorry for delaying this discussion...


> What do you mean with “a shortage of unique identifiers”? I understand
> that the identifier space for telephone numbers is limited but in case of
> http-based URIs, I don’t see the need to re-assign the same identifier more
> than once. In my mind, we should indeed state as best practice not to
> re-assign identifiers. And then, yes, we can’t enforce good practice – but
> at least we could explain that re-assigning identifiers makes a big mess of
> things.
>
I mean that users who already use some kind of unique identifier internally
will be very tempted (or maybe even be advised by us?) to use it to
construct resource identifiers. This has several advantages, among them a
smooth transition from "legacy data" and easy linkability, but it puts the
same constraints on the HTTP identifier as there were on the original one.
For instance, in the Netherlands a postcode of four digits and two letters
points you to any street in the country (luckily it's a small one!). Turning
my postcode 1055HB into "http://data.overheid.nl/postcode/1055HB" will not
increase the size of the addressing space. It could be envisioned that new
postcodes become URI-based, but we can more reasonably count on a
traditional ID being mapped to LOD objects with, say, a relational-to-LOD
DB mapper.
The person from GS1 will correct me if I'm wrong, but I suspect that the
product bar-code also has a limited addressing space, as do the identifiers
given to cars, and more examples could certainly be found.
In general it seems to me that the people who put together this kind of
unique identifier wanted something intuitive to decode and to control. I
haven't seen any system using generated UUIDs for addressing something in a
unique way (except YAGO)... So my point is that data owners using some kind
of internal unique ID will most likely bind this ID to a domain name and
thus restrict the naming space of the LOD resources they publish. It will
potentially be tricky to discuss why re-using identifiers is not that messy
in datasets that evolve and why it becomes so bad when the dataset becomes
resources linked on the Web.
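
To make the postcode example concrete, here is a minimal sketch in Python. It
assumes the simplified format described above (four digits followed by two
letters) and uses the URI pattern from my example; the helper name mint_uri
and the constants are hypothetical, not an existing scheme or API. It shows
that binding the legacy ID to a domain name leaves the number of possible
identifiers unchanged.

```python
import re
import string

# Assumption: URI pattern taken from the example in this mail, not an official scheme.
BASE_URI = "http://data.overheid.nl/postcode/"

# Simplified format from the text: four digits followed by two letters.
POSTCODE_RE = re.compile(r"[0-9]{4}[A-Z]{2}")


def mint_uri(postcode: str) -> str:
    """Build an HTTP URI from a legacy postcode (hypothetical helper)."""
    code = postcode.replace(" ", "").upper()
    if not POSTCODE_RE.fullmatch(code):
        raise ValueError(f"not a 4-digit, 2-letter postcode: {postcode!r}")
    return BASE_URI + code


# Upper bound on the identifier space: 10^4 digit blocks times 26^2 letter pairs.
# Prefixing every code with BASE_URI does not add a single new identifier.
ADDRESS_SPACE = 10 ** 4 * len(string.ascii_uppercase) ** 2
```

Under these assumptions the space stays bounded at 10^4 * 26^2 codes, so any
pressure to re-assign a postcode carries over unchanged to its URI.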

> On the guarantee issue, my experience in talking to data providers is that
> no-one seems to be willing to make such strong statements. In general, the
> statement would be “as long as we have money to sustain it” or (in
> commercial cases) “as long as we can make money off of it”. I think that
> the best we, as a group, can do is to recommend that data publishers at
> least consider the question and think beyond the immediate future.
>
+1 !


> Depending on what stuff it is, they may need to think about the legal
> requirements and the expectations of the users of the data, and ideally
> they should design some form of policy for the time that they themselves
> can no longer maintain the material – which may or may not involve handing
> it over to someone else (e.g. some public or private archive).
>
+1 too. I'd be happy if we could suggest that they think about it and make
it possible for them to express the outcome of these thoughts, preferably in
a machine-readable way.

Christophe

-- 
Researcher
+31(0)6 14576494
christophe.gueret@dans.knaw.nl

*Data Archiving and Networked Services (DANS)*

DANS promotes sustainable access to digital research data. See
www.dans.knaw.nl for more information. DANS is an institute of KNAW and
NWO.


Please note: as of 1 January we have a new address:

DANS | Anna van Saksenlaan 51 | 2593 HW Den Haag | P.O. Box 93067 | 2509 AB
Den Haag | +31 70 349 44 50 | info@dans.knaw.nl |
www.dans.knaw.nl


*Let's build a World Wide Semantic Web!*
http://worldwidesemanticweb.org/

*e-Humanities Group (KNAW)*
http://www.ehumanities.nl/

Received on Monday, 24 March 2014 11:09:22 UTC