Re: A better solution for legacy IDs? from stu on 2011-12-13 (public-lld@w3.org from December 2011)

From: stu <stuart.weibel@gmail.com>
Date: Tue, 13 Dec 2011 08:17:35 -0800
To: William Waites <wwaites@tardis.ed.ac.uk>
Cc: pohl@hbz-nrw.de, public-lld@w3.org
Message-ID: <CAHMkD5=Gbo3Pp3XfEtpZhMNJ4ZS6ry-C5JbAgzuL0TQYik3nqw@mail.gmail.com>
The problem that Karen identifies has plagued us for a very long time.  It
was there before the Web, but was a rather small part of the architectural
challenge faced in our information supply chains.  The identifiers worked
well enough within their particular domains that a universal solution was
not necessary.

The Web makes a solution all the more critical, but the political and
economic impediments to achieving it are, if anything, greater than ever.

The community can go ahead and cobble together 'solutions' with the digital
equivalent of chickenwire and chewing gum, but until publishers, national
libraries, and vendors (well, at least one I can think of) identify a common
solution, the problem will get worse and be more difficult to solve over
time.

Karen identified the essential sticking point: the cost of maintaining the
rickety scaffolding of identity over time.

I wrote about this problem in a blog post last April:
http://weibel-lines.typepad.com/weibelines/2011/04/a-modest-proposal.html

The essence of that post is appended below for anyone yet unsated with the
argument.  The proposed solution may not be the best one.  I'd be
interested to hear alternatives.

------
Text from my blog post:

There is growing awareness of the need for management of namespaces for
public vocabularies and for identification of authority data and other
bibliographic entities.  For every FRBR entity, an identifier. There are
three essential characteristics of such identifiers:

   1. Global uniqueness: a natural benefit of the Internet as a global file
   system.
   2. Persistence: a function of the commitment of the organizations.
   3. Canonical character: assets that have a single preferred identifier
   will aggregate with greater visibility, strengthening citability and the
   benefits of social bibliography.

The first attribute comes for free with the use of Internet protocols.  The
second has to do with the character of organizations, and no collection of
organizations is more likely to succeed at this than the global library
community (or has a stronger stake in that success).  The third
characteristic is the missing piece.  Libraries have paid insufficient
attention to identifier strategies, and the near-term business case for
collective action is weak, even as the long-term imperative is strong.
Think of it as the *tragedy of the dot commons*.

National libraries lack a global mandate.  No software provider has
sufficient reach. *Open Library* has the right philosophical orientation,
but lacks standing and the promise of professionally-mediated persistence.
Google can achieve the goal, but a shareholder business-model is a mismatch
with the long-term social commitment required.  IFLA is a good forum for
consensus, but lacks appropriate operational capacity.  OCLC has the
critical seed data, but has insufficient business motivations to commit the
resources, and is not considered entirely neutral by some stakeholders.

Many respected library technologists simply accept that *Identifier Babel*
is a fact of a complicated environment, and the best that can be hoped for
is identity mapping.  To accept this position is to accept a declining
visibility on the Web (already low) for the intellectual assets managed by
libraries, and over the long run, a commensurate decline in importance of
libraries in a born-digital world.

ICANN is on the threshold of a new process for the approval of generic Top
Level Domains (gTLDs), open to applications from any established public or
private organization. The application fee of $185,000 USD can be expected
to reduce inclinations towards gTLD squatting, but still there will be a
lot of interest in this new approach to Web identifiers.

The international library community should take advantage of this
opportunity to establish a business-neutral, canonical naming authority
that will assure that digital library users of the future will benefit from
the fixity of resources on a sound and persistent global scaffolding of
knowledge.

This goal can best be met through an alliance of stakeholders who share
identifier policies, infrastructure, and transparent governance, guided by
a common responsibility to the needs of users, now and into the future.




On Tue, Dec 13, 2011 at 7:01 AM, William Waites <wwaites@tardis.ed.ac.uk> wrote:

> On Tue, 13 Dec 2011 14:02:21 +0100, "Adrian Pohl" <pohl@hbz-nrw.de> said:
>
>    adrian> <info:0915145537>
>
> I kind of want to say that this shouldn't be shoehorned into a
> non-resolvable URI but instead should be a datatype. It's a special
> string. So,
>
> "0915145537"^^xyz:isbn
>
> then you can just use dc:identifier...
>
> Maybe not so obvious with bibliographic identifiers, but with some
> other kinds of literals (e.g. weights and measures - thanks mmmmmrob)
> it starts seeming quite strange to put what is really a datatype into
> the meaning of the predicate...
>
> Cheers,
> -w
> --
>               William Waites <wwaites@tardis.ed.ac.uk>
>  Visiting Researcher, Laboratory for Foundations of Computer Science
>            School of Informatics, University of Edinburgh
>
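
For anyone who wants to see the typed-literal idea quoted above in concrete
terms, here is a minimal sketch in Python.  It assumes the identifier is an
ISBN-10 and reuses the placeholder datatype name xyz:isbn from the quoted
message; the function names are hypothetical and not part of any library or
proposal in this thread.

```python
import re

# Placeholder datatype name taken from the quoted message; a real
# vocabulary would need an agreed, resolvable datatype IRI.
ISBN_DATATYPE = "xyz:isbn"


def is_valid_isbn10(value: str) -> bool:
    """Return True if value is a well-formed ISBN-10 with a correct check digit."""
    # Lexical form: nine digits followed by a digit or 'X' (which stands for 10).
    if not re.fullmatch(r"[0-9]{9}[0-9X]", value):
        return False
    # Weighted sum: the first character has weight 10, the last has weight 1.
    total = sum(
        (10 - i) * (10 if ch == "X" else int(ch))
        for i, ch in enumerate(value)
    )
    return total % 11 == 0


def isbn_literal(value: str) -> str:
    """Format a validated ISBN-10 as a typed literal in Turtle-like syntax."""
    if not is_valid_isbn10(value):
        raise ValueError(f"not a valid ISBN-10: {value!r}")
    return f'"{value}"^^{ISBN_DATATYPE}'


if __name__ == "__main__":
    # The ISBN from Adrian's example in the quoted message.
    print(isbn_literal("0915145537"))
```

The point of the sketch is only that the datatype, rather than the predicate
or a non-resolvable URI, carries the knowledge of what kind of string this
is, so the lexical check lives in one place.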
Received on Tuesday, 13 December 2011 16:18:13 UTC