Re: A better solution for legacy IDs?

On Tue, Dec 13, 2011 at 03:01:08PM +0000, William Waites wrote:
> I kind of want to say that this shouldn't be shoehorned into a
> non-resolvable URI but instead should be a datatype. It's a special
> string. So,
> 
> "0915145537"^^xyz:isbn
> 
> then you can just use dc:identifier...
> 
> Maybe not so obvious with bibliographic identifiers, but with some
> other kinds of literals (e.g. weights and measures - thanks mmmmmrob)
> it starts seeming quite strange to put what is really a datatype into
> the meaning of the predicate...

And Karen wrote:
> but I wonder if the answer wouldn't have been different if you had
> said that you would have many thousands of different identifiers.

A historical note... Ten years ago, the DCMI Usage Board discussed a mechanism
for allowing the general public to propose "encoding schemes", either to
identify controlled vocabularies of values or as "syntactic" encoding schemes
(see Latest Version at [1]).  At the time, the notion of "encoding scheme" had
not yet been clearly differentiated into "vocabulary encoding schemes" and
datatypes.

The idea was to enable people to "qualify" string values for, say, Subject,
with a reference to the particular set of subject headings from which the
string value had been taken.  It was motivated primarily by a desire to qualify
values for dc:subject but was in principle applicable to any other property,
e.g., dc:identifier.

Discussion simmered for two full years before the idea was finally abandoned
for many reasons, among them (as I recall):

-- The long-term maintenance burden of keeping a growing database up-to-date,
   especially for a lightweight organization such as DCMI.

-- Questions around whether to coin encoding schemes for specific versions
   of a controlled vocabulary or syntax scheme, and if so, how the encoding 
   schemes for specific versions should relate to each other.  Also, who 
   should judge whether a new version of a controlled vocabulary should trigger
   the creation of a new encoding scheme?

-- Questions around how to version individual encoding schemes themselves as the 
   associated links and contact information were updated.

-- Doubts about the etiquette or even legality of assigning encoding schemes
   to denote other people's intellectual property or brands.

-- The sustainability of persistence policies, processes for resolving name
   clashes or change requests, etc.

Given the diversity of identifier schemes, it strikes me that regardless of
whether one were to solve the problem today by coining properties (e.g., as
subproperties of dc:identifier) or by coining datatypes, some of the issues
w.r.t. versioning, persistence, and maintainability might well be the same.
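
To make the two options concrete, here is a rough sketch that writes the same
ISBN both ways as N-Triples-style lines.  The xyz namespace URI, the book URI,
and the helper names are placeholders I made up for illustration; only the
dc:identifier and rdfs:subPropertyOf URIs are real.  No literal escaping is
done, so this only suits simple strings such as ISBNs.

```python
DC = "http://purl.org/dc/elements/1.1/"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
XYZ = "http://example.org/xyz#"        # hypothetical namespace for "xyz:"
BOOK = "http://example.org/book/1"     # hypothetical resource being described


def iri(uri):
    """Wrap a URI in angle brackets."""
    return f"<{uri}>"


def triple(subject, predicate, obj):
    """Join three already-formatted terms into one statement line."""
    return f"{subject} {predicate} {obj} ."


def as_datatype(isbn):
    """Option 1: keep dc:identifier and put the scheme in a coined datatype."""
    literal = f'"{isbn}"^^{iri(XYZ + "isbn")}'
    return [triple(iri(BOOK), iri(DC + "identifier"), literal)]


def as_subproperty(isbn):
    """Option 2: coin a subproperty of dc:identifier and use a plain literal."""
    return [
        triple(iri(XYZ + "isbn"), iri(RDFS + "subPropertyOf"),
               iri(DC + "identifier")),
        triple(iri(BOOK), iri(XYZ + "isbn"), f'"{isbn}"'),
    ]


if __name__ == "__main__":
    for line in as_datatype("0915145537") + as_subproperty("0915145537"):
        print(line)
```

Either way, somebody has to coin and maintain the xyz:isbn URI; the sketch
only moves it between the predicate position and the datatype position.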

Tom

[1] http://dublincore.org/usage/documents/2003/05/15/vocabulary-guidelines/

-- 
Tom Baker <tom@tombaker.org>

Received on Wednesday, 14 December 2011 21:14:11 UTC