RE: [BioRDF] Taxonomic Databases Working Group and LSIDs

[Delayed follow-up]

Donald,

Thanks for your thoughtful responses.  More comments/questions below.

> From: Donald Hobern [mailto:dhobern@gbif.org] 
. . .
> Probably the greatest concern with the approach you describe is a
> sociological one.  If we adopt any system consisting simply of HTTP
> URIs, there will be a tendency for issuers simply to put forward
> existing URLs and declare them to be GUIDs without making the
> effort to consider permanence, etc.  We felt therefore that having
> at least some distinction between a GUID and "normal" URLs was
> important.  Issuing LSIDs is not a major hurdle but it may be
> enough for this purpose.

If you want a recognizable syntactic distinction then you could advise
issuers to use a specific http prefix such as "http://lsid.tdwg.org?".
Do you think such guidance would lead to less compliance than telling
them to use a "urn:lsid:" prefix?  

> 
> It was also felt that fully decentralised ownership of the
> identifiers was a good thing, rather than linking them all
> explicitly e.g. to some proxy at http://lsid.tdwg.org.  Many
> institutions value their independence.

The ability to dereference an http URI directly via http does create a
dependency on the organization that owns the domain of that URI.  This
is unavoidable.  But does this really need to be a show stopper?  I see
two ways to overcome this:

1. Decide that this dependency is minimal and acceptable: merely
forwarding the request, much like purl.org does.  After all, if the
overall purpose of these identifiers is for a community to ensure shared
meaning of these identifiers, surely that community could somehow create
an organization that is trustworthy enough to administer a web server
for this purpose.

2. Each institution could define its own http prefix.  This means that
software would have to recognize many specialized http prefixes instead
of just "http://lsid.tdwg.org?", for example.  But this is quite
feasible, because each institution could provide metadata that declares
the prefix that it uses, so that when software first encounters a URI
that uses a previously unknown prefix, dereferencing that URI could
yield metadata that declares the prefix.  For example, if the MyOrg
institution uses "http://lsid.myorg.example?" as a prefix, then
dereferencing a URI of the form http://lsid.myorg.example?foo could
yield metadata that indicates that any URI of the form
http://lsid.myorg.example?foo has the same meaning as the URI
urn:lsid:foo .
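
To make option 2 a little more concrete, here is a rough Python sketch of
how software might keep a set of http prefixes it treats as equivalent to
"urn:lsid:".  The names (KNOWN_LSID_PREFIXES, register_prefix, to_urn,
to_http) are hypothetical, and the step that fetches and reads the
prefix-declaring metadata is assumed rather than shown, since no format
for that metadata has been agreed.

```python
from typing import Optional

URN_PREFIX = "urn:lsid:"

# Hypothetical registry of http prefixes known to be equivalent to
# "urn:lsid:".  It starts with the single shared prefix from option 1.
KNOWN_LSID_PREFIXES = {"http://lsid.tdwg.org?"}


def register_prefix(prefix: str) -> None:
    """Record a prefix learned from an institution's metadata.

    Assumption: the caller has already dereferenced a URI and found
    metadata declaring this prefix; that step is not modelled here.
    """
    KNOWN_LSID_PREFIXES.add(prefix)


def to_urn(uri: str) -> Optional[str]:
    """Return the urn:lsid: form of uri, or None if no known prefix matches."""
    if uri.startswith(URN_PREFIX):
        return uri
    for prefix in KNOWN_LSID_PREFIXES:
        if uri.startswith(prefix):
            return URN_PREFIX + uri[len(prefix):]
    return None


def to_http(urn: str, prefix: str = "http://lsid.tdwg.org?") -> str:
    """Return the http form of a urn:lsid: identifier under the given prefix."""
    if not urn.startswith(URN_PREFIX):
        raise ValueError("not a urn:lsid: identifier: " + urn)
    return prefix + urn[len(URN_PREFIX):]
```

With this, "http://lsid.myorg.example?foo" is unrecognized until
register_prefix("http://lsid.myorg.example?") has been called, after which
to_urn maps it to "urn:lsid:foo".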

Would either of these approaches seem reasonable?  If not, why not?

> 
> Clearly, as you describe, it is quite possible to layer additional
> restrictions and semantics on standard URIs to support any
> functionality desired.  In our case we saw greater benefits in not
> having to define all of these things from scratch.

Can you elaborate on what you mean?  Surely defining and implementing a
new URN subscheme from scratch is more work than layering restrictions
and semantics on standard http URIs.  On the other hand, given that the
LSID subscheme has already been defined, the layering on http URIs could
be done trivially, merely by referencing the LSID specification.  
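
As a rough illustration of that layering, the following Python sketch
parses the same LSID components (authority, namespace, object, optional
revision, as laid out in the LSID specification) from either the
"urn:lsid:" form or the http-prefixed form.  The names Lsid and parse_lsid
and the identifier "myorg.example:names:12345" are made up for
illustration; this is a sketch under those assumptions, not a full
implementation of LSID syntax checking.

```python
from typing import NamedTuple, Optional

URN_PREFIX = "urn:lsid:"
HTTP_PREFIX = "http://lsid.tdwg.org?"  # the example prefix from this thread


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(uri: str) -> Lsid:
    """Split an LSID, in urn or http-prefixed form, into its components."""
    for prefix in (URN_PREFIX, HTTP_PREFIX):
        if uri.startswith(prefix):
            rest = uri[len(prefix):]
            break
    else:
        raise ValueError("unrecognized prefix: " + uri)

    # After the prefix: authority:namespace:object[:revision]
    parts = rest.split(":")
    if len(parts) not in (3, 4) or not all(parts[:3]):
        raise ValueError("malformed LSID: " + uri)
    revision = parts[3] if len(parts) == 4 else None
    return Lsid(parts[0], parts[1], parts[2], revision)
```

Both parse_lsid("urn:lsid:myorg.example:names:12345") and
parse_lsid("http://lsid.tdwg.org?myorg.example:names:12345") yield the
same components, so everything downstream of the prefix check can be
shared between the two forms.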

> 
> I'm not sure if this properly responds to your question, but 
> these were the factors that made a difference in our case.
> 
> Many thanks,
> 
> Donald
> 
> ------------------------------------------------------------
> Donald Hobern (dhobern@gbif.org)
> Deputy Director for Informatics 
> Global Biodiversity Information Facility Secretariat 
> Universitetsparken 15, DK-2100 Copenhagen, Denmark
> Tel: +45-35321483   Mobile: +45-28751483   Fax: +45-35321480
> ------------------------------------------------------------
> 
> 
> 
> Booth, David (HP Software - Boston) wrote:
> > Donald,
> >  
> > For the most part, http URIs can be designed (using specialized
> > prefixes) to provide all the benefits of any new URI scheme or URN
> > sub-scheme, plus more.  For example, a specialized http URI prefix
> > such as "http://lsid.tdwg.org?" could be functionally equivalent to
> > the prefix "urn:lsid:" that would otherwise begin an LSID URI.
> > Software that is programmed to recognize the "urn:lsid:" prefix and
> > apply the LSID resolution mechanism could instead recognize the
> > "http://lsid.tdwg.org?" prefix and apply the LSID resolution
> > mechanism.  Was this kind of approach considered?  If so, why was it
> > deemed inadequate?
> >  
> > For more details, see my paper on "Converting New URI Schemes or URN
> > Sub-Schemes to HTTP" at http://dbooth.org/2006/urn2http/ .
> > See also the TAG's draft finding on "URNs, Namespaces and Registries"
> > at http://www.w3.org/2001/tag/doc/URNsAndRegistries-50 .
> >  
> > Thanks
> >
> > David Booth, Ph.D.
> > HP Software
> > dbooth@hp.com
> > Phone: +1 617 629 8881
> >   
> >
> >  
> >
> >
> > ________________________________
> >
> > 	From: public-semweb-lifesci-request@w3.org
> > 	[mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Donald Hobern
> > 	Sent: Tuesday, August 29, 2006 9:15 AM
> > 	To: Eric Neumann
> > 	Cc: public-semweb-lifesci hcls
> > 	Subject: Re: [BioRDF] Taxonomic Databases Working Group and LSIDs
> > 	
> > 	
> > 	Dear Eric,
> > 	
> > 	Thank you for mentioning TDWG's adoption of LSIDs.  The
> > Taxonomic Databases Working Group (http://www.tdwg.org/) is an
> > international association focused on developing collaboration between
> > biological database projects.  Its focus is primarily on
> > whole-organism data (natural history collections, herbaria, field
> > observations, identification tools, etc.) and taxonomic information
> > (the name does not adequately reflect the breadth of its interests).
> > 	
> > 	Up to now, TDWG has developed models for data exchange using XML
> > Schema and has had no reliable mechanisms for cross-referencing data
> > objects between different resources.  A 30-month project is under way
> > to revise the organisation's processes and architecture (funded by the
> > Gordon and Betty Moore Foundation).  Part of this work has been to
> > examine technological options for using globally unique identifiers
> > within TDWG data standards.  Two workshops were held earlier this year
> > to consider possible options (including LSID, DOI, ARK and PURL).  Our
> > conclusion was that LSID best suited our requirements.  The reasons
> > included:
> > 	
> >
> > 	*	LSIDs provide an existing standard approach for
> > retrieving data and metadata (this would need to be defined e.g. for
> > a PURL-based approach)
> >
> > 	*	LSIDs can be safely assigned to permanent objects and
> > potentially remain available indefinitely for dereferencing
> >
> > 	*	LSIDs can be issued and resolved by any organisation
> > without any requirement for a central LSID authority (this egalitarian
> > approach suited the community better than the model adopted e.g. by
> > DOI)
> >
> > 	*	There is no special cost associated with issuing large
> > numbers of LSIDs, even for temporary data objects (in contrast again
> > with e.g. DOI)
> >
> > 	*	LSIDs are clearly not just URLs - we perceived social
> > benefits in requiring issuers to think about what they were doing
> > (rather than just using existing URLs)
> >
> > 	*	LSIDs mesh perfectly with a recognised need in TDWG to
> > move away from modeling with XML Schema to adopt RDF-based models
> >
> > 	Our focus right now is to develop best practices for the use of
> > LSIDs for scientific names and for specimens in natural history
> > collections.  We have a number of activities under way to develop new
> > LSID software components (a .NET version of the LSID stack, native
> > handling of LSID requests in TDWG tools for data sharing).
> > 	
> > 	More information can be found at:
> > http://wiki.gbif.org/guidwiki/wikka.php
> > 	
> > 	Many thanks,
> > 	
> > 	Donald
> > 	
> > 	------------------------------------------------------------
> > 	Donald Hobern (dhobern@gbif.org)
> > 	Deputy Director for Informatics 
> > 	Global Biodiversity Information Facility Secretariat 
> > 	Universitetsparken 15, DK-2100 Copenhagen, Denmark
> > 	Tel: +45-35321483   Mobile: +45-28751483   Fax: +45-35321480
> > 	------------------------------------------------------------
> >
> > 	Eric Neumann wrote:
> > 	
> >
> > 			I would like to point out the Taxonomic Databases
> > 			Working Group (TDWG) and their work with trying to
> > 			establish a system of Global Unique Identifiers
> > 			(GUIDs).
> >
> > 			http://wiki.gbif.org/guidwiki/wikka.php?wakka=GUID2Report
> >
> > 			At this point in time they are recommending (within
> > 			their community) the use of LSIDs WITH metadata in
> > 			the form of RDF.
> >
> > 			I would like to propose that we include this on the
> > 			list of examples for the LSID/URI discussion in
> > 			BioRDF (just added to
> > 			http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Tasks/URI_Best_Practices/LSID_Pros_%26_Cons).
> > 			I think they have some great global examples of how
> > 			to use such identifiers.
> > 			
> > 			Eric
> > 			
> > 			Eric Neumann, PhD
> > 			co-chair, W3C Healthcare and Life Sciences,
> > 			and Senior Director Product Strategy
> > 			Teranode Corporation
> > 			83 South King Street, Suite 800
> > 			Seattle, WA 98104
> > 			+1 (781)856-9132
> > 			www.teranode.com
> > 			    
> >
> >
> >   
> 

Received on Wednesday, 27 September 2006 22:36:03 UTC