Re: [rdf] Re: URIs from William Bug on 2006-06-19 (public-semweb-lifesci@w3.org from June 2006)

From: William Bug <William.Bug@DrexelMed.edu>
Date: Mon, 19 Jun 2006 11:19:38 -0400
To: John Madden <john.madden@duke.edu>
Cc: Alan Ruttenberg <alanruttenberg@gmail.com>, w3c semweb hcls <public-semweb-lifesci@w3.org>
Message-Id: <D5592723-3C3D-4484-AD38-97719B9D926D@DrexelMed.edu>
I think this is an excellent reference to work from, when dealing  
with the issue of URIs in RDF generation & processing.

As I have always seen it (this is admittedly the view of an RDF
naif), DOIs and LSIDs both seek to fulfill the role one would expect
to be played by URIs in the STM literature and biomedical object
domains, respectively.

For those who had the chance to read the paper, I would specifically
point to the discussion of the CrossRef & OpenURL projects. Both
relate to how you resolve a DOI, and both are tied to very practical
Use Cases. OpenURL (http://www.exlibrisgroup.com/sfx_openurl.htm) is
very much focussed on the commercial issue of disambiguating which
journals a given library system has a subscription to. Its goal was
to create an infrastructure for publishers (and aggregators) to
resolve this issue in a way that is transparent for the user as they
click on a link to an article (HTML or PDF). Many may be familiar
with the SFX system from the search engines hosted by their library
systems.
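
To make the OpenURL idea concrete, here is a minimal sketch (not the
SFX implementation) of how a citation might be turned into a link
aimed at a library's own resolver. The resolver address, the function
name build_openurl and the citation values are all made up for
illustration; the keys follow the early key/value style of OpenURL.

```python
from urllib.parse import urlencode

# Hypothetical resolver address; in practice each library runs its own.
RESOLVER_BASE = "http://resolver.example.edu/sfx"


def build_openurl(citation, base=RESOLVER_BASE):
    """Encode citation metadata as a query against a library resolver."""
    # Drop empty fields so the query carries only what is known.
    params = {key: value for key, value in citation.items() if value}
    return base + "?" + urlencode(params)


# Placeholder citation values, not a real article.
citation = {
    "genre": "article",
    "issn": "0000-0000",
    "volume": "12",
    "spage": "34",
    "date": "2006",
}
print(build_openurl(citation))
```

The point of the design is that the citation stays the same everywhere
and only the base address changes per institution, which is what lets
the resolver decide which copy a given reader is entitled to.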

CrossRef (http://www.crossref.org/) is designed more to address the
core issue in the article: how you both maintain stable pointers to
inherently unstable online resources and provide a URI-like generic
resource pointer which can be resolved to the actual resource the
moment a reader clicks on the reference in a bibliography. CrossRef
is much more focussed on dealing with the many different scenarios
related to the latter task and coming up with a way that - again -
transparently gets the user to the correct resource. CrossRef the
organization seems to pitch itself as the service designed to
de-reference DOIs - which obviously makes the work they've done very
relevant to this conversation.
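
As a rough illustration of what "de-referencing a DOI" means at the
simplest level, the sketch below maps a DOI string onto a URL at a
resolving proxy, which then redirects to wherever the resource
currently lives. It assumes the public proxy at https://doi.org/ and
a hypothetical helper name doi_to_url; it does no network access and
is not a description of CrossRef's own machinery.

```python
from urllib.parse import quote

# Assumption: the public DOI proxy is used as the resolver.
DOI_PROXY = "https://doi.org/"


def doi_to_url(doi, proxy=DOI_PROXY):
    """Map a DOI such as 'doi:10.1000/182' to a URL at the proxy."""
    doi = doi.strip()
    # Accept the common 'doi:' label in front of the identifier.
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # A DOI is a prefix starting with '10.', a slash, then a suffix.
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError("not a DOI: %r" % doi)
    # Percent-encode anything unsafe in a URL path, keeping the slashes.
    return proxy + quote(doi, safe="/")


print(doi_to_url("doi:10.1000/182"))
```

The stable part is the DOI itself; the location behind it can change
without any bibliography having to be updated.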

Clearly, both of these issues are ones the folks from BMC & PLoS can  
give us some very practical insight into.

The one major project related to the topic in the article that the
author seemed to neglect is the Internet Archive
(http://archive.org). This is a long-standing project (in Internet
time, anyway - going back to 1996). They trawl the entire public net
and back it up as often as possible. They have massive,
robotics-based tape drive systems working round the clock. The
original archive took almost a year to crawl the entire "public" net
(it still takes about 2 months to cover everything, though they've
put a lot of effort into categorizing the rate at which content
changes - with content having a more rapid turnover getting more
frequent observation). Within weeks of the end of the 1996
presidential campaign, the only source for historians to analyze use
of the web in the election was the Internet Archive. This has
continued to be the case for many research projects focussed on the
use and evolution of web content. The IA has set up to donate
periodic dumps to the Library of Congress. Their technology has
greatly improved over the years (they now have PetaByte storage
racks and a much more mature software layer). Though IA doesn't
solve the issue of the "hidden"/dynamic web (which is the space in
which most if not all scientific literature lives) all that much
better than the other search engines, they clearly provide a great
utility for the difficult-to-manage mess the HTML web often devolves
into. IA is also intimately involved in the discussions in the
library science community on this issue of digital reference
resolution and archiving, as well as the critical issue of FIXING IP
law - very much aligned with the efforts of the Creative Commons.
Some CC folks are also directly involved with IA. IA has set up a
specific group to help researchers make better use of IA content
(http://www.archive.org/web/researcher/researcher.php).

Cheers,
Bill

On Jun 18, 2006, at 12:52 PM, John Madden wrote:

>
> Alan et al,
>
> Wow, great topic. I'll need to get my thoughts together on this.
>
> Meanwhile, operationally what a uri "means" is clearly related to  
> the question of its (non)persistence. I recently found a wonderful  
> historical review of this topic from the point of view of a library  
> scientist. The group might enjoy it:
>
> 	http://www.aallnet.org/products/pub_llj_v97n04/2005-42.pdf
>
> John
>
>
> On Jun 18, 2006, at 1234, Alan Ruttenberg wrote:
>
>>
>> [It was on this list: http://lists.w3.org/Archives/Public/public-semweb-lifesci/2006Jun/0149]
>> -Alan'
>> On Jun 18, 2006, at 12:20 PM, John Madden wrote:
>>
>>>
>>> I can't locate the beginning of this thread. Did the discussion  
>>> start on another list?
>>> Thanks.
>>> John
>>>
>>> On Jun 17, 2006, at 1708, Eric Neumann wrote:
>>>
>>>>
>>>>
>>>> This is a very useful and important discussion thread, and I  
>>>> would like to see others on the list to contribute their  
>>>> thoughts/concerns as well.
>>>>
>>>> May I ask all the contributors to include HTML links to any  
>>>> acronyms they reference (e.g., NAPTR)? This will make it easier  
>>>> for the rest of us to catch up quickly, and to eventually  
>>>> collect the approaches out there into a comprehensive list of  
>>>> viable implementations.
>>>>
>>>> thanks,
>>>> Eric
>>>>
>>>>
>>>> --- Sean Martin <sjmm@us.ibm.com> wrote:
>>>>
>>>> > MW>
>>>> > MW> I believe this SRV-redirection behaviour is part of the LSID
>>>> > MW> spec, and we use it for all of the BioMOBY LSIDs...
>>>> > MW>
>>>> >
>>>> > It also uses NAPTR's as described in IETF RFC's 3401->3405 to
>>>> > traverse the URN namespace, allowing the dereferencing process to
>>>> > bridge the gap that separates authority name strings from service
>>>> > locations. From what I recall, the URN specs specifically do not
>>>> > permit names and locations to be confounded.
>>>> >
>>>> > Kindest regards, Sean
>>>> >
>>>> > --
>>>> > Sean Martin
>>>> > IBM Corp.
>>>> >
>>>> > public-semweb-lifesci-request@w3.org wrote on 06/16/2006 12:59:12 PM:
>>>> >
>>>> > >
>>>> > > On Fri, 2006-06-16 at 10:41 -0400, Alan Ruttenberg wrote:
>>>> > >
>>>> > > > something, but as far as I can see, the only authority related to
>>>> > > > namespaces in URLs is the DNS, and while there is the SRV field which
>>>> > > > might be used to direct someone to information about the namespace, I
>>>> > > > don't know whether anyone does.
>>>> > >
>>>> > >
>>>> > > I believe this SRV-redirection behaviour is part of the LSID spec, and
>>>> > > we use it for all of the BioMOBY LSIDs...
>>>> > >
>>>> > > M
>>>> > >
>>>> > >
>>>> > >
>>>> > > --
>>>> > >
>>>> > > --
>>>> > > Mark Wilkinson
>>>> > > Asst. Professor, Dept. of Medical Genetics
>>>> > > University of British Columbia
>>>> > > PI in Bioinformatics, iCAPTURE Centre
>>>> > > St. Paul's Hospital, Rm. 166, 1081 Burrard St.
>>>> > > Vancouver, BC, V6Z 1Y6
>>>> > > tel: 604 682 2344 x62129
>>>> > > fax: 604 806 9274
>>>> > >
>>>> > > "For most of this century we have viewed communications as a conduit,
>>>> > >        a pipe between physical locations on the planet.
>>>> > > What's happened now is that the conduit has become so big and interesting
>>>> > >       that communication has become more than a conduit,
>>>> > >        it has become a destination in its own right..."
>>>> > >
>>>> > >                 Paul Saffo - Director, Institute for the Future
>>>> > >
>>>> > >
>>>> >
>>>>
>>>> Eric Neumann, PhD
>>>> co-chair, W3C Healthcare and Life Sciences,
>>>> and Senior Director Product Strategy
>>>> Teranode Corporation
>>>> 83 South King Street, Suite 800
>>>> Seattle, WA 98104
>>>> +1 (781)856-9132
>>>> www.teranode.com
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>

Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu







Received on Monday, 19 June 2006 15:19:50 UTC