- From: Michel_Dumontier <Michel_Dumontier@carleton.ca>
- Date: Wed, 11 Jul 2007 10:22:10 -0400
- To: public-semweb-lifesci <public-semweb-lifesci@w3.org>
- Cc: Mark Wilkinson <markw@illuminae.com>, Benjamin Good <goodb@interchange.ubc.ca>, Natalia Villanueva Rosales <naty.vr@gmail.com>
> On Jul 10, 2007, at 1:13 PM, Michel_Dumontier wrote:
>
> > The use of a location free identifier such as an LSID provides me with
> > the capability to make statements about resources that I care about.
> > LSIDs and URLs can live together just fine. Using owl:sameAs predicate
> > to bind them together is one easy way of doing this. Just make sure
> > you're talking about the same thing.
>
> What it doesn't do, is provide the courtesy that has been requested
> by other semantic web practitioners, that, based on the identifier,
> one can discover something about the resource by "following your nose".

[Michel_Dumontier]
Sure it does... I can make statements using unambiguous LSIDs, and if
this resource is equivalent to one identified by a URL, I can instruct
my semantic web application to follow that URL. However, the suggestion
that my semantic web application should read HTML and follow its nose
to find a REL LINK to an RDF document (potentially not the right one)
is an interesting but more complex resolution mechanism, one that
requires more sophisticated knowledge about possible content
presentation.

> The cost of using an http identifier, and providing a 303 and a
> pointer to more information, instead of using an LSID, seems a small
> cost to satisfy this community.
>

[Michel_Dumontier]
Ok - here's a use case to consider: I would like to transform third
party data (unstructured / text file) into RDF/OWL, because the
providers have no intention of making it available in that format at
this time. Which URI should I assign to the resources?

Here are the things I need to consider:

1) Many people may want to make statements about those resources and
   need stable, unchanging identifiers to do so.

   a) Imagine the problem of mapping multiple identifiers if everybody
      assigned their own URI! I'm not interested in recreating the
      identifier problem that has forever plagued bioinformatics.

   b) What if I have no intention of providing the content at a URL,
      but rather as a downloadable document?

   c) As an interesting aside, by what mechanism should statements made
      about the resource, but published at different locations, be
      retrieved? (I'm very interested in learning about this!) One
      option, maybe, is for data providers to register with a directory
      by providing the URL of the resource they resolve.

2) Chimezie and Jonathan suggest that we might use emerging (not yet
   W3C recommended) technologies to embed/extract/transform structured
   data. This might be plausible, but it requires fairly sophisticated
   approaches to content management and application design, and it
   requires standardization across data providers. Otherwise, the cost
   of figuring out who does what, and how, will be high and possibly
   overwhelming. Don't get me wrong, I like the idea of embedding more
   explicit semantics in HTML documents, but is this really the
   behavior we want for resources defined in non-HTML documents?

> While you are correct about LSIDs and URLs being able to be bound
> together using sameAs, I don't see why one would, in new designs,
> choose to employ both.
>

[Michel_Dumontier]
Hopefully you sympathize with my need to have unambiguous identifiers -
the recent change of UniProt from LSID to URL clearly demonstrates the
consequences of arbitrarily changing the identifier of a named
resource. If anybody made statements using those LSIDs, the identifiers
they used are no longer defined in the RDF documents provided by
UniProt. Resource identification and resource presentation are two
really different things.

-=Michel=-
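
As an illustration of the owl:sameAs binding discussed in the message, here is a minimal sketch in plain Python. It is not taken from the thread: the LSID, the example.org URL, and the helper names same_as_triple and resolvable_equivalents are all hypothetical. It only shows how an application might hold statements against a location-free identifier and still find a URL to dereference.

```python
# Minimal sketch: bind an LSID to a URL with owl:sameAs, then look up
# which dereferenceable URLs are declared equivalent to an identifier.
# All identifiers and helper names below are hypothetical examples.

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def same_as_triple(subject, obj):
    """Return one N-Triples line stating: subject owl:sameAs obj."""
    return f"<{subject}> <{OWL_SAME_AS}> <{obj}> ."


def resolvable_equivalents(identifier, triples):
    """Return http(s) URLs declared owl:sameAs the identifier.

    owl:sameAs is symmetric, so the identifier is matched in either
    the subject or the object position. Only direct statements are
    considered; no transitive closure is computed.
    """
    urls = []
    for s, p, o in triples:
        if p != OWL_SAME_AS:
            continue
        if s == identifier:
            other = o
        elif o == identifier:
            other = s
        else:
            continue
        if other.startswith(("http://", "https://")) and other not in urls:
            urls.append(other)
    return urls


# Hypothetical identifiers for the same resource.
lsid = "urn:lsid:example.org:protein:P12345"
url = "http://example.org/protein/P12345"

# A tiny in-memory store of (subject, predicate, object) statements.
triples = [(lsid, OWL_SAME_AS, url)]

print(same_as_triple(lsid, url))
print(resolvable_equivalents(lsid, triples))
```

Under these assumptions, statements keep using the LSID as their subject, and the URL is only consulted when the application wants to retrieve a representation. If the provider later changes the URL, only the sameAs statement needs updating.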
Received on Wednesday, 11 July 2007 14:22:33 UTC