- From: Jack Park <jack.park@sri.com>
- Date: Mon, 31 Jul 2006 10:52:04 -0700
- To: "'public-semweb-lifesci'" <public-semweb-lifesci@w3.org>
Let me toss out a few ideas (possibly longish - sorry). These thoughts might appear somewhat like the famous Larson cartoon where Joe is speaking to his dog Bowser:

What Joe said: "Bowser, I'm going to toss a bone, then you go fetch the bone"
What Bowser heard: "Bowser!!!!bone!!!!!bone"

Mostly, this is what I heard, followed by some interpretations, comments and questions.

What I heard:

- Important issues related to LSID
  - Managing distributed objects (not just browsing)
  - Decoupling identity from location (whatever "location" means)
  - Versioning is important
    - Not universally used
    - Approaches that appear to be used:
      - version numbers on same LSID string
      - change LSID string
- Understand aspects of ARK that could apply to or augment LSID
  - coupling to HTTP
  - Comments about ARK:
    - allows location independent identification
    - web-accessible
    - standard
    - alternative to LSID
- Dereferencing (huge interest here)
  - DDDS and other mechanisms (which I interpret to mean [1])
- Data/Metadata
  - room for confusion here
  - one comment: LSID doesn't link to metadata
  - one comment: ARK puts metadata in URI
- Persistence
- Interesting comment (Sean?)
  - Datastore organized in name graphs
  - version numbers used to determine which name graph to view

I'll stop there from the scribe part: that's tough for me trying to keep accurate notes in a candy store; too many great ideas flying past and my brain is always busy trying to disambiguate, embrace and extend everything flying by.

I must ask this question regarding the nature of an LSID. Is it not an identifier? I've heard people use it as a "name". If our esteemed colleague, whom we identify with his email address, loses a finger, do we change his identity? Do we change our names when we change any attribute useful in identification of ourselves? Why would we ever change an LSID just because some aspect of the identity of the particular subject ever changed?
Certainly we would change whatever metadata is involved in the records of our identity, but we don't change our identity, simply because the subject itself did not change. Version numbers make a lot of sense. So does a "name graph" that relates an identifier to the properties that constitute the object, with version numbers that tie identifiers to the most-recent properties (attributes, sorry). And, in the spirit of W3C documents, where the catalog points to the most-recent version but still gives links to prior versions, should that not be the standard way to deal with LSIDs, no matter how they come to be constituted?

I believe it was Carole, in one of her many informative opportunities to talk, who said words to this effect (one of her two use cases): "we use LSIDs in our database as we gather data from the web *if LSIDs are available*". Those are not her precise words, and I apologize if they are wrong to boot, and the emphasis is mine alone. I am very interested in that aspect: other researchers may or may not be assigning LSIDs to their data, but we must gain access to that data in any case. How do we do that?

Another aspect of versioning that came up was that of authority. If we step outside the context of this thread (outside the box, so to speak) and look around, this question comes up often just about everywhere. Software development comes to mind. The Apache foundation has this notion of "committers". A committer is one who has been given the authority to make changes in the version controlled source code for a project. The Apache foundation runs on a meritocracy, where people are elected to committer status after showing appropriate skills, etc. Maybe that doesn't apply to life sciences research, but what could be imported from such ideas into LSID?

What I got: I came away from the discussion with the larger picture that there is a need for means by which all forms of identifiers can be generated and used by all.
As Carole said, they are used if they are available. Whether they start with something that makes sense in an HTTP environment may or may not be as important. As Dan suggested, if you can "rot13" (rotate the characters that make up the string) and the identifier is still useful, then it shouldn't matter how the string is constructed; it does matter that the string is, at once, available and findable. After all, an LSID, an ARK, a PSI (to the topic mappers) is a shortcut, a one-string-fits-all (for those who know it) identifier of some object, concept, subject, whatever you want to call it. LSID happens to be the solution adopted by the life sciences community. However...

There is this subject that crept up in the previous couple of decades known as "psychoneuroimmunology." That's what you get when the psychologists start collaborating with neurologists who are also collaborating with immunologists. I feel comfortable in predicting that such collaborations will move in directions that include non-lifescience workers, who are not familiar with LSIDs and who will need to use them. That argues for separation of properties which identify objects from the shortcuts we invent to identify those objects. The properties (attributes, sorry) still prevail. If I lose a finger in an accident, I'm still me.

If we happened to agree on collecting object identity properties and associating those with identifier shortcuts in a way that renders them searchable, say, using RDF triples on the web, then we would be a step closer to allowing even Google to help us identify our objects. This, I believe, is important to the larger picture of federating research efforts among heterogeneous work groups everywhere. Doing so means that we can then include identifiers on the web in numerous ways, making them ubiquitous, no matter how they are constructed.
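
To make the idea of searchable identity properties a little more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the two identifier strings, the `ex:` predicate names, and the helper functions are all invented for this example and do not come from the LSID or ARK specifications. The "triples" are plain tuples, not a real RDF store.

```python
import codecs

# Hypothetical identifier strings, invented for illustration only.
LSID = "urn:lsid:example.org:proteins:P12345"
ARK = "ark:/99999/fk4example"

# (subject, predicate, object) tuples standing in for RDF triples.
# The "ex:" predicate names are assumptions, not a published vocabulary.
triples = [
    (LSID, "ex:label", "example protein"),
    (LSID, "ex:organism", "Homo sapiens"),
    (ARK, "ex:label", "example protein"),
]


def find_identifiers(store, predicate, value):
    """Return, sorted, every identifier that carries the given property value."""
    return sorted({s for (s, p, o) in store if p == predicate and o == value})


def properties_of(store, identifier):
    """Return the property/value pairs recorded for one identifier."""
    return {p: o for (s, p, o) in store if s == identifier}


def rot13_store(store):
    """Scramble every subject identifier with rot13, leaving properties alone."""
    return [(codecs.encode(s, "rot_13"), p, o) for (s, p, o) in store]


# Searching by property finds both shortcuts, however they are constructed.
print(find_identifiers(triples, "ex:label", "example protein"))

# After rot13, the identifier string is opaque but still findable by property.
scrambled = rot13_store(triples)
print(find_identifiers(scrambled, "ex:organism", "Homo sapiens"))
```

The point of the sketch is that the lookup goes through the properties, so the shape of the identifier string (LSID, ARK, or a rot13-scrambled version of either) does not affect whether the object can be found.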
Note: by saying that, I am not advocating ad hoc fabrications; I strongly believe that projects like LSID, ARK, even PSIs, are important and warrant the efforts of standardization.

If one were to imagine a public information commons, one supported by several kinds of entities, including NLM, NSF, and even aspects of the philanthropic universe, then one could imagine federating all the working ontologies we use together with the identifiers associated with the objects represented in those ontologies. I tend to think that a federation of subject maps would suit global collaboration and encourage greater and more standardized use of identifiers such as LSID. In my view, a subject map, among other things, is a "name graph."

That's a bit more than a half EURO for the day.

Cheers,
Jack

[1] http://www.ietf.org/rfc/rfc3401.txt
Received on Monday, 31 July 2006 17:52:18 UTC