Re: BioRDF [Telcon]: slides for the UMLS presentation from William Bug on 2006-06-07 (public-semweb-lifesci@w3.org from June 2006)

From: William Bug <William.Bug@DrexelMed.edu>
Date: Wed, 7 Jun 2006 15:34:27 -0400
To: kc28 <kei.cheung@yale.edu>
Cc: public-semweb-lifesci@w3.org
Message-Id: <ED610163-4BE5-4C34-8AF0-1369F0DF315B@DrexelMed.edu>
I do run on, sometimes, don't I, Kei?

I emphatically agree with the general tenor of your suggestion.

I would word it a bit differently.

I wouldn't call this outreach so much as going to the "customer" and  
asking them to help us - the technology experts - to define their  
user requirements.  I would word it this way to the technologists, at  
least.  The Neuroscientists should be pitched using "civilian"  
colloquialisms, but the point is I believe the onus is on those  
developing and applying the technology to stay in sync with the needs  
of the neuroscientists.

I realize many of us on this list are in fact trained biomedical and/or
computer science researchers.  I myself was originally trained as
a molecular biophysicist studying neuromodulation of presynaptic,
voltage-dependent, Ca++-channels using single-channel and whole-cell
electrophysiological techniques.  That places us at the extremely
valuable nexus where we possess specific insight into the information
needs of the broader community of neuroscientists we hope will benefit
from the technological resources we develop, while also possessing
the technological insight required to determine what is practical.

My sense is it's important to develop credibility on both sides of  
this equation - the technology developers need to clearly demonstrate  
they're sensitive to the needs of "bleeding edge" researchers.  They are
developing tools to revolutionize a scientist's ability to perform  
their research tasks effectively and efficiently - transform them  
from 19th century cottage scientists where all knowledge mining must  
be done laboriously and with very limited scope by their lonely brain  
into 21st century informaticists where large scale, data/knowledge  
mining against the evolving "World Brain" (H.G. Wells's term -
http://sherlock.berkeley.edu/wells/world_brain.html) is a routine practice.

The scientists also need to demonstrate they recognize the value  
provided by the technologists.  This will again derive from clear  
demonstrations of the value the technological solutions can provide  
to the researcher.  This latter issue is often a hard one to get
across, but it's the lack of such recognition/trust that can lead the
technologists to go at it on their own out of frustration (Kei, Don,  
and others who attended the Human Brain Project meeting in April can  
attest to the fact that I am just as subject to this frustration as  
any other bioinformatics developer - :-)  ).

Along these lines, I'd suggest:

1) Presentations by neuroscientists who have done seminal work in  
neuroinformatics:
	I think Kei's suggestion is an excellent one.  However, I'd suggest a
F2F meeting, where these folks are invited as speakers.  It will be  
hard to get the full effect of what they have to say on a phone or  
video conference.  They are likely to take a talk at a meeting more  
seriously and a greater level of commitment is likely to derive from it.
	I would suggest there be a session of neuroinformatics presentations  
by neuroscientists, and also a session of semantic web technology  
presentations by participants of this group.  The focus should be on  
neuroinformatics projects using semantic web technology with one  
intro talk on semantic web technology applied to biomedical  
informatics to provide a context for those neuroscientists who've not  
yet got the take home message.
	My suggestion for neuroscientists would be - in no particular order  
of importance:
		1) Gordon Shepherd (SenseLab) - integration of various modalities  
of neuro-data with a focus on the olfactory system
		2) Doug Bowden (NeuroNames) - unified, mammalian neuroanatomical  
lexicon
		3) Maryann Martone (CCDB, SMART Atlas, & BIRN) /Mark Ellisman  
(BIRN)/ Jeff Grethe (BIRN infrastructure) - broad-field, neuroimaging- 
centric neuroinformatics infrastructure
		4) Rolf Kötter (CoCoMac) - literature informatics ("bibliomics")
system with a focus on neuro-connectivity
		5) Rob Williams (GeneNetwork/WebQTL/Mouse Brain Library) - genetic  
variability and brain phenotypes from molecules through anatomy and  
behavior
		6) Peter Hunter (CellML and parametric spatial modeling of the brain)
		7) Dan Gardner (BrainML) - XML schema for neuroscience data

There are other folks, but I believe this core of people cuts across a
variety of neuroscientific sub-domains and levels of technical  
complexity.  I'd also recommend someone from the field of 3D digital  
brain atlasing (atlas data set/computer vision algorithm/atlas tool  
development), but as I'm in this field myself, I don't feel it's  
appropriate for me to suggest which of the several researchers would  
be the most appropriate.  I would only say it's important to  
recognize the distinction between spatially-based, neuroscience data  
sets (GENSAT, Allen Brain Atlas, Desmond Smith's "voxelized"  
microarray data sets) and the use of brain atlases to provide a  
canonical coordinate space and algorithmic tool set via which one can  
perform large-scale integration & atlas mapping of spatially-based,  
neuroscience data sets.  This task - integration of spatially-mapped  
neuroscience data sets - is obviously one for which semantic web  
technologies will be a critical catalytic factor.

2) The BioRDF Wiki page:
	I'd suggest this focus on semantic web applications in
neuroscience.  There is already a link to a list of projects (e.g.,
SWAN, Semantic Synapse, NeuroCommons).  Rather than place substantive
info on these 3 projects 3 clicks away, I'd suggest you list them
right there on the main BioRDF Wiki page along with a 1 - 2 sentence
summary of each project.  This will guarantee the widest possible
recognition/visibility for these efforts.
	I'd also suggest that, in listing "other" neuroscience resources
on the web, rather than creating an ad hoc collection of a few
projects (which can affect general credibility - e.g., "Where are all
those neuroscience resources I think are important - why just BrainML
& GENSAT?"), you point to the several consortia and/or
registries/"yellow pages" already compiled - e.g., the Society for
Neuroscience's Neuroscience Database Gateway
(http://big.sfn.org/NDG/site/), David Kennedy's Internet Analysis
Tools Registry (mainly neuroscience tools, though this scope is
expanding - http://www.cma.mgh.harvard.edu/iatr/display.php?spec=all),
fMRI Tools (http://www.fmritools.org/), The Neuroinformatics Portal
Pilot (http://www.neuroinf.de/), etc.

3) Licensing:
To say one final thing about licensing, I completely agree with Don
that it is a hideous, unworkable mess.  Go back to the single
statement in Article I, Section 8 of the U.S. Constitution, and you
clearly get the sense of what was originally intended by establishing
copyright and patent law as legal entities
(http://www.archives.gov/national-archives-experience/charters/constitution_transcript.html):

"The Congress shall have Power...To promote the Progress of Science  
and useful Arts, by securing for limited Times to Authors and  
Inventors the exclusive Right to their respective Writings and  
Discoveries;"

It was recognized even 200 years ago the creative commons is of great  
value to society.  For this value to be realized, these resources  
must be a part of the commons and available to all - including
latter-day inventors, artists, and scientists seeking to build on what
came before.  This need, however, must be balanced against the desire
of the artist, scientist, or inventor to make a productive living from
the fruits of their labor (otherwise, the creation stops).

I'd guess most folks on this list would certainly agree with the need  
to establish this right.  Where the founders went wrong was in the  
statement "The Congress shall have Power To...", as this left the  
door wide open for Congress to redefine what copyright was all  
about.  As most of you probably know, the balance began to shift from  
the "...Authors and Inventors (and scientists)..." to publishers  
(those solely in business to make $$$ off the efforts of the creative  
persons) starting in the late 19th Century with the proliferation of  
pirated sheet music.  This trend worsened through the last century,  
but really took a significant, qualitative leap away from the
original intentions as outlined in Article I, Section 8 above with the
DMCA.  Given how significant a driver IP is for the engines of the
economy (and greed), I'm still uncertain how we can overturn this
trend and get back to the original principles.  The work sponsored by
Creative Commons - and specifically Science Commons - will
certainly help to get us there**.  The trend persists despite the
extremely clear detriment it does to society as a
whole*** and to the communication amongst scientists in particular.

Though still problematic, I actually endorse the use of licensing by  
the NeuroNames folks (as you might have been able to gather already),  
as I see their application going right back to that original  
statement in the U.S. Constitution.  It's one thing to bulk download  
sequence records and "cleanse" their semantic content in order to  
promote powerful knowledge mining efforts.  When it comes to highly  
curated, knowledge resources, the onus is on the user to be careful  
both to clearly understand the original intentions and limitations of  
the resource, as well as to work to protect the integrity of the  
resource.  It does none of us any good to create a "better" or more  
"open" NeuroNames, if that just becomes another version of  
NeuroNames.  If we are not ALL using the same NeuroNames (or at least  
using compatible and consistent versions), then we defeat the purpose  
of using NeuroNames for large-scale data integration and semantic  
mining.

What is needed is for there to be an established authority to  
arbitrate when issues of curation and usage of a knowledge resource
come into conflict.  Here again, I'd suggest going to NCBO for help.   
Not that they have an infinite supply of resources and can solve all  
the problems, but at least they understand this complex issue from  
both sides - that of the curation authority and of the biomedical  
informatics scientist trying to make productive use of the resource -  
and have some resources and authority to grease the wheels of science  
in this domain.

Again - just my $0.02.  I hope this helps to clarify what I've been  
trying to communicate in this thread.

Cheers,
Bill

** I expect it's a bit superfluous to mention here, but I'd suggest
checking out the SC info resources, if you've not already, at
http://sciencecommons.org/resources.

*** See the article by Richard Nelson posted by John
Wilbanks on the Science Commons weblog a few months back
[http://sciencecommons.org/weblog/archive/2006/02/15/richard-nelson-on-the-scientific-commons]
for an excellent treatment of how this directly
impedes the pursuit and accumulation of scientific knowledge.

On Jun 6, 2006, at 7:42 PM, kc28 wrote:

> Hi Bill,
>
> You really can write faster than I can read :-).  Actually, we have
> discussed in a previous telcon how to reach out to the
> neuroscience community. I think this represents a good opportunity
> to try to get people like Doug Bowden involved, as we are
> interested in converting NeuroNames into RDF/OWL. I wonder if it's
> possible to invite neuroscientists like Doug Bowden and Gordon
> Shepherd (and possibly more) to talk about their work in our future
> BioRDF/Ontology telcon. This will foster more interaction between
> the semantic web community and neuroscience community. I wonder how
> this sounds to other semantic web folks.
>
> Cheers,
>
> -Kei
>
> William Bug wrote:
>
>>
>> Dear Matthias,
>>
>> I would strongly recommend you contact Doug Bowden and colleagues
>> at NeuroNames before you undertake this task - or at least take a
>> look at the NeuroNames specifics I list in my previous email.
>> I'd be glad to answer any questions you may have about statements
>> I made.  Doug and his collaborators are extremely collegial and
>> make a very sincere effort to work with those interested in
>> making effective - or novel - use of NN.
>>
>> The other person you should contact is Daniel Rubin at NCBO, who,
>> for all I know, is lurking on this thread.  Others in the thread
>> appeared to be addressing Daniel.  This is a topic actively
>> under investigation both by NCBO and by the BIRN.
>>
>> As I mentioned in my post to this thread, Doug & colleagues have
>> been working for the last year with Jack Park of SRI to express
>> NN in XTM format.  A lot of effort needs to go into vetting this
>> "remapping" to make certain none of the assertions in the
>> hierarchy - explicit or implicit - are invalidated - as well as
>> ensuring no new assertions are unwittingly introduced.  You may
>> want to work from this version of NN to create an RDF/OWL
>> version.  As I mentioned in the previous post, there has been
>> some substantive effort to examine the differences and
>> similarities between XTM & RDF - and there may even be
>> translators or XSL instances that can get you most of the way.
>>
>> Doug also distributes the entirety of NN on CD with all of the
>> latest work they've done in the past year to incorporate rat &
>> mouse neuroanatomical terminologies - an added dimension
>> absolutely critical to those of us interested in collating
>> microarray, in situ & IHC expression studies in mouse brain with
>> neuroimaging data sets and 3D digital brain atlases.
>>
>> There is definitely a need for an open source, RDF/OWL version of
>> NeuroNames (and the neuroanatomical portion of RadLex for that
>> matter - http://www.rsna.org/RadLex/ - if you are interested in
>> human, radiological imaging of the brain).
>>
>> I believe we must do our best to work with the curators/developers
>> on these various knowledge resource projects, given the
>> biological complexity embedded in these resources.
>>
>> As far as the licensing goes, Doug realizes this is a thorny
>> issue.  The initial license was merely put in place to avoid
>> others downloading this highly curated knowledge resource,
>> modifying it, then repackaging it as "NeuroNames."  As I
>> mentioned, this was not a paranoid fear.  The license was imposed
>> in response to someone actually having done this with NN.
>> Knowledge resources like this - even when they are just
>> terminologies - require careful curation, and uncontrolled
>> dissemination and modification can ultimately degrade the
>> usefulness of the resource.
>>
>> Of course, closed, proprietary licensing can also degrade its
>> usefulness, so there is a delicate balance that must be struck.
>>
>> This is an issue I believe NCBO can help us all to resolve.  They
>> won't have all the answers, but may be able to sponsor a means to
>> derive an effective solution to this problem.
>>
>> My recommendation is a statement be sent by the W3CSW HCLSIG -
>> maybe the BioRDF & BIOONT groups collectively - informing Doug of
>> the need as they see it.  He will not be surprised by the nature
>> of your request, but will be very surprised and pleased to see
>> this need emerging from the semantic web community.  I don't
>> believe he reads this list.  I know he will be happy to work with
>> participants on the W3CSW HCLSIG to get us what we have all
>> identified as essential - an open source, unified neuroanatomical
>> terminological (and in association with FMA - as Neuro-FMA -
>> ontological) resource all formal annotation efforts can make
>> shared and productive use of.
>>
>> Just my $0.02 on the topic.
>>
>> Cheers,
>> Bill
>>
>> On Jun 6, 2006, at 3:38 PM, Matthias Samwald wrote:
>>
>>>
>>> Hi Kei,
>>>
>>> I am under the impression that the neuronames ontology available
>>> on their website (as an Excel file...) is different from the
>>> version that is licensed as part of the UMLS. I guess the
>>> version that is online is a newer version of the one
>>> incorporated in UMLS. However, this might be seen as a
>>> derivative work, so it might still be restricted. In that case,
>>> it would seem like people of the neuronames group are violating
>>> the licence restrictions themselves (by making it available on
>>> the internet). I will write them and ask about that.
>>>
>>> kind regards,
>>> Matthias
>>>
>>>
>>>>
>>>>  Hi Matthias,
>>>>
>>>>
>>>>  Thanks for doing that, but do we still have the licensing issue as
>>>>  stated by Olivier?
>>>>
>>>>  Cheers,
>>>>
>>>>
>>>>  -Kei
>>>>
>>>>
>>>>  Matthias Samwald wrote:
>>>>
>>>>
>>>>>  I will convert the neuronames - ontology to SKOS (an OWL ontology
>>>>>  used for the representation of taxonomies / thesauri). It will
>>>>>  be added to the extension of the bio-zen ontologies framework
>>>>>  [1]. I will keep you updated.
>>>>>
>>>>>
>>>>>  kind regards,
>>>>>  Matthias Samwald
>>>>>
>>>>>
>>>>>  [1] http://neuroscientific.net/index.php?id=download
>>>>>
>>>>>
>>>>>  On Mon, 05 Jun 2006 21:17:55 -0400, kc28 wrote:
>>>>>
>>>>>
>>>>>>  For more up-to-date information about neuronames and related
>>>>>>  tools, please visit: http://braininfo.rprc.washington.edu/.
>>>>>>  While building our own open neural anatomy is one option,
>>>>>>  getting the neuroscientists (e.g., braininfo people) involved if
>>>>>>  possible may be another option (outreach to the neuroscience
>>>>>>  community?).
>>>>>
>>>
>>
>> Bill Bug
>> Senior Analyst/Ontological Engineer
>>
>> Laboratory for Bioimaging  & Anatomical Informatics
>> www.neuroterrain.org
>> Department of Neurobiology & Anatomy
>> Drexel University College of Medicine
>> 2900 Queen Lane
>> Philadelphia, PA    19129
>> 215 991 8430 (ph)
>> 610 457 0443 (mobile)
>> 215 843 9367 (fax)
>>
>>
>> Please Note: I now have a new email - William.Bug@DrexelMed.edu
>>
>> This email and any accompanying attachments are confidential. This
>> information is intended solely for the use of the individual to  
>> whom it is addressed. Any review, disclosure, copying,  
>> distribution, or use of this email communication by others is  
>> strictly prohibited. If you are not the intended recipient please  
>> notify us immediately by returning this message to the sender and  
>> delete all copies. Thank you for your cooperation.
>>
>

Received on Wednesday, 7 June 2006 19:34:49 UTC