Re: NeuroNames [was: slides for the UMLS presentation]

Hi All,

Sorry - I'd thought I'd already subscribed to this list, but  
apparently not - until now.

The need for a mereotopologically-sound, neuroanatomical ontology is  
quite pressing across the community of neuroscientists involved in  
neuroinformatics projects, most of which include a neuroimaging  
component.  Generally there is only one thing neuroscientists are  
interested in when analyzing images at whatever resolution from the  
macromolecular (EM) on up to the macroscopic - i.e., identifying  
biologically relevant shapes.  In order for these shapes to have any  
meaning in a context where one attempts to pool data and perform  
relevant data reduction operations, the shapes must exist within a  
shared coordinate space of some sort.  For instance, if two separate  
labs are examining the change in the size of the Substantia Nigra  
during the course of Parkinsonian neurodegeneration, in order for  
them to compare their observations, they require several data  
integration/semantic frameworks:
	- a shared neuroanatomical terminology
	- a shared coordinate space (to place the shapes from their images  
in a comparable coordinate framework)
	- a shared, well-founded anatomical ontology which encapsulates  
mereotopological knowledge about shapes in - at least - 3D space.
Other knowledge resources can be helpful in supplementing this array  
of tools, but, generally, these are the absolute minimum.
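
To make the first two requirements a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the label table, the per-lab scale/offset transforms, and the centroid coordinates are invented for illustration and are not taken from NeuroNames, BIRN, or any real atlas. It only shows why two labs' observations become comparable once both the label and the coordinates are mapped into something shared; it does not touch the third (mereotopological) requirement.

```python
# Hypothetical sketch: two labs report the same structure under different
# labels and in different native coordinate frames.

# 1) Shared terminology: map lab-specific labels to one preferred term.
#    (Invented table; a real one would come from a curated terminology.)
SYNONYMS = {
    "substantia nigra": "substantia nigra",
    "sn": "substantia nigra",
    "subst. nigra": "substantia nigra",
}

def canonical_term(label):
    """Return the shared preferred term for a lab-specific label."""
    return SYNONYMS[label.strip().lower()]

# 2) Shared coordinate space: each lab supplies a per-axis scale and offset
#    carrying its native coordinates into the shared frame. This is a very
#    restricted affine map; real registration is far more involved.
def to_shared_space(point, scale, offset):
    return tuple(s * p + o for p, s, o in zip(point, scale, offset))

lab_a = {"label": "SN", "centroid": (10.0, 20.0, 30.0),
         "scale": (1.0, 1.0, 1.0), "offset": (0.0, 0.0, 0.0)}
lab_b = {"label": "Subst. nigra", "centroid": (5.0, 10.0, 15.0),
         "scale": (2.0, 2.0, 2.0), "offset": (0.5, 0.0, -0.5)}

def shared_record(obs):
    """Express one observation as (preferred term, shared-space centroid)."""
    return (canonical_term(obs["label"]),
            to_shared_space(obs["centroid"], obs["scale"], obs["offset"]))

def distance(p, q):
    """Euclidean distance between two points in the shared frame."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

rec_a = shared_record(lab_a)
rec_b = shared_record(lab_b)
print(rec_a[0] == rec_b[0], distance(rec_a[1], rec_b[1]))
```

Only after both mappings are applied does it make sense to ask how far apart the two centroids are, or to compare volumes measured around them.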

[NOTE: the Wikipedia has a moderately clear definition of  
mereotopology (http://en.wikipedia.org/wiki/Mereotopology).   
Basically, it combines a formal, ontological theory of parts and  
wholes (mereology) with the mathematics of topology (boundaries and  
connection), with the goal of providing a computational formalism to  
support applying logical operations to objects in space.  As has been  
pointed out by others, a  
great deal of the work in this field of applied biomedical  
mereotopology derives from related work in the GIS field.  Use of  
mereotopology by geographers has been going on for quite some time  
and is much more advanced.  Work from GIS can be adapted for use in  
the biomedical domain, but it must be done with great care, as many  
of the assumptions behind the way researchers represent space and the  
kind of information being represented can differ significantly  
across these disciplines.]
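
For those who prefer code to definitions, the kind of logical operation I have in mind can be sketched in a few lines of Python. This is a toy under loudly stated assumptions: the part-of table and the "touches" set are hand-written for illustration (they are not extracted from NN or the FMA), the hierarchy is a strict single-parent tree, and "connection" is reduced to a lookup rather than computed from geometry.

```python
# Hypothetical toy hierarchy: child -> immediate whole (strict tree).
PART_OF = {
    "substantia nigra pars compacta": "substantia nigra",
    "substantia nigra pars reticulata": "substantia nigra",
    "substantia nigra": "midbrain",
    "midbrain": "brainstem",
    "brainstem": "brain",
}

# Hypothetical topological facts: pairs of regions that share a boundary.
TOUCHES = {
    frozenset({"substantia nigra pars compacta",
               "substantia nigra pars reticulata"}),
}

def ancestors(term):
    """All wholes that `term` is a transitive part of, nearest first."""
    chain = []
    while term in PART_OF:
        term = PART_OF[term]
        chain.append(term)
    return chain

def is_part_of(part, whole):
    """Reflexive, transitive parthood (the mereological half)."""
    return part == whole or whole in ancestors(part)

def overlaps(x, y):
    """Regions overlap if they share a part. In a strict tree this reduces
    to one region being part of the other."""
    return is_part_of(x, y) or is_part_of(y, x)

def connected(x, y):
    """Connection (the topological half): overlap, or a shared boundary."""
    return overlaps(x, y) or frozenset({x, y}) in TOUCHES

snc = "substantia nigra pars compacta"
snr = "substantia nigra pars reticulata"
print(is_part_of(snc, "brain"), overlaps(snc, snr), connected(snc, snr))
```

The point of the toy is only that an annotation made at the level of "pars compacta" can be rolled up to "midbrain" by inference, and that "adjacent but not overlapping" is a distinct, queryable relation. A terminology without well-founded part-of and connection relations cannot support either step.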

The same is true as you scale this problem up to field-wide projects  
such as BIRN or The NeuroCommons.

As several have mentioned in this thread, there are already existing  
resources that can begin to fill this need.

1) NeuroNames
Kei, Olivier, Peter Mork, and others have already given sufficient  
references on NeuroNames in this thread, so that others can dig in  
deeper to the specifics if they like.

Having worked with Doug Bowden, Mark Dubach, and their colleagues  
over the last year or so in an advisory capacity on the specific  
issue of use of NeuroNames for semantically-based, neuroanatomical  
data set integration, I can add a few important qualifying points:
	a) Doug et al. have been working on the extremely difficult task of  
unifying neuroanatomical terminologies across mammalian species for  
20 years now.  Embedded in NeuroNames & BrainInfo, there is a wealth  
of hard-won empirical knowledge related to how one achieves this  
end.  I think it would be ill-advised to try to duplicate their  
effort, as the myriad scientific problems related to this effort  
would surely present themselves again and need only be worked out  
once.
	b) Doug et al. are extremely collegial and quite receptive to  
feedback and collaboration - within the bounds of their limited  
resources.
	c) NeuroNames is a terminological resource - not a well-founded,  
spatial ontology of brain anatomy capable of supporting  
mereotopological reasoning.  As with most research-based  
terminologies, there are many semantically-based relations embedded  
in the NeuroNames graphs, but as the primary goal of NN is to  
disambiguate and integrate across the neuroanatomical lexicon, the  
embedded semantic information can often lead to a logical dead end.   
For instance, many neuroanatomical terms critical to specifying  
location in the rodent brain have been placed in the NN category  
"ancillary terms," as they don't fit into the core hierarchy in an  
unambiguous way.  This can make use of NN for annotating mouse brain  
gene & protein expression patterns (e.g., GENSAT, the Allen Brain  
Atlas, various BIRN projects) extremely problematic.
	d) The NN primary structures (http://braininfo.rprc.washington.edu/ 
indexabout.html) provide the closest thing to an ontology in NN.  As  
Peter Mork pointed out, there has been an effort in the past to unite  
this core NN hierarchy with the FMA, which does provide a  
mereotopologically sound framework for anatomy.  Barry Smith (formal  
ontologist who has worked for over a decade on problems in biomedical  
ontology - most especially, though hardly exclusively, in the area of  
mereotopological reasoning) and his colleagues have worked closely  
with Cornelius Rosse and his colleagues at the FMA project to  
create, in association with the work started in the FMA, a foundational  
ontology for biomedicine (the Ontology of Biomedical Reality) that is  
becoming increasingly important to all of the ontologies being  
monitored by NCBO and incorporated into the OBO site and the emerging  
OBO Foundry (http://obofoundry.org/).
	e) Doug and his colleagues have worked closely with Jack Park (a  
consulting scientist to SRI's AI Center - http://www.ai.sri.com/) to  
represent NN as a TopicMap (XTM).  As many on this list may know,  
there has been a moderate amount of effort to integrate and/or  
reconcile XTM with RDF here at the W3C (search on "TopicMaps" at the  
main RDF page - http://www.w3.org/RDF/).  I'm not certain how this  
effort will ultimately make NN more "semantic web" compliant, but the  
bottom line is a great deal of effort has already been expended to  
express NN in a semantically well-grounded formalism.
	f) As Don points out, neuroanatomical representations are  
likely to evolve significantly over the coming decades, as the number  
of large-scale gene & protein expression characterization studies  
focussed on the brain continues to accumulate.  Having said that, the  
"conventional" view of neuroanatomy will likely remain relevant for a  
long while to come, not only because it has been used to characterize  
findings in the literature for the last 125+ years, but also because  
it did derive from a wealth of empirical observation which is likely  
to remain valid in many domains of neuroanatomical study.  I would  
also modify Don's well informed comment regarding the derivation of  
"conventional" views of neuroanatomy.  To a large extent they are  
related to functional studies of the brain - as well as lesion based  
studies of functional deficits dating back to the 19th century (think  
"Broca's Area"), but they are also very much based on a study of the  
morphology of the brain - both the external surface morphology  
(sulci, gyri, and lobes), as well as histological examination of  
internal structures.  Many of these studies of structure in space are  
likely to stay with us for some time to come (and are well-founded in  
reality), though as Tim Clark & Don have pointed out in this thread,  
nomenclature is still a very significant problem even in this very  
"old" field.
	g) Licensing of NN - Doug et al. formerly had a completely open  
policy on distributing NN.  The only reason a license was  
instituted was that, about 5 years back, another group sucked  
down the entirety of NN, reworked a lot of what was there - probably  
with very practical goals directed toward making NN more "correct"  
and effective in their problem domain - then "republished" their  
product as "NeuroNames".  This led to a great deal of confusion.   
The fact they chose to do this on the sly also meant the work they did  
was not necessarily compatible with the work done by Doug et al.  In  
order to avoid this happening again, it was decided a license would  
be established to discourage this sort of behavior.  As anyone who  
has developed a terminology and/or ontology knows, it is absolutely  
essential there remain a single curating authority, if the value of  
the resource is to remain intact.  The "vetting" performed by the  
central authority - as is extensively done by the curators of the  
Gene Ontology, for instance - is absolutely essential to  
guaranteeing the integrity of the knowledge resource.  This is not a  
"closed" or proprietary process, just a highly controlled one.   
Unfortunately, Doug Bowden's resources are MUCH MUCH smaller than  
those available to the curators/developers of GO, so the NN curation  
effort necessarily moves at a slower pace.

2) Working with the Neuroscience community
As Kei, Don, and others have stated, it would be unwise to proceed in  
creating an "open source" neuroanatomical ontology without  
interacting with the researchers who've already put a lot of effort  
into this problem over the past decade or so.  With this in mind, I  
have several suggestions:
	a) The 5 ways of knowing neuroanatomy:
		This is a pitch I've been making which I think helps to sum up the  
current ways various sub-fields have attempted to identify/label/ 
collate brain morphology
		i) Terminologies - e.g., NN, BrainLex
		ii) Ontologies - e.g., Neuro-FMA (the project Peter Mork referred to)
		iii) Literature Informatics (CocoMac, BrainMap, NeuroScholar, BAMS,  
ArrowSmith, etc.).
			These are very mature projects.  Some include their own  
mereotopological reasoning systems (e.g., CocoMac and BrainMap) in  
order to be able to pool and compare the relatedness of structures  
and connectivity across different studies in the literature.  The  
goal in this category is to perform large-scale semantic mining of  
the literature to confirm/refute current knowledge and uncover new  
correlations - very much along the lines of what The NeuroCommons  
Project expects to achieve via use of semantic web technologies.   
Some researchers in this category are actually participating in The  
NeuroCommons Project (i.e., Gully Burns, who developed NeuroScholar).
		iv) voxel/pixel analysis:
			This approach applies computer vision algorithms to automatically  
- or semi-automatically - identify 2D & 3D shapes in digital  
anatomical images.  This field is also extremely mature, though there  
are many significant caveats to exactly how much of this work can be  
effectively automated.
		v) parameterized models:
			Often these are derived from - or used to drive - the voxel/pixel  
based analysis described in 'iv' - though the spatial modeling is  
definitely a distinct approach from the pure voxel/pixel approach.

None of the studies you'd fit into these categories exclusively focus on  
their technique/tool alone without some aspect of the other "ways of  
knowing neuroanatomy" playing a role in what they do.  However, it is  
clear much fundamental work in this area primarily focuses on one  
technique over the others.

Having said that, when the neuroscience community makes use of this  
work to examine a specific biological problem, they will often draw  
significant tools and resources from more than one of these domains.

	b) NCBO/NCOR sponsored meeting focused on mereotopology in  
neuroanatomy:
		Barry Smith is working to bring together researchers working in the  
5 domains described above.  There is a very pressing need in large- 
scale, field-wide neuroinformatics projects such as what is being  
done in the BIRN project to have these 5 domains converge and work  
more cooperatively.  Right now, a lot of manual effort has to be put  
out to bring them together.  This is something BIRN has been  
pursuing.  In the last 6 months, we have received a great deal of  
support and guidance on this effort from NCBO.  Daniel Rubin  
interacts directly with the BIRN Ontology Task Force, and the work  
Barry Smith has been doing with FMA, OBO, FuGO, and PATO has very  
much begun to create a more well-founded and computable path  
toward performing large-scale annotation of neuroimaging data.
		This meeting is on the NCBO/NCOR slate for 2007, but in the interim  
I hope to see more effort invested in the coming year across the 5  
communities listed above  toward the goal of integrating across these  
"ways of knowing" now that the need has been recognized.
			
3) Microarrays:
	Just as Don, Kei, Alan R., and others have pointed out, high- 
throughput assays - microarrays, BAC-based IHC, in situ studies using  
the Gene Paint technology employed by the Allen Institute for Brain  
Science to construct the Allen Brain Atlas of gene expression in the  
brain - are going to transform our understanding of neuroanatomy over  
the coming decades.  This is just a given.  There is a pressing need  
to derive a means to integrate spatially-mapped studies of gene &  
protein expression into a neuroimaging setting. The spatial  
resolution of such studies may be very coarse - e.g., "whole brain" -  
but they still provide sufficient spatial information to be usable in  
the context of a neuroanatomical coordinate system.
	We are working in the BIRN project to create a means for researchers  
to integrate these distinct approaches to studying the brain.  As  
Alan R. pointed out, FuGO is working to put description of microarray  
experiments on a solid, formal footing, and I would expect one aspect  
of that will be to represent microarray data in RDF/OWL.  This is not  
a trivial problem, given that much of the available data is merely  
MIAME-compliant - MIAME not even being a data format, but just a  
collection of minimal data requirements.  One need only look at the  
great complexity of the data submission process at the NCBI GEO site  
to get an appreciation for how difficult this problem can be.  A  
great deal of effort is being invested in the microarray field to  
come up with a better means to handle this issue, and the FuGO effort  
will be a critical clearinghouse for this work.  The important thing  
to remember when it comes to field-wide data pooling and re-analysis  
is that it may sometimes be necessary to get right back to the  
microarray primary image files so as to reapply different criteria when  
performing the statistical tests and reductions on pooled data.   
Given this requirement - one we also see in the neuroimaging domain -  
I believe it is very important to proceed in a well-reasoned manner  
when seeking to integrate across microarray datasets using semantic  
web technologies.  Alan R. and I - possibly others on this list  
too - are on the FuGO Coordinators Committee, so hopefully we can help  
to keep those lines of communication open.
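
A toy illustration of why access to primary data matters for pooled re-analysis, in Python. All numbers and thresholds below are invented, and "raw intensity" stands in for whatever quantitation is derived from the primary image files; real microarray analysis involves normalization and proper statistical testing that this sketch deliberately leaves out.

```python
# Hypothetical per-lab raw intensities for one probe (invented numbers).
lab_a_raw = [120, 180, 260, 310]
lab_b_raw = [90, 210, 240, 400]

def call_expressed(values, threshold):
    """Reduce raw values to present/absent calls under one criterion."""
    return [v >= threshold for v in values]

# Suppose each lab shared only reduced calls, made with its own threshold.
lab_a_calls = call_expressed(lab_a_raw, threshold=200)
lab_b_calls = call_expressed(lab_b_raw, threshold=250)

# Pooling the reduced calls mixes two criteria; the raw values are gone,
# so nobody downstream can apply a different cutoff.
pooled_calls = lab_a_calls + lab_b_calls

# Pooling the raw values lets a re-analysis apply one criterion throughout.
pooled_raw = lab_a_raw + lab_b_raw
reanalyzed = call_expressed(pooled_raw, threshold=200)

print(sum(pooled_calls), sum(reanalyzed))
```

In this toy the two routes disagree about how many samples count as "expressed", purely because the reduction was applied before pooling rather than after. Any semantic web representation that captures only the reduced calls inherits that limitation.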

Sorry to go on so, but this is a topic on which I've labored quite  
intensively over the past year.  There is a lot being done on this  
issue, and I think all efforts will get much further more quickly -  
and in a way that will carry more street cred with practicing  
neuroscientists - if we all try to work together.

Cheers,
Bill

Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu








Received on Tuesday, 6 June 2006 14:41:49 UTC