
Meaning of ABA brain region spreadsheet columns

From: William Bug <William.Bug@DrexelMed.edu>
Date: Sat, 3 Mar 2007 14:10:46 -0500
Message-Id: <F4D53555-DB44-4631-8D89-C6F78A22EFC2@DrexelMed.edu>
Cc: Nigam Shah <nigam@stanford.edu>, kc28 Cheung <kei.cheung@yale.edu>, June Kinoshita <junekino@media.mit.edu>, Gwen Wong <wonglabow@verizon.net>, Donald Doherty <donald.doherty@brainstage.com>, Mihail Bota <mbota@usc.edu>, MaryAnn Martone <maryann@ncmir.ucsd.edu>, Luis Marenco <luis.marenco@yale.edu>, W3C HCLSIG hcls <public-semweb-lifesci@w3.org>
To: Alan Ruttenberg <alanruttenberg@gmail.com>
Sorry, Alan - got swamped with BIRN Ontology & Mouse BIRN AHM mtg  
preparations for next week.

You are right - you & I reviewed details related to ABA image meta  
data last weekend - NOT brain region level meta data.

I'd bet a lot of what Nigam lays out below  - RGB LUT values and PK -  
is correct.


Region Abbrev (Cols B & C):
	'CNU' can be found in the Swanson (2004) XML file ("Cerebral  
nuclei").  Despite the ABA atlas gray scale image plates having been  
derived from the Franklin & Paxinos adult C57Bl/6 atlas, as Mihail  
mentioned, the Swanson lab did some region classification for ABA,  
and they lumped this at a higher level in a region they called  
"Cerebral Nuclei" - though, again as Mihail points out, Striatum  
immediately maps in rodent to a structure referred to as the
"Caudoputamen"
(http://brancusi.usc.edu/bkms/brain/show-braing2.php?aidi=129).
Caudoputamen [very important for the PD use case] and other
structures in the base of the telencephalon are a part_of a larger
structure they've defined for ABA as "Cerebral Nuclei (CNU)".
	The one fly-in-the-ointment here is the phrase "...they've defined
for ABA...".  This doesn't necessarily map to any of the other brain
region classification schemes/CVs used elsewhere (not a trivial
process - but not impossible - both the NN and BIRN groups are
working on this - as is Mihail - as he mentioned).  As Mihail points
out - and you can see in the XML files he distributes - it does link
into the vocabulary used in Swanson 1998/2004 for rat.  It most
likely does not deterministically map to the CVs used for brain
region by GENSAT (I believe those were somehow derived from some
combination of NN and regions as given in the Franklin&Paxinos mouse
atlas - I believe the GENSAT segmentation was performed by someone
from George Paxinos's lab who went to work with the Rockefeller
GENSAT group).
SenseLab has much less brain region detail.  It MAY be using the  
Swanson nomenclature.  Given SenseLab has just a subset of the  
regions you'd find in an atlas, it's possible someone there at Yale  
could fairly quickly provide a lookup table mapping their region  
terms to one of the other atlases (Luis may even have done this  
already in the context of some of the neuroinformatics repository  
integration work he has done over the last several years).


Region color (Cols D, E, F):
	All digital atlases have a color LUT for regions.  These are generally
just 8-bit (only because few atlas projects have foreseen having the
expert personnel resources to manually segment > 256 regions).  In the
atlases, the regions derive from very laborious manual segmentation
done in tools like AMIRA by specially trained, highly knowledgeable  
neuroanatomists.  The manual segmentation is performed on 2D  
sections, assembled into 3D volumes, smoothed, then added to the 3D  
atlas voxel data (many atlases are not actually TRUE 3D data sets -  
e.g., the Paxinos atlas used at ABA - so the re-assembly, smoothing  
and integration with voxels isn't required in that case).
	Anyway - ABA is obviously being forward looking and using 24-bit  
values for their region LUT.  Besides, when using Paxinos, they are  
GIVEN the region segmentation, so the manual effort is potentially  
eliminated (though the only electronic version of the region
segmentation typically must be obtained through the atlas publisher -
Elsevier in this case - and generally all the regions for a given
Paxinos image (sagittal or coronal) are just lumped into a single,
bit-mapped file).  This means you must take that bitmap, run
algorithms to identify the individual regions (usually based on color  
- e.g., just as you see here, each region has a specified color in  
the bit map).  The isolated regions can then be converted to a  
geometric object format (from simple point list on to quad-tree or  
oct-tree) and this is then stored separately in a RDBMS.  This is  
EXACTLY what the SMART Atlas project in BIRN (from Maryann Martone's  
NCMIR group at UCSD - source of CCDB, too) has done.  This way, each  
individual region is defined as a geometry in a specified coordinate  
space AND - most importantly - can then be used to support SPATIAL  
queries on the atlas (e.g., "Show me all the defined brain regions
that lie within this shape I just drew on image X that you've
registered to your atlas coordinate space").
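
A minimal sketch of the color-based isolation step and a simple spatial query, in Python. It assumes the bitmap is already loaded as rows of (R, G, B) tuples and that one color means one region; the function names and the tiny bitmap are hypothetical, and a plain point list stands in for the quad-tree/oct-tree geometry and the RDBMS storage that a real system such as the SMART Atlas would use.

```python
from collections import defaultdict

def regions_by_color(bitmap):
    """Group pixel coordinates by RGB color (assumption: one color = one region)."""
    regions = defaultdict(list)
    for y, row in enumerate(bitmap):
        for x, rgb in enumerate(row):
            regions[rgb].append((x, y))  # simple point-list geometry
    return dict(regions)

def regions_in_box(regions, x0, y0, x1, y1):
    """Return the colors of all regions with at least one pixel in the inclusive box."""
    return {
        rgb
        for rgb, points in regions.items()
        if any(x0 <= x <= x1 and y0 <= y <= y1 for x, y in points)
    }

# Hypothetical 4x3 bitmap. The CTX color is the RGB triple from the sample
# spreadsheet row quoted in this thread; the other colors are made up.
BG = (0, 0, 0)
CTX = (176, 255, 184)
OTHER = (255, 0, 0)
bitmap = [
    [BG, CTX, CTX, BG],
    [BG, CTX, BG, OTHER],
    [BG, BG, BG, OTHER],
]

regions = regions_by_color(bitmap)
hits = regions_in_box(regions, 1, 0, 2, 0)  # "shape" drawn by the user: a small box
print(len(regions[CTX]), hits)
```

A drawn shape would normally be an arbitrary polygon rather than a box; the box just keeps the point-in-shape test trivial.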


Other Region numbers:

	A)  Cols G & H
		I would guess these are BOTH PKs of some sort as they both contain  
rather small and unique integers.  Given column G is listed in order,  
I'd guess that is the ABA internal PK for that region.  The other ID  
is probably a cross-reference to another brain region classification  
scheme.  A search of the various atlas classifications on Mihail's
BAMS site doesn't appear to provide such equivalent IDs.  I've
searched in NN, but those IDs don't correspond either (e.g., the
Col H value for "Thalamus" - 351 - does not correspond to the NN ID
for "Thalamus" - 283).
	One interesting note - if you sort the spreadsheet on Col H - you  
will find the rows are ordered in nearly perfect alphabetical order  
by brain region abbreviation.  This indicates to me these Col H  
values are likely to relate to IDs created for these regions by  
Mihail/Swanson when they did this classification work for ABA.  There  
are no integer PKs given in the BAMS XML files from the Swanson lab  
that match these numbers, so only Mihail can vet this hunch.  I'd  
guess they are expecting to use the "brain region abbreviation" as  
their immutable, unique link.
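
The "nearly perfect alphabetical order" observation can be checked mechanically. A rough Python sketch, assuming the rows are available as (abbreviation, Col H) pairs: only the CTX/85 and Thalamus/351 values come from this thread, the "TH" abbreviation and all other rows are made-up placeholders, and the function name is hypothetical.

```python
def alphabetical_agreement(rows):
    """Sort rows by Col H, then return the fraction of adjacent pairs whose
    abbreviations are in case-insensitive alphabetical order (1.0 = perfect)."""
    ordered = [abbrev.lower() for abbrev, _ in sorted(rows, key=lambda r: r[1])]
    pairs = list(zip(ordered, ordered[1:]))
    if not pairs:
        return 1.0
    return sum(a <= b for a, b in pairs) / len(pairs)

# (abbreviation, Col H). CTX/85 and 351 for Thalamus appear in the thread;
# "TH" as the abbreviation and the remaining rows are placeholders.
rows = [
    ("CTX", 85),
    ("TH", 351),
    ("CNU", 80),
    ("OLF", 200),
    ("NLOT", 210),
]

score = alphabetical_agreement(rows)
print(score)
```

A score close to 1.0 on the real spreadsheet would support the hunch that Col H was assigned from an alphabetized abbreviation list.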
	
	B) Col I:
	Typically, the whole purpose of registering brain image stacks into  
the coordinate space of a digital atlas (such as has been done for  
the 20,000+ ABA image stacks for the individual, gene-specific,  
GenePaint-ed brains) is so the expert segmented regions from the  
digital atlas can be used to drive QUANTITATIVE ANALYSIS of the  
registered image stacks in a consistent and comparable manner.  Being  
able to visualize the atlas regions overlaid onto individual images
from a given registered stack is useful for making qualitative  
observations - or as a pedagogical aid - BUT it can't drive automated  
analysis unless:
		- the atlas comes with a coordinate system and the segmented brain  
regions have been deterministically mapped into that space
		- the image stacks have been registered to the same coordinate space.
	In the case of the ABA, the atlas is the Franklin&Paxinos 2001 adult  
C57Bl/6 brain atlas and the coordinate space is their interpretation  
of stereotaxic coordinate mapping.  The registration process has some  
error (for ABA, I believe that is 300 microns [probably different for  
in-plane registration - i.e., 2D to 2D alignment - vs. the third  
dimension between images in a given dissection axis (i.e., coronal or  
sagittal for ABA)]).
	SO - Cols J, K, L likely refer to the location of the brain
regions in the coordinate space.  In a true 3D atlas, all you'd have  
to do is give 3D geometric definition of the region shape (e.g., as  
an oct-tree), and give the location of the centroid for that shape in  
the coordinate system.  Since the F&P atlas is NOT a true 3D atlas  
but rather a series of 2D images, you don't have a 3D geometric  
definition of the region.  With that in mind, the way to specify  
WHERE in the atlas a given region lies is to give the FIRST and LAST  
atlas image plate the extents of that region lie in when viewed along  
a specific dissection axis.  For very convoluted structures such as  
the hippocampal formation, this can be a bit problematic to use  
computationally, but it's usually sufficient for the types of tasks a  
2D atlas can support.
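
As a sketch of the FIRST/LAST plate idea, in Python: given which regions appear on each plate along one dissection axis, the extent of a region is just the smallest and largest plate number containing it. The plate numbers, region assignments and function name below are hypothetical.

```python
def plate_extent(plate_regions, region):
    """Return (first_plate, last_plate) for a region along one dissection
    axis, or None if the region appears on no plate."""
    plates = sorted(p for p, names in plate_regions.items() if region in names)
    return (plates[0], plates[-1]) if plates else None

# Hypothetical coronal plates: plate number -> abbreviations drawn on it.
coronal = {
    21: {"CTX"},
    22: {"CTX", "NLOT"},
    23: {"CTX", "NLOT"},
    24: {"CTX"},
}

print(plate_extent(coronal, "NLOT"))
print(plate_extent(coronal, "CTX"))
```

This says nothing about whether the region is present on every plate in between, which is one reason convoluted structures are awkward to handle this way.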
	In terms of what this set up can support, it is likely one of the  
things they are looking to do is to support users segmenting the
gene-specific images (GenePaint-ed images) then comparing those
gene-specific segmentations to the brain segmentations.  Given the limits
of the 2D realm, you can't really do true 3D volumetric  
intersection.  However, what you can do is determine to what extent  
regions-of-interest (ROI) created by users to identify expression  
patterns on a given GenePainted image overlap with the 2D sections  
through the brain regions that appear on the corresponding,  
registered Paxinos plate.  You would essentially try to estimate an  
intersection of the user drawn 2D ROIs with the 2D atlas region  
shapes for each region that appears on that Paxinos plate.  When  
sorting those numbers by atlas brain region, you'd then go through  
ALL the atlas images that contain a given region, and add up the  
total intersection with the user drawn ROIs across all the images  
from BRAIN X that user drew ROIs on.  This would be normalized to the  
total approx. pixel volume for that brain region across all the atlas  
plates where it appears - resulting in an APPROXIMATE volume ratio of  
a given gene stained for in BRAIN X with a given region defined in  
the atlas.  The large number - Col I - appears to scale with the  
approximate size of the region - e.g., "Olfactory areas" is quite  
large (838206), whereas one of the smaller regions included within  
OLF is much smaller (Nucleus of the lateral olfactory tract [NLOT] -  
7407) - so I would guess this column represents that atlas-defined
approx. region volume (i.e., sum of all the 2D areas defined for that  
brain region across all the F&P atlas images).
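
The per-plate intersection and normalization described above could be sketched like this in Python. Everything here is an assumption for illustration: regions and ROIs are modeled as sets of (x, y) pixels per plate, the plate numbers and pixel sets are invented, and the function name is hypothetical. It is not ABA's actual method.

```python
def expression_volume_ratio(atlas_plates, roi_plates, region):
    """Approximate fraction of a region covered by user-drawn ROIs.

    atlas_plates: plate id -> {region abbreviation -> set of (x, y) pixels}
    roi_plates:   plate id -> set of (x, y) pixels drawn on the image
                  registered to that plate
    """
    overlap = 0  # summed 2D intersections across plates
    total = 0    # summed 2D region areas across plates (cf. Col I)
    for plate_id, regions in atlas_plates.items():
        pixels = regions.get(region, set())
        total += len(pixels)
        overlap += len(pixels & roi_plates.get(plate_id, set()))
    return overlap / total if total else 0.0

# Hypothetical data: NLOT appears on two plates.
atlas_plates = {
    23: {"NLOT": {(0, 0), (1, 0), (0, 1), (1, 1)}},
    24: {"NLOT": {(0, 0), (1, 0)}},
}
roi_plates = {
    23: {(1, 1), (2, 1)},
    24: {(0, 0), (1, 0), (5, 5)},
}

ratio = expression_volume_ratio(atlas_plates, roi_plates, "NLOT")
print(ratio)
```

Under this reading, the denominator (total pixels for the region across all plates) is the kind of number Col I may hold.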
	
	C) Cols J, K, L:
		It's quite likely ABA calculated an approx. 3D location value for a  
region probably truncated based on the existing locations of the  
Paxinos plates within the stereotaxic coordinate space.  Those  
coordinates would be specified either with:
			* a 2D coordinate location within a F&P atlas image plate (either  
as unitless PIXELS or as stereotaxically-defined MICRONS) + a unique  
ID for that F&P atlas image plate derived from a specific dissection  
plane axis (e.g., F&P Coronal plate 23)
			* a 3D coordinate location that somehow places some morphological  
property of the region in the stereotaxic coordinate space - e.g.  
front-upper-left point for the approx. 3D bounding box for that  
region, centroid for the approx. 3D bounding box for that region, etc.
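
Pulling the column guesses together, a spreadsheet row could be parsed as below (Python). The sample line is the one Nigam quoted; the field names reflect the guesses in this thread, not any documented ABA schema, and the reading of Col C as a parent region abbreviation is itself an assumption. The simple split also assumes region names contain no commas.

```python
from typing import NamedTuple, Tuple

class RegionRow(NamedTuple):
    name: str                             # Col A: region name
    abbrev: str                           # Col B: region abbreviation
    col_c: str                            # Col C: possibly parent abbreviation (guess)
    rgb: Tuple[int, int, int]             # Cols D, E, F: 24-bit LUT color
    col_g: int                            # Col G: probably ABA internal PK (guess)
    col_h: int                            # Col H: some other ID (guess)
    col_i: int                            # Col I: approx. region volume (guess)
    col_jkl: Tuple[float, float, float]   # Cols J, K, L: approx. location (guess)

def parse_region_row(line: str) -> RegionRow:
    """Split one comma-separated row into typed fields."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 12:
        raise ValueError(f"expected 12 columns (A-L), got {len(fields)}")
    return RegionRow(
        name=fields[0],
        abbrev=fields[1],
        col_c=fields[2],
        rgb=(int(fields[3]), int(fields[4]), int(fields[5])),
        col_g=int(fields[6]),
        col_h=int(fields[7]),
        col_i=int(fields[8]),
        col_jkl=(float(fields[9]), float(fields[10]), float(fields[11])),
    )

row = parse_region_row(
    "Cerebral cortex,CTX,CH,176,255,184,3, 85,4141526,61.647,29.999,33.711"
)
print(row.abbrev, row.rgb, row.col_i, row.col_jkl)
```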

That's all I have time for now.  Must get back to meeting prep.  It's  
possible reading the ABA Nature paper from January would get you a  
more specific answer - or - better yet - just drop an email to the  
guy you spoke with at ABA.

Hope that helps.

Cheers,
Bill
	

On Mar 2, 2007, at 5:15 PM, Alan Ruttenberg wrote:

> Don't recall doing this, though it's certainly possible that I've  
> forgotten.
> Just to clarify, each of these lines is for a brain region, not for  
> an image.
> If you want to do this later this evening with me, give me a call  
> at home after about 10.
> -Alan
>
> On Mar 2, 2007, at 5:10 PM, William Bug wrote:
>
>> Alan,
>>
>> Didn't you and I review this already at the ABA site?
>>
>> All one would need to do is bring up one of these images at the  
>> ABA site, go through the "noodling" we did, and look at the  
>> corresponding entries in the spreadsheet to match up a "meaning"  
>> to each column (probably nearly all those columns).
>>
>> Cheers,
>> Bill
>>
>> On Mar 2, 2007, at 2:15 PM, Nigam Shah wrote:
>>
>>>> BTW, if someone has a theory of what the other number in
>>>> ontology.xls are, I'm all ears.
>>>
>>> Okay, pure guesses:
>>>
>>> Line 4 = Cerebral
>>> cortex,CTX,CH,176,255,184,3, 85,4141526,61.647,29.999,33.711
>>>
>>> 176,255,184 seem like RGB values (they all range from 2 to 255) for
>>> that region in the image.
>>> 3 is a serial number or internal id.
>>> 85 - no clue
>>> 4141526 - no clue
>>> 61.647,29.999,33.711 seem like 3D voxel coordinates.
>>>
>>> --Nigam.
>>>
>>
>> Bill Bug
>> Senior Research Analyst/Ontological Engineer
>>
>> Laboratory for Bioimaging  & Anatomical Informatics
>> www.neuroterrain.org
>> Department of Neurobiology & Anatomy
>> Drexel University College of Medicine
>> 2900 Queen Lane
>> Philadelphia, PA    19129
>> 215 991 8430 (ph)
>> 610 457 0443 (mobile)
>> 215 843 9367 (fax)
>>
>>
>> Please Note: I now have a new email - William.Bug@DrexelMed.edu
>>
>>
>>
>>
>

Received on Saturday, 3 March 2007 19:11:13 UTC
