Spatial queries against GENSAT or ABA

Hi Kei,

You are right on target re: use of a coordinate-based, spatial query  
system to resolve the relatively simple query: "In which brain  
regions is GENE X expressed?"

This is the whole goal of several major neuroinformatics projects  
currently underway which are designed to use either 2D or 3D digital  
brain atlases to make such a query possible.  Several of those  
efforts are associated with the BIRN project.  In fact several such  
projects working on inbred mouse strain atlases have been striving to  
function synergistically within a single system (the Mouse BIRN  
Atlasing Tool or MBAT) specifically to support such a query.  ABA is  
not currently available to query within MBAT, because it's not  
registered to the primary atlas being used in MBAT right now.  This  
work may eventually get done, but it won't be ready for the demo.

The absolute pre-requisites for resolving such a query are:
	1) you must have a set of canonical brain images (2D) or a true  
voxel based canonical brain (3D) - "ATLASES" - that include  
expert-assisted brain region segmentation.
	2) these canonical pixel-based brain images (2D) or voxel based  
images (3D) must be situated within a defined coordinate space.
	3) the segmented brain regions must be deterministically placed  
within the same coordinate space.
	4) the images containing the gene expression patterns must be  
segmented (manually, semi-automatically, or automatically) to provide  
defined geometries for the expression patterns.
	5) the images containing the gene expression patterns must be  
registered to the canonical atlas data and coordinate space (whether  
2D or 3D).

With these conditions met, you could then present a user with a nice  
3D visualization of the atlas (or even just the list of brain region  
IDs or preferred labels) and/or a list of gene names/IDs and let them  
ask both of the following questions:
	a) In which brain regions is GENE X expressed?
	b) For which genes does BRAIN REGION X show expression values  
beyond some defined baseline?
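
To make that concrete, here is a minimal sketch of how both queries fall out once conditions 1-5 hold. It assumes a toy atlas whose voxels are already labeled with brain regions, and per-gene expression segmentations already registered into the same voxel space. Every name and value here (ATLAS_LABELS, EXPRESSION, regions_expressing, genes_in_region, the baseline fraction) is hypothetical and for illustration only; none of it comes from MBAT, ABA, or GENSAT.

```python
from collections import Counter

# Hypothetical toy atlas: voxel index -> brain region label (conditions 1-3).
ATLAS_LABELS = {0: "STRIATUM", 1: "STRIATUM", 2: "CORTEX", 3: "CORTEX", 4: "THALAMUS"}

# Hypothetical segmented, registered expression: gene -> voxel indices (conditions 4-5).
EXPRESSION = {
    "DRD2": {0, 1, 2},
    "GENE_Y": {3, 4},
}


def expressed_fraction(gene, atlas, expression):
    """Fraction of each region's voxels that fall inside the gene's expression mask."""
    sizes = Counter(atlas.values())
    hits = Counter(atlas[v] for v in expression.get(gene, ()) if v in atlas)
    return {region: hits[region] / sizes[region] for region in hits}


def regions_expressing(gene, atlas=ATLAS_LABELS, expression=EXPRESSION, baseline=0.5):
    """Query (a): regions where the expressed fraction exceeds the baseline."""
    fractions = expressed_fraction(gene, atlas, expression)
    return sorted(r for r, f in fractions.items() if f > baseline)


def genes_in_region(region, atlas=ATLAS_LABELS, expression=EXPRESSION, baseline=0.5):
    """Query (b): genes whose expression in the region exceeds the baseline."""
    return sorted(
        g for g in expression
        if region in regions_expressing(g, atlas, expression, baseline)
    )
```

The baseline here is just a fraction of region voxels covered; a real system would more likely threshold on a calibrated intensity measure.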

Right now, GENSAT is not registered to an atlas, so there is no  
coordinate frame to support resolving such a query.  They have  
manually curated many of the gene-specific images with both brain  
regions and cell types, so you can pose that query and get an answer  
based on the curation they have had the resources to do so far, but  
there is no way to place it in a GIS context (2D or 3D), since none  
of their info is YET linked to a canonical coordinate space (several  
projects are working on this very issue).

ABA has aligned to a 2D mouse brain atlas (F&P C57Bl/6 adult brain  
atlas).  In doing so, the 2D brain region segmentations on each of  
the images in the F&P mouse atlas can be super-imposed on the  
registered images from any of the 20,000+ brains.  The problem is the  
current registration has a moderate error associated with it, so that  
answering that query programmatically is problematic and often not  
very informative.  The following can be done:
	- along the coronal sectioning axis, give me the plate numbers for  
all the images in the atlas that contain a slice through the STRIATUM
	- for ABA brain stained for GENE X, give me all the sections that  
have been roughly aligned to that set of F&P atlas images.

From there, the alignment is so coarse at this point that you could only  
use the atlas plates and location of the STRIATUM to help guide a  
qualitative assessment of whether there appears to be any staining in  
the STRIATUM.
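
The two lookups above could be sketched roughly as follows. The plate numbers, section numbers, and table layout are invented for illustration; I'm only assuming that the atlas gives a region list per plate and that each ABA section carries the number of the plate it was roughly aligned to.

```python
# Hypothetical: atlas plate number -> regions segmented on that plate.
PLATE_REGIONS = {
    20: {"CORTEX"},
    21: {"CORTEX", "STRIATUM"},
    22: {"CORTEX", "STRIATUM"},
    23: {"CORTEX", "THALAMUS"},
}

# Hypothetical: gene -> list of (section number, roughly aligned plate number).
SECTION_ALIGNMENT = {
    "DRD2": [(101, 20), (102, 21), (103, 21), (104, 22), (105, 23)],
}


def plates_containing(region, plate_regions=PLATE_REGIONS):
    """Plate numbers of all atlas images that include a slice through the region."""
    return sorted(p for p, regions in plate_regions.items() if region in regions)


def sections_for(gene, region, alignment=SECTION_ALIGNMENT, plate_regions=PLATE_REGIONS):
    """Sections of the gene's brain that were aligned to one of those plates."""
    plates = set(plates_containing(region, plate_regions))
    return [section for section, plate in alignment.get(gene, []) if plate in plates]
```

The result is only a list of sections worth looking at by eye, which is all the current alignment supports.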

In fact, via this route, many contributors to GeneNetwork.org have  
actually linked the probe sets in their microarray QTL database to  
staining patterns in ABA.  In other words, if through their system,  
you uncover via QTL a locus or collection of SNPs associated with  
altered expression of a given gene - say Dopamine Receptor, type D2  
(DRD2) - you might find someone has added an ABA or GENSAT annotation  
for DRD2 using the GeneNetwork.org GeneWiki.
	1) Go to www.genenetwork.org
		http://www.genenetwork.org/search3.html
	2) Enter 'DRD2' in the 'ANY' box searching against the default  
settings for other fields - & hit 'Search'
	3) Click on the single result entry
	4) In the record for DRD2, click on the GeneWiki button near the top  
of the page
	5) This will bring up a listing of all the annotations in  
GeneNetwork for DRD2 including qualitative annotations that someone  
did for the ABA DRD2 brain.

If you want to see ALL of the genes for which ABA or GENSAT GeneWiki  
entries exist, just go back to step '1', enter wiki=ABA or  
wiki=GENSAT respectively in one of the 'ANY' boxes, and hit  
'Search'.  Then pick up at step '3' above.

Were we able to SCRAPE this, then you would have annotation for ABA  
that is roughly equivalent to that which exists for GENSAT - ONLY -  
it probably doesn't cover the ABA very thoroughly (using the  
generic 'wiki=aba' brings up 948 probe sets - or ~5% of ABA - pretty  
remarkable, actually, given it's a manual effort), and these GeneWiki  
annotations are mostly in free-text right now and are not done to a  
controlled vocabulary or classification scheme.  :-(

When the registration to the atlas improves to say the 50 - 100  
micron range, then the flood-gates will open, and all 20,000 brains  
in ABA each staining for a particular gene will be able to  
automatically provide relatively solid answers to these  
straight-forward questions related to where in the brain is Gene X  
expressed - and which genes does Brain Region Y show marked  
expression of.  Even here, however, there will be continued room for  
nuance in defining the ABA staining patterns - AND - there will  
eventually be a need to add the time dimension to all these queries  
(e.g., "When is Gene X  
expressed in Brain Region Y?").

Because the ABA has created multi-resolution versions of their brain  
images (both the Nissl stains for cell bodies and the pseudo-colored  
ISH images for a given gene), it is possible to use the very nice  
Google Maps API GUI Alan created to select a given 1 of the 20,000  
ABA brains and simply Zoom & Pan on the actual pixel image data.   
However, there is no straight-forward way to use it to pose and  
answer SPATIAL queries.

What MIGHT be possible - based on the alignment they have done and  
the information provided in that brain region ontology Excel file  
Alan has - is to say, for the 'DRD2' brain, filter the sagittal image  
series to create a subset including only those images aligned to an  
F&P atlas image which contains a section through Brain Region X (say  
'STRIATUM').  This way, if through some SPARQL query you pulled up a  
relation between DRD2 and STRIATUM, you'd be able to present a user  
with a very nice, low-tech interface to quickly pan&zoom on the  
median section of that 'STRIATUM'-filtered series to look at the  
staining pattern.  You could add a navigation control to go  
back-n-forth through the series for the DRD2 brain, so they could get a  
pretty good sense in 3D where DRD2 expression is in the striatum.   
You might also go to BAMS or CoCoMac (BAMS is better in this instance  
since it's rodent focused - whereas CoCoMac is primate focused) to  
automatically determine what regions connect to (is_afferent_to) and  
what regions are connected to (is_efferent_to) the STRIATUM.  You  
could then bring up another HTML frame that gives you a view of the  
DRD2 subset series for those brain regions, too.
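
A rough sketch of the pieces such an interface would need, given a region-filtered list of section numbers: pick the median section to open on, step back and forth through the series, and look up connected regions to offer as further views. The PROJECTIONS table is a hypothetical stand-in for whatever a BAMS lookup would return; it is not real BAMS data and not its API, and the function names are mine.

```python
# Hypothetical (source region, target region) pairs standing in for a
# connectivity lookup; toy values, not taken from BAMS or CoCoMac.
PROJECTIONS = [
    ("CORTEX", "STRIATUM"),
    ("THALAMUS", "STRIATUM"),
    ("STRIATUM", "GLOBUS_PALLIDUS"),
]


def median_section(sections):
    """Middle section of the filtered series (lower middle when the count is even)."""
    if not sections:
        return None
    ordered = sorted(sections)
    return ordered[(len(ordered) - 1) // 2]


def step(sections, current, delta):
    """Move delta sections forward or back, clamped to the ends of the series."""
    ordered = sorted(sections)
    index = ordered.index(current) + delta
    return ordered[max(0, min(index, len(ordered) - 1))]


def connected_regions(region, projections=PROJECTIONS):
    """Regions projecting to the given region (inputs) and regions it projects to (outputs)."""
    inputs = sorted(src for src, dst in projections if dst == region)
    outputs = sorted(dst for src, dst in projections if src == region)
    return {"inputs": inputs, "outputs": outputs}
```

Each region returned by connected_regions could then be run through the same filter-and-median steps to populate the second frame.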

THAT WOULD ACTUALLY BE A VERY NICE INTERFACE - and is probably quite  
tractable for the demo - if this sounds like a useful feature to  
provide.

Running atlas-based SPATIAL queries against GENSAT and ABA is a very  
much sought after goal both for the curators of those repositories  
and for the neuroscience community at large, but we are not there yet.



I'm not certain I understand what you are asking re: highly expressed  
genes that correlate with high levels of ADDL or Abeta.  I could see  
how you might be able to use GENSAT (which has a 'staining intensity'  
annotation field) to ask whether genes associated with high levels of  
specific ADDL species or with plaque deposition are expressed at high  
levels in the GENSAT data set - and if so - where are they expressed  
in the brain - and at what developmental time.  Given the sparse  
nature of the GENSAT data set, this would not be a comprehensive  
answer to the question, but it could prove very interesting.  I'm  
certain June, Gwen, or Elisabeth could help us identify genes whose  
expression correlates with high levels of ADDL species (most  
interesting question given current AD research) or with other APP  
related macromolecules or plaques.  I'm not certain how you'd ask the  
same question of ABA, given there are not systematic annotations on  
staining intensity or pattern - though some of this has been done  
(see below).

Cheers,
Bill
	


On Mar 3, 2007, at 8:01 PM, kc28 wrote:

>
> Alan et al.,
>
> In addition to mapping to brain regions, what seems to be also  
> missing is some kind of brain coordinates. I thought one major  
> advantage of using Google Map is the ability to issue GIS-like  
> queries. With this type of query, one can potentially query  
> something like finding expressed genes for a given brain region and  
> its neighbouring/adjacent regions.
>
> While we are talking about gene expression, what seems to be also  
> logical to consider is whether some highly expressed genes  
> correlate with high abundance of pathological proteins (e.g.,  
> amyloid beta). Any take from neuroscientists?
>
> -Kei
>
>  Alan Ruttenberg wrote:
>
>>
>> On Mar 2, 2007, at 1:56 PM, Kei Cheung wrote:
>>
>>> By reading the AD/PD use case, one of the questions has to do  
>>> with  what genes are expressed in what regions of the brain (if  
>>> such gene  expressions are localized to certain brain regions). I  
>>> wonder what  Alan's currently working on can help address this  
>>> type of question  (even though the kind of gene expression data  
>>> is for the mouse --  perhaps we can find homologous genes for  
>>> human). Also, I'd  encourage people to take a look at Bill  
>>> Bug's Wiki page:
>>
>>
>> What I can do is add an orthology mapping. Probably from orthogene.
>>
>> I can also scrape the Allen site for the following query they provide
>>
>> Brain Region (see list below), Expression-level (low/high),
>> Expression-density (low/high), expression pattern
>> (clustered/not clustered). => gene set
>>
>> So this would be 16x2x2x2 = 128 different gene sets.
>>
>> There is also their "Fine structure search" :
>> Fine structure annotation lists are genes that have high  
>> specificity  expression in particular brain regions or nuclei.
>>
>> They provide these gene lists for a set of structures listed  
>> below  (fine structures).
>>
>> This can lead us to a particular image, though I don't have a way  
>> yet  to identify which portion of the image corresponds to a  
>> particular  region or structure.
>>
>
>

Bill Bug
Senior Research Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu

Received on Sunday, 4 March 2007 04:21:32 UTC