TCGA / Microscopy Imaging Use Case

Studying the morphology of disease at the cellular and sub-cellular levels
using high-resolution tissue images is extremely important for understanding
the nature of various cancers.  The Cancer Genome Atlas (TCGA) contains over
32,000 de-identified whole-slide microscopy images (WSI) of over two dozen
cancer types.  Each image can contain between 100K and 1M nuclei.
Biomedical informatics researchers have developed (and continue to develop)
software to automatically segment nuclei for study.  The spatial features of each
nucleus, and of groups of nuclei as they relate to other nuclei, combined with
other linked data such as other morphological features (crypts, ducts, etc.)
and/or patient lab results, are used in analyzing and categorizing tissues
and patients into groups, and in comparing such groupings to understand
disease mechanisms in a particular cancer type as well as across cancer
types.

Nuclear segmentations are often represented with binary masks or with
polygon representations (e.g., Well Known Text (WKT)), leveraging work
from the geospatial community.  However, in the case of nuclear
segmentations, coordinate systems are 2D and 3D Cartesian rather than
geographic.  Although the majority of work in this area is 2D-based, a
growing segment of microscopy is 3D-based as the technology develops and
becomes more sophisticated.  As living tissue can change over time through
growth, infection, cancer, damage, etc. (as can the various properties of
its associated organism), it is important that the spatial locations of
features such as nuclear segmentations also be represented with a temporal
aspect for proper comparisons.
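
To make the polygon representation concrete, here is a minimal Python
sketch of how one nucleus outline in 2D Cartesian (pixel) coordinates might
be serialized as WKT and carried together with an acquisition time.  The
class name NucleusSegmentation, its fields, and the sample coordinates are
hypothetical and for illustration only; they do not come from TCGA or from
any existing segmentation tool.  WKT itself has no time component, so
pairing the polygon with an ISO 8601 timestamp is an assumption of this
sketch, not an established convention.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class NucleusSegmentation:
    """Hypothetical record: one nucleus outline plus when it was imaged."""

    boundary: List[Point]  # polygon vertices in order, first vertex not repeated
    acquired: str          # ISO 8601 acquisition time (assumed field)

    def to_wkt(self) -> str:
        """Serialize the outline as a WKT POLYGON with a closed ring."""
        ring = self.boundary + [self.boundary[0]]  # WKT rings repeat the first point
        coords = ", ".join(f"{x:g} {y:g}" for x, y in ring)
        return f"POLYGON (({coords}))"

    def area(self) -> float:
        """Area in squared coordinate units, computed with the shoelace formula."""
        total = 0.0
        count = len(self.boundary)
        for i in range(count):
            x1, y1 = self.boundary[i]
            x2, y2 = self.boundary[(i + 1) % count]
            total += x1 * y2 - x2 * y1
        return abs(total) / 2.0


# Illustrative values only: a 4 x 3 rectangle standing in for a nucleus outline.
nucleus = NucleusSegmentation(
    boundary=[(0, 0), (4, 0), (4, 3), (0, 3)],
    acquired="2015-03-03T20:52:45Z",
)
print(nucleus.to_wkt())  # POLYGON ((0 0, 4 0, 4 3, 0 3, 0 0))
print(nucleus.area())    # 12.0
```

Because the area is taken as an absolute value, the result does not depend
on whether the vertices are listed clockwise or counter-clockwise, which
matters for image coordinates where the y axis typically points downward.
A 3D variant would need a different geometry type and is not covered here.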

Samples of TCGA WSI data can be viewed at:

Erich Bremer, M.Sc.
Director for Cyberinfrastructure
Health Sciences Division of Applied Informatics
Stony Brook Medicine
Tel. : 1-631-444-3560
Fax  : 1-631-444-8873
Cell : 1-631-681-6228
Office Location/Mailing Address
HSC, L3: Room 119
Stony Brook, NY 11794-8330

Received on Tuesday, 3 March 2015 20:52:45 UTC