Re: BioRDF [Telcon]: slides for the UMLS presentation from William Bug on 2006-06-09 (public-semweb-lifesci@w3.org from June 2006)

From: William Bug <William.Bug@DrexelMed.edu>
Date: Fri, 9 Jun 2006 17:16:21 -0400
To: Marco Brandizi <brandizi@ebi.ac.uk>
Cc: public-semweb-lifesci@w3.org
Message-Id: <D9B0FA0B-3BD1-4030-97BD-D31A863D1735@DrexelMed.edu>
I think Marco makes several very important points - some of which
relate to the follow-up comments Michael has posted in response.

I thought it might be worth highlighting them again - and bringing
them up for discussion in the call Michael will be giving next week.

On Jun 9, 2006, at 11:39 AM, Marco Brandizi wrote:

>
> Hi all,
>
> some notes on the discussion of MAGE/FUGE/FUGO and RDF.
>
> - I think that converting the whole MAGE or FUGE into RDF is hard  
> in practice and maybe not so useful. The problems are:
>
>   - FUGE and MAGE are object models and should be reviewed in order
> to provide an OWL modelling. For instance the use of OntologyEntry
> as a pointer to an ontology term doesn't make much sense in OWL.

Exactly - yes yes yes.  It is essentially pointless to have pointers  
to ontologies in OWL.  You have URI pointers to specific ontology  
branch nodes filling specific slots in a given OWL representation of  
some specific type of data.
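
To make the contrast concrete, here is a rough sketch in Python (standard library only). It is not taken from MAGE, FUGE, or any real ontology: the namespaces, the property names (hasOntologyEntry, category, value, studiesDisease), and the HepaticCancer term are all hypothetical, chosen only for illustration. The first function mimics an OntologyEntry-like wrapper node; the second fills the slot directly with the term's URI.

```python
# Hypothetical namespaces and property names, for illustration only.
EX = "http://example.org/mage/"
ONT = "http://example.org/ontology/"


def ontology_entry_style(subject, category, value):
    """Object-model style: an intermediate OntologyEntry-like node holding strings."""
    entry = subject + "/ontologyEntry"
    return [
        (subject, EX + "hasOntologyEntry", entry),
        (entry, EX + "category", '"' + category + '"'),
        (entry, EX + "value", '"' + value + '"'),
    ]


def direct_uri_style(subject, slot, term_uri):
    """RDF/OWL style: the slot is filled directly by the ontology term's URI."""
    return [(subject, slot, term_uri)]


def to_ntriples(triples):
    """Serialize triples as N-Triples lines; terms starting with a quote are literals."""
    def term(t):
        return t if t.startswith('"') else "<" + t + ">"
    return "\n".join(
        term(s) + " " + term(p) + " " + term(o) + " ." for s, p, o in triples
    )


experiment = EX + "experiment1"
indirect = ontology_entry_style(experiment, "Disease", "hepatic cancer")
direct = direct_uri_style(experiment, EX + "studiesDisease", ONT + "HepaticCancer")

print(to_ntriples(indirect))  # three triples, with the term buried in string literals
print(to_ntriples(direct))    # one triple, with the term as a resolvable URI
```

In the direct form, the object is itself the ontology node, so a reasoner or query engine can follow it without unpacking a wrapper.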

>
>   - FUGE/MAGE are used to represent huge quantities of data (an  
> average MAGE-ML file is sized some hundreds MBs) and I am not sure  
> that current technologies would support such requirement.

As Michael points out, there is a lot of redundant data expressed in
a FUGE/MAGE instance.  Most of this should not be directly
encapsulated in an RDF/OWL representation of a MAGE-ML instance, but
should rather be given as URI pointers to specific ontology nodes
(see comment above).  This should be all the description required in
an INSTANCE of a particular chip result.  Having said that, as I
mentioned in my post to the SciPub Task Force thread a little while
ago, the full detail will be critical where the goal is:

"..to perform large-scale, data integration and meta-analysis on data  
derived from disparate studies (as BLAST and HMM gene finding  
algorithms can do with genomic sequence data).  One will often have  
to go right back to the primary data - and have complete, formal  
descriptions of data acquisition provenance and all the processing  
done on the data prior to any significant reduction/analysis.  This  
is certainly true both for neuroimaging data sets derived from all  
imaging modalities used in neuroscience, as well as microarray data  
(as we find ourselves dealing with in the BIRN project)."

>
> - Maybe only some aspects of a Functional Genomics models are  
> really needed in the context of Semantic Web. For instance telling  
> in RDF that an experiment has been performed to study a given  
> disease would be useful, telling to the whole web the concentration  
> value of the application of an extraction protocol maybe is more  
> implementation specific.

If the goal of the semantic web were to serve end users only, I would
agree, Marco.  Since the data we express via SW technologies will also
be used for "large-scale, data integration and meta-analysis on data
derived from disparate studies," and since to perform such studies,
these repositories need to be "open," I think there will be a need to
have this level of detail available on the web in general.  Having
said that, it makes sense to want to have specific XSLT or Fresnel
(http://www.w3.org/2005/04/fresnel-info/) views of the data available
for different uses - one being effective presentation to end users
interested in just finding out high-level info such as "this
particular collection of experiments derived from many different
studies is related to hepatic cancer and COX-2 and/or ROBO1 gene
expression."
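
As a rough illustration of the "view" idea, the Python sketch below keeps the full description available but shows an end user only a chosen set of properties. This is only loosely analogous to what an XSLT stylesheet or a Fresnel lens would do, and it uses neither. Every name in it is an assumption made for the example: the namespaces, the property names, the select_view helper, and the placeholder literal.

```python
# Hypothetical namespaces, properties, and values, for illustration only.
EX = "http://example.org/mage/"
ONT = "http://example.org/ontology/"

# The full description stays on the web for meta-analysis.
full_description = [
    (EX + "experiment1", EX + "studiesDisease", ONT + "HepaticCancer"),
    (EX + "experiment1", EX + "measuresGene", ONT + "COX-2"),
    (EX + "experiment1", EX + "extractionConcentration", '"placeholder value"'),
]

# Properties an end user interested in high-level info would want to see.
END_USER_PROPERTIES = {EX + "studiesDisease", EX + "measuresGene"}


def select_view(triples, shown_properties):
    """Return only the triples whose predicate is in the set of shown properties."""
    return [t for t in triples if t[1] in shown_properties]


end_user_view = select_view(full_description, END_USER_PROPERTIES)

for subject, predicate, obj in end_user_view:
    print(subject, predicate, obj)
```

The protocol-level detail is not deleted; it is simply left out of this particular view and remains available to anyone doing integration work.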

>
> - I am modelling something about microarrays, although my intent is  
> not to convert MAGE and to face its degree of details. I am more  
> interested in a less detailed knowledge representation about  
> Microarrays, and in the management of the knowledge that is  
> achieved from the study of Gene Expression.
>
> Here is an introduction to that:
> http://gca.btbs.unimib.it/brandizi/mysite/phdintro
>
> My latest version of the ontology (very draft actually), plus some  
> notes about the user interface I am developing:
>
>   http://gca.btbs.unimib.it/brandizi/mysite/phdv1

Many thanks for the reference, Marco.

>
>
> Cheers.
>
> -- 
>
> ===============================================================================
> Marco Brandizi <brandizi@ebi.ac.uk>
> http://gca.btbs.unimib.it/brandizi
>
>

Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu

Received on Friday, 9 June 2006 21:16:38 UTC