- From: Marco Brandizi <brandizi@ebi.ac.uk>
- Date: Sun, 10 Sep 2006 13:21:29 +0100
- CC: public-semweb-lifesci@w3.org
Hi all,

First, thank you all very much for all the interesting replies. I am not sure I understand all of them, anyway...

> What you are describing is described in MAGE-OM/MAGE-ML, as a UML
> model to capture the real world aspects of running a microarray
> experiment.
>
> Typically at the end of this process a set of genes is identified as
> being interesting for some reason and one wants to know more about
> this set of genes beyond the microarray experiment that has been
> performed.
>
> I might be wrong but I think that is where Marco is starting, at the
> end of the experiment for follow-up.
>

Yes, I am trying to represent the experimental activity linked to microarray experiments, taking into account why the experiments are performed, what hypotheses or conclusions may be derived from data analysis, and the like. So I would like to represent the gene expression domain in a rather abstract fashion. I know there are similar projects I am interested in (http://lists.w3.org/Archives/Public/public-semweb-lifesci/2006Aug/0133)

> So. with reference to this ontology (generated by Marco, or imported
> from some standard) he could simply state:
>
> Individual(c1 type(Computation) value(geneComputedAsExpressed g1)
> value(geneComputedAsExpressed g2) value(geneComputedAsExpressed g3) )
>

These were interesting examples. It seems that I should have some individual (c1) anyway, while avoiding the replication of relations between sets.

> location and intensity of spots). The tendency when presenting these
> results in research articles - and often when sharing the data - is
> to provide the analyzed/reduced view of the data. In the context of
> these complex experiments, many forms of re-analysis will not be
> possible without access to the originally collected data. Think of

Nonetheless, I think it is useful to perform an operation like: "import this list of a few genes and assign it a name, a description, why I am importing it, more formal knowledge about it... etc.".
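As a rough illustration of the import operation described above, the following Python sketch builds a handful of subject/predicate/object triples for a named gene list. It is only a sketch under stated assumptions: the `http://example.org/genelist#` namespace, the `GeneList` class and the `name`, `description`, `importReason` and `hasMember` properties are all hypothetical, and none of them come from MAGE-OM, the MGED ontology or any other published vocabulary.

```python
# Hypothetical namespace and property names, used for illustration only.
EX = "http://example.org/genelist#"


def describe_gene_list(list_id, name, description, reason, genes):
    """Return (subject, predicate, object) triples describing a named gene list."""
    subject = EX + list_id
    triples = [
        (subject, "rdf:type", EX + "GeneList"),
        (subject, EX + "name", name),
        (subject, EX + "description", description),
        (subject, EX + "importReason", reason),
    ]
    # One membership triple per gene: the list stays a single individual,
    # much like c1 in the quoted Individual(...) example.
    triples.extend((subject, EX + "hasMember", EX + gene) for gene in genes)
    return triples


triples = describe_gene_list(
    "list1",
    "Candidate genes",
    "Genes selected after microarray analysis",
    "Interesting for our research group",
    ["g1", "g2", "g3"],
)

for triple in triples:
    print(triple)
```

The gene identifiers g1, g2 and g3 are borrowed from the quoted example; everything else is a placeholder that a real vocabulary would replace.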
The meaning of such lists could be something like: "after weeks of microarray analysis, BLASTing, etc., these genes are interesting for me/our project/our research group". So the reduced lists of microarray items could be useful, and representing them and their meaning with the Semantic Web could be useful as well, notwithstanding the fact that you would still need the whole data set to redo more mathematical analysis. I believe the Semantic Web is interesting in the first case, maybe less interesting in the second one.

Again on this topic:

> The following tools, for example, are available for microarray gene
> annotation.
>
> SOURCE -- http://nar.oxfordjournals.org/cgi/content/full/31/1/219
> KARMA -- http://nar.oxfordjournals.org/cgi/content/full/32/suppl_2/W441
> RESOURCERER -- http://pga.tigr.org/tigr-scripts/magic/r1.pl
> DRAGON -- http://pevsnerlab.kennedykrieger.org/dragon.htm
>
> These tools take a gene list of interest and return annotation
> collected from multiple sources (e.g., gene ontology, UniProt, and
> KEGG). It might be useful if these tools can be made
> semantic-web-aware.

Thanks to Key for reporting these tools, which I didn't know and which I'll take a look at. However, what these tools seem to address is official, stable annotations, available to a wide number of people, with no chance for the user to change or enrich them. Of course tools of this kind are important, but I am thinking of something that could be used within your research group, where one could make claims, rather than definitive assertions, like: "these genes are involved in this disease", or: "these data confirm this pathway, but I am not completely sure, more validation necessary"...

> MAGE-OM/MAGE-ML is also the result of a huge amount of deliberation
> from dozens of experts in the informatics fields involved in
> generating, storing, and manipulating microarray data.
>
> When it comes to manipulating the information associated with a
> microarray experiment - or collection of experiments - in a
> semantically explicitly manner, however, RDF is really the preferred
> formalism providing the required explicit semantics, while still
> providing the expressiveness needed to characterize the inherent
> variety, complexity, and granularity in this information. When it

This has already been addressed here (http://lists.w3.org/Archives/Public/public-semweb-lifesci/2006Jun/0098). I wonder whether the level of detail that MAGE-OM is able to handle may efficiently be translated into RDF or, worse, into OWL. Besides, I wonder whether this would be something interesting to do. Basically:

- MAGE-OM is good enough for representing how a microarray experiment has been done (the experimental design) and which raw or normalized data it has produced. It has some limits due to the fact that it is an object model, but it may be coupled with the MGED ontology or FUGO to cope with that.

- MAGE-OM doesn't cover much of the follow-up of experimentation, nor higher levels of abstraction, i.e.: hypotheses and conclusions, who is studying some disease, or gene behaviour in a given cell type, etc.

moreover...

> bringing in information and data from a vast number of resources and
> tying it together into big pictures, all without the semantic web.
> I'm sure they would love to have the kind of power envisioned by the
> W3C for the semantic web but they won't touch it until it is
> easy--they are busy doing their core jobs.

- ...how scalable are RDF and OWL? Let's take a small data set of 100 microarray experiments, each with 10k probe sets x 10 hybridizations. We would have (at least) 10 million numbers to handle, plus several annotations, plus inference, etc. An RDF backend that directly maps SQL to RDF should still work out, but what about an in-memory OWL reasoner?
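The figure of 10 million numbers follows directly from the sizes given above, and a back-of-the-envelope calculation makes the concern concrete. In the Python sketch below, the number of triples per measured value is purely an assumption chosen for illustration (one triple each for the value itself and for its links to a probe set, a hybridization and an experiment). It is not a property of any particular RDF mapping of MAGE-OM.

```python
# Sizes taken from the example data set described above.
experiments = 100
probe_sets = 10_000
hybridizations = 10

# One number per (experiment, probe set, hybridization) combination.
values = experiments * probe_sets * hybridizations

# ASSUMPTION (illustrative only): each value costs about four triples,
# i.e. the value itself plus links to probe set, hybridization, experiment.
triples_per_value = 4
estimated_triples = values * triples_per_value

print(f"measured values:   {values:,}")
print(f"estimated triples: {estimated_triples:,}")
```

Under that assumption the raw data alone would reach tens of millions of triples, before any annotations or inferred statements are added.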
And what about integrating larger amounts of microarray data, crawled from different sources on the web (which should be a goal of the Semantic Web)? That's another point about the current state of the art of the Semantic Web: I am not an expert, but aren't we still missing some needed features? Like: efficient RDF handling, SQL mapping, federated data stores, distributed reasoning... Relational theory is less expressive, but for the moment relational databases, having been around for ages, seem more reliable and efficient.

Sorry for the length of the reply...

Cheers.

--
===============================================================================
Marco Brandizi <brandizi@ebi.ac.uk>
http://gca.btbs.unimib.it/brandizi
Received on Sunday, 10 September 2006 12:21:51 UTC