- From: Talapady Bhat <talapady.bhat@nist.gov>
- Date: Fri, 12 Sep 2008 09:03:59 -0400
- To: "eric neumann" <ekneumann@gmail.com>, "Amit Sheth" <amitpsheth@gmail.com>
- Cc: "Kei Cheung" <kei.cheung@yale.edu>, "Peter Ansell" <ansell.peter@gmail.com>, "w3c semweb hcls" <public-semweb-lifesci@w3.org>, "Talapady N Bhat" <bhat@nist.gov>
- Message-ID: <017d01c914d8$05a5e760$83c20681@campus.nist.gov>
Hi,

I fully agree with Eric. One of the solutions (at least in part) is a 'use case' based ontology built over SW/RDF data, with features that are fine-tuned to R&D applications. An example is our HIV Structural Database, http://bioinfo.nist.gov/SemanticWeb_int2d/chemblast.do, which provides a complete set of pre-clinical data for AIDS research, built over RDF concepts with structural data as the starting point. By this process, the data independently maintained by NIAID and NCBI are integrated with data from our database.

Recently, we have extended this work to all the entries in the Protein Data Bank (http://xpdb.nist.gov/pdb/chemblast.html). This interface acts as a ligand gateway for the inhibitor data held in the PDB; inhibitor data is a major R&D component of the PDB. A user can query our Web site, examine the inhibitor structure that resulted from the query, and also review the basic information to get an idea of the function of the protein to which it binds. If it is the target of interest, the user can then use the hyperlink to connect to the PDB for additional information. We also group the PDB entries by special topics (such as biofuels, SARS, cancer, ...) that are of general R&D interest.

An additional novelty of the method is that all queries on the structures are done using images. The use of molecular images greatly simplifies a query by circumventing the use of complicated names of chemical structures. The different RDF-based connections between data elements allow users to formulate individual questions of their choice by providing specific branching points at each RDF junction. Thus a user can compose his or her own questions while building a query in a couple of steps.

I want to add that the work on the PDB data is still in the developmental stage, and at this time it is only a demonstration model for the concept. I have talked to some key people at the PDB, and they seem to be very interested in this work.
Some of the DOE staff are also interested in this approach as a means to integrate structural data across many databases. This PDB work is, in part, a follow-up to the discussion that Eric, Phil Bourne, and I had in Hawaii in January 2008.

Cheers,
T N Bhat

----- Original Message -----
From: eric neumann
To: Amit Sheth
Cc: Kei Cheung ; Peter Ansell ; w3c semweb hcls
Sent: Thursday, September 11, 2008 7:39 PM
Subject: Re: An application of the Semantic Web for finding alternative drug applications

Amit,

There is a large class of data discovery problems that cannot be solved via (SPARQL) query or even inferencing, simply because we don't know what precise questions to ask in advance, though we may already have enough evidence at hand (stored). These kinds of problems (limited associative models) lend themselves more to applications of large-scale statistical mining and Bayesian modeling... However, the latter tools are usually applied when one is looking at 1-4 parameters at a time (e.g., incidence of a disease is dependent on genetic factors, age, and diet). Now, with the possibility of having hundreds of such different attributes available semantically, statistical approaches will have to be augmented greatly.

SPARQL is an essential component for accessing/constraining SW data, but by itself it is insufficient for the discovery of new associations and mechanisms in whole biological systems. Many of the major challenges within pharma R&D are precisely of this type!

cheers,
Eric

On Thu, Sep 11, 2008 at 1:24 PM, Amit Sheth <amitpsheth@gmail.com> wrote:

Finding "potentially interesting" paths, subgraphs, and patterns in Semantic Web data (e.g., those created from complex entity and relationship extraction from biomedical literature [1], semantic annotation and provenance of experimental data, and of course structured databases) is very useful in biomedical research and requires SPARQL extensions.
One of several examples along this line is the support for path queries, as in SPARQ2L [2]. Other interesting examples are support for spatio-temporal thematic queries and corresponding extensions such as SPARQ-ST [3], although we have not applied these extensions to sensor data so far, and not (yet) to the biomedical domain.

Amit

[1] http://knoesis.wright.edu/research/semweb/projects/textMining/ekaw2008/
[2] http://knoesis.wright.edu/library/resource.php?id=00060
[3] http://knoesis.org/research/semweb/projects/stt/

On Thu, Sep 11, 2008 at 10:50 AM, Kei Cheung <kei.cheung@yale.edu> wrote:

Peter Ansell wrote:

----- "Kei Cheung" <kei.cheung@yale.edu> wrote:
From: "Kei Cheung" <kei.cheung@yale.edu>
To: "eric neumann" <ekneumann@gmail.com>
Cc: "w3c semweb hcls" <public-semweb-lifesci@w3.org>
Sent: Thursday, September 11, 2008 6:42:33 AM GMT +10:00 Brisbane
Subject: Re: An application of the Semantic Web for finding alternative drug applications

Thanks for sharing the papers, Eric. I went through some of the papers, including the one you mentioned (interestingly, there is a paper on wiki). I think they're interesting. They reminded me of "mining for the semantic web" (ontology learning?) and "mining from the semantic web" (data mining). For biological networks, we need to do both semantic and topological queries. It might be difficult to achieve the latter using SPARQL (e.g., finding protein hubs). Maybe we need some extensions of SPARQL.

Best,
-Kei

What are the limits to what you can do with bare SPARQL in this area? Does it help to have elementary rdfs subclass knowledge for the topological parts?

Cheers,
Peter

Hi Peter,

When YeastHub [1] was being built, I was wondering whether Semantic Web (SW) technologies could help facilitate integrative biological network analysis, including network topology. Later, a web-based tool called "tYNA" was created and published [2], which supports biological network analysis/visualization.
tYNA was not implemented using SW, but I still wonder how some of its features can be implemented using SW.

[1] http://bioinformatics.oxfordjournals.org/cgi/reprint/21/suppl_1/i85
[2] http://bioinformatics.oxfordjournals.org/cgi/content/full/22/23/2968

Cheers,
-Kei

--
Amit Sheth
http://knoesis.org
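
To make the "topological query" mentioned in the thread (finding protein hubs) concrete, here is a minimal sketch in Python. It is only an illustration, not how tYNA or YeastHub work: the protein names, the `interactsWith` predicate, and the `find_hubs` helper are all hypothetical, and a "hub" is assumed here to mean simply a node with at least `min_degree` distinct interaction partners.

```python
from collections import defaultdict

# Hypothetical RDF-style triples (subject, predicate, object).
# The protein identifiers and the "interactsWith" predicate are
# illustrative assumptions, not taken from any real dataset.
TRIPLES = [
    ("P1", "interactsWith", "P2"),
    ("P1", "interactsWith", "P3"),
    ("P1", "interactsWith", "P4"),
    ("P2", "interactsWith", "P3"),
    ("P5", "interactsWith", "P1"),
    ("P5", "type", "Protein"),  # non-interaction triple, ignored below
]


def find_hubs(triples, predicate="interactsWith", min_degree=3):
    """Return {node: degree} for nodes with at least min_degree partners.

    Interactions are treated as undirected; self-loops and triples
    with other predicates are skipped.
    """
    neighbours = defaultdict(set)
    for subject, pred, obj in triples:
        if pred != predicate or subject == obj:
            continue
        neighbours[subject].add(obj)
        neighbours[obj].add(subject)
    return {
        node: len(partners)
        for node, partners in neighbours.items()
        if len(partners) >= min_degree
    }


hubs = find_hubs(TRIPLES)
print(hubs)
```

The point of the sketch is that the semantic step (selecting triples by predicate) and the topological step (counting distinct neighbours per node) are separate: the first is the kind of pattern matching a query language handles naturally, while the second is a graph computation performed on the query results.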
Received on Friday, 12 September 2008 13:05:01 UTC