Re: [HCLSIG] homologs for yeast proteins? from Pat Hayes on 2006-02-01 (public-semweb-lifesci@w3.org from February 2006)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 1 Feb 2006 12:41:00 -0600
To: Eric Miller <em@w3.org>
Cc: chris mungall <cjm@fruitfly.org>, Alf Eaton <lists@hubmed.org>, public-semweb-lifesci@w3.org
Message-Id: <p06230915c006ade2dfb3@[10.100.0.23]>

>On Jan 31, 2006, at 5:05 PM, chris mungall wrote:
>
>>
>>On Jan 31, 2006, at 1:46 PM, Eric Miller wrote:
>>
>>>
>>>
>>>On Jan 31, 2006, at 3:56 PM, Alf Eaton wrote:
>>>
>>>>
>>>>Eric Jain wrote:
>>>>>Joanne Luciano wrote:
>>>>>>An MD I met with last week mentioned briefly the desire to 
>>>>>>obtain homologs
>>>>>>for yeast proteins with the GO ID 0031930.  With it was a 
>>>>>>request for other
>>>>>>suggestions for querying proteins with this ontology assignment 
>>>>>>(looking for
>>>>>>mammalian homologs).
>>>>>>
>>>>>>Can the semantic web help with this or is it already basic and solved, in
>>>>>>which case, can someone point me or fill in the details?  Where 
>>>>>>should I go?
>>>>>>What questions should I ask?
>>>>>Semantic web technologies can help make certain things simpler 
>>>>>to implement -- and therefore allow things to be implemented 
>>>>>that would not have been feasible otherwise. But tasks such as 
>>>>>listing all proteins for an organism that have been tagged with 
>>>>>a specific term are simple enough to be supported even without 
>>>>>any semantic web magic, see http://pir.uniprot.org/, for 
>>>>>example. This site also allows you to run similarity searches on 
>>>>>the matching entries to find homologs.
>>>>>Now in this particular case it looks like you are out of luck: 
>>>>>No yeast proteins have been associated with GO:0031930. Often 
>>>>>the main limiting factor is not the tools, but the available 
>>>>>data...
>>>>
>>>>There are S. cerevisiae proteins assigned to GO:0031930 though, 
>>>>you can see them here: <http://tinyurl.com/d2dm9>
>>>
>>>I started writing a piggy bank [1] scraper using solvent [2] to 
>>>make explicit the data that's implied in the HTML tables when I 
>>>noticed the XML link [3] at the bottom of the page.
>>
>>surely screen scraping should be a last resort, such as when the 
>>underlying database has no APIs or download option
>
>Agreed!
>
>>all the data underlying amigo is available in a number of hopefully 
>>more sensible ways. See http://www.godatabase.org/dev
>
>Good to know. Do you think people just "know" that the data 
>underlying amigo is available? The link is helpful at the bottom of 
>the page, but a *consistent* convention this community could adopt 
>to clearly associate the data corresponding to the HTML page would 
>be extremely valuable.
>
>More specifically, I'd suggest for this page, the following template be used
>
><link rel="meta" type="application/rdf+xml" title="Data for 
>GO:0031930" 
>href="http://www.godatabase.org/cgi-bin/amigo/go.cgi?format=xml&view=details&search_constraint=terms&depth=0&session_id=412b1138742957&query=GO:0031930" 
>/>
>
>and help define such a consistent linking mechanism.

Seems like RDF/A should be relevant here. It will be a draft 
recommendation very soon; time to get ahead of the game? See 
http://www.w3.org/2001/sw/BestPractices/HTML/2005-rdfa-syntax
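
For comparison, the <link rel="meta"> template quoted above is already 
easy for a client to consume. Below is a rough sketch in Python of how a 
tool might discover such links in a page, using only the standard 
library. The names MetaLinkFinder, find_rdf_links and SAMPLE_PAGE, and 
the example.org URL, are made up for illustration; they are not part of 
amigo or of any proposed convention.

```python
from html.parser import HTMLParser


class MetaLinkFinder(HTMLParser):
    """Collect <link rel="meta" type="application/rdf+xml"> elements."""

    def __init__(self):
        super().__init__()
        self.links = []  # (href, title) tuples, in document order

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />.
        if tag != "link":
            return
        a = dict(attrs)
        rels = (a.get("rel") or "").lower().split()
        mime = (a.get("type") or "").lower()
        if "meta" in rels and mime == "application/rdf+xml" and a.get("href"):
            self.links.append((a["href"], a.get("title")))


def find_rdf_links(html_text):
    """Return the (href, title) pairs advertised by an HTML page."""
    finder = MetaLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.links


# Hypothetical page; the href is a placeholder, not a real amigo URL.
SAMPLE_PAGE = """<html><head>
<title>GO:0031930</title>
<link rel="stylesheet" type="text/css" href="style.css" />
<link rel="meta" type="application/rdf+xml" title="Data for GO:0031930"
      href="http://example.org/data?format=xml&amp;query=GO:0031930" />
</head><body></body></html>"""

if __name__ == "__main__":
    for href, title in find_rdf_links(SAMPLE_PAGE):
        print(title, "->", href)
```

The stylesheet link is skipped because its rel and type do not match, so 
only the data link is reported. A real client would still have to fetch 
the href and parse whatever comes back.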

Pat


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Wednesday, 1 February 2006 18:41:09 UTC