
Re: [HCLSIG] homologs for yeast proteins?

From: Eric Miller <em@w3.org>
Date: Tue, 31 Jan 2006 17:41:51 -0500
Message-Id: <BFBCD5E3-8656-4131-9CEE-3F5677188FDA@w3.org>
Cc: Alf Eaton <lists@hubmed.org>, public-semweb-lifesci@w3.org
To: chris mungall <cjm@fruitfly.org>

On Jan 31, 2006, at 5:05 PM, chris mungall wrote:

> On Jan 31, 2006, at 1:46 PM, Eric Miller wrote:
>> On Jan 31, 2006, at 3:56 PM, Alf Eaton wrote:
>>> Eric Jain wrote:
>>>> Joanne Luciano wrote:
>>>>> An MD I met with last week mentioned briefly the desire to  
>>>>> obtain homologs
>>>>> for yeast proteins with the GO ID 0031930.  With it was a  
>>>>> request for other
>>>>> suggestions for querying proteins with this ontology assignment  
>>>>> (looking for
>>>>> mammalian homologs).
>>>>> Can the semantic web help with this or is it already basic and  
>>>>> solved, in
>>>>> which case, can someone point me or fill in the details?  Where  
>>>>> should I go?
>>>>> What questions should I ask?
>>>> Semantic web technologies can help make certain things simpler  
>>>> to implement -- and therefore allow things to be implemented  
>>>> that would not have been feasible otherwise. But tasks such as  
>>>> listing all proteins for an organism that have been tagged with  
>>>> a specific term are simple enough to be supported even without  
>>>> any semantic web magic, see http://pir.uniprot.org/, for  
>>>> example. This site also allows you to run similarity searches on  
>>>> the matching entries to find homologs.
>>>> Now in this particular case it looks like you are out of luck:  
>>>> No yeast proteins have been associated with GO:0031930. Often  
>>>> the main limiting factor is not the tools, but the available  
>>>> data...
>>> There are S. cerevisiae proteins assigned to GO:0031930 though,  
>>> you can see them here: <http://tinyurl.com/d2dm9>
>> I started writing a piggy bank [1] scraper using solvent [2] to  
>> make explicit the data that's implied in the HTML tables when I  
>> noticed the XML link [3] at the bottom of the page.
> surely screen scraping should be a last resort, such as when the  
> underlying database has no APIs or download option


> all the data underlying amigo is available in a number of hopefully  
> more sensible ways. See http://www.godatabase.org/dev

Good to know. Do you think people just "know" that the underlying  
AmiGO data is available? The link at the bottom of the page is  
helpful, but a *consistent* convention this community could adopt to  
clearly associate the data with its corresponding HTML page would be  
extremely valuable.

More specifically, I'd suggest the following template be used for  
this page

<link rel="meta" type="application/rdf+xml" title="Data for GO:0031930"
      href="http://www.godatabase.org/cgi-bin/amigo/go.cgi?format=xml&view=details&search_constraint=terms&depth=0&session_id=412b1138742957&query=GO:0031930" />

and help define such a consistent linking mechanism.
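As a rough illustration of how a consumer might use this convention, here is a minimal Python sketch that autodiscovers such a data link from a page's head. The page markup and URL below are hypothetical examples of the proposed pattern, not output from any existing site or tool:

```python
from html.parser import HTMLParser

class MetaLinkFinder(HTMLParser):
    """Collect href values from <link rel="meta" type="application/rdf+xml"> tags."""
    def __init__(self):
        super().__init__()
        self.data_links = []

    def handle_starttag(self, tag, attrs):
        # html.parser also routes self-closing <link ... /> tags here.
        if tag != "link":
            return
        a = dict(attrs)
        if a.get("rel") == "meta" and a.get("type") == "application/rdf+xml":
            self.data_links.append(a.get("href"))

# Hypothetical page source following the proposed convention
# (the URL is illustrative, not a real AmiGO endpoint):
page = '''<html><head><title>GO:0031930</title>
<link rel="meta" type="application/rdf+xml" title="Data for GO:0031930"
      href="http://www.godatabase.org/cgi-bin/amigo/go.cgi?format=xml&amp;query=GO:0031930" />
</head><body>...</body></html>'''

finder = MetaLinkFinder()
finder.feed(page)
print(finder.data_links)
```

With a convention like this in place, any agent (a browser extension such as Piggy Bank, a crawler, or a script) could locate the "raw" data for a page without screen scraping.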

Note: This pattern is not meant to preclude the useful, continued  
development that folks like the GO Consortium are providing via APIs  
and/or download options, but rather to provide an additional,  
consistent mechanism for associating the "raw" data with a  
corresponding page.

>> Turns out this "XML" link points to what looks like RDF/XML (but  
>> not quite). Anyone know who to contact to get this corrected?  
>> Bonus points if you're willing to own this action item and be the  
>> person to make the contact :)
>> [1] http://simile.mit.edu/piggy-bank/
>> [2] http://simile.mit.edu/solvent/
>> [3] http://www.godatabase.org/cgi-bin/amigo/go.cgi?format=xml&view=details&search_constraint=terms&depth=0&session_id=412b1138742957&query=GO:0031930
> I'll take care of this, on the TODO list anyway
> We'll also be adding an OWL option sometime

Excellent! Thanks!

eric miller                              http://www.w3.org/people/em/
semantic web activity lead               http://www.w3.org/2001/sw/
w3c world wide web consortium            http://www.w3.org/