- From: Alan Ruttenberg <alanruttenberg@gmail.com>
- Date: Thu, 4 May 2006 01:29:42 -0400
- To: public-semweb-lifesci@w3.org
- Cc: Eric Jain <Eric.Jain@isb-sib.ch>
Here is a typical problem that arises when translating databases. I'm reviewing the species that are mentioned as being reactive to various of the antibodies in the Alzforum database. One of them is mussels.

Q: what should this resolve to?

I'm looking at the NCBI taxonomy. There are a bunch of mussels and they are not neatly organized. You can see a number of them at http://www.ebi.ac.uk/newt/display (search for "mussel"). Or you can search for them at NCBI: enter "*mussel*" and select "wild card" from the search pop-up.

Here's a list, after I reviewed them quickly to remove things that obviously were not mussels (like viruses etc.):

Hemistena lata [genbank common name: cracking pearlymussel]
Dromus dromas [genbank common name: dromedary pearlymussel]
Lithophaga nigra [genbank common name: black datemussel]
Musculus discors [genbank common name: discordant mussel]
Hormomya domingensis [genbank common name: Santo Domingo mussel]
Modiolus americanus [genbank common name: American horsemussel]
Musculus lateralis [genbank common name: lateral mussel]
Mytilus californianus [genbank common name: California mussel]
Mytilus edulis [common name: blue mussel]
Mytilus trossulus [genbank common name: common blue mussel]
Sparodon durbanensis [common name: musselcracker seabream]
Lithophaga [genbank common name: date mussels]
Modiolus [genbank common name: horsemussels]
Mytilus galloprovincialis [genbank common name: Mediterranean mussel]
Geukensia demissa [genbank common name: ribbed mussel]
Perna canaliculus [genbank common name: greenshell mussel]
Hormomya exustus [genbank common name: scorched mussel]
Modiolus modiolus [genbank common name: northern horsemussel]
Dreissena bugensis [genbank common name: quagga mussel]
Dreissena polymorpha [common name: zebra mussel]
Mytilopsis leucophaeata [genbank common name: dark false mussel]
Margaritifera margaritifera [genbank common name: freshwater pearlshell mussel]
Lexingtonia dolabelloides [common name: slabside pearlymussel]
Lemiox rimosus [common name: birdwing pearly mussel]
Popenaias popeii [common name: Texas hornshell mussel]
Strophitus subvexus [common name: Southern creekmussel]
Venustaconcha pleasii [common name: bleedingtooth mussel]

So there is no single class that will capture all these. Also, most likely there is more information that could be gathered from the curators that might narrow this down, but that's certainly impractical when automatically translating, and probably impractical even when manually annotating.

I can do a further narrowing of the above set by selecting only those which are mollusks (though neither NEWT nor the NCBI search box seems to provide this ability).

I am inclined to create a class which is the union of all these classes and then annotate the antibody with that class.

Anyway, advice about how to handle this case would, I think, carry over to many other situations we will hit.

-Alan

ps. for your amusement, here are (most) of the rest of the organisms named in this database. Modeling of the terms marked "various" is another issue that I need to deal with.

sheep dog yeast mink mammal (various) frog clam sea star drosophila baboon mussel coral goat fish (various) zebra finch macaque Dictyostelium S. pombe kangaroo Arabidopsis microorganism (various) E. Coli marsupial allium eukaryote slime molds plant (various) nematode invertebrate (various) Ascidian Nicotiana tabacum trypanosome syrian hamster Chlamydomonas reinhardtii insect myxomycetes Parasites ox alligator tortoise opossum reptile (various) salamander donkey cynomolgus non-specific polar bear torpedo eel electrophorus turtle bear ram fetal calf rhesus dolphin guinea dog snail octopus calf whale seal lamprey squid viper raccoon mustelid plasmodium myxamoeba eukaryote african green monkey kangaroo rat protozoa corn (zea mays) Pisum sativum wheat moth Chlamydomonas Fungi dinoflagellate S. cerevisiae Arabidopsis thaliana equine various animal chinese hamster bat amphibian (various) sea urchin goldfish rodent (various) ferret lizard gerbil avian (various) pufferfish chimpanzee guinea pig carp primate (various) ovine various species C. elegans various vertebrate hamster zebra fish monkey bovine swine horse rabbit porcine cat cow quail chicken pig xenopus human rat mouse
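
As a rough illustration of the approach proposed in this message (narrow the candidates to mollusks, then define one class as the union of the remaining taxon classes), here is a minimal Python sketch that emits the union as Turtle. Everything specific in it is an assumption made for the example: the `http://example.org/taxon/` namespace, the `Mussel` class name, the helper function names, and the hand-entered phylum column, which stands in for a real lineage lookup against the taxonomy. Only four of the candidates are shown.

```python
# Hypothetical namespace; a real translation would use the taxonomy's own class URIs.
BASE = "http://example.org/taxon/"

# A few candidates from the list in this message. The phylum values are entered
# by hand for illustration; in practice they would come from the taxonomy lineage.
CANDIDATES = [
    ("Mytilus edulis", "Mollusca"),
    ("Dreissena polymorpha", "Mollusca"),
    ("Margaritifera margaritifera", "Mollusca"),
    ("Sparodon durbanensis", "Chordata"),  # musselcracker seabream, a fish
]


def class_uri(name):
    """Turn a taxon or class name into a bracketed URI in the hypothetical namespace."""
    return "<" + BASE + name.replace(" ", "_") + ">"


def mollusks_only(candidates):
    """Keep only the names whose phylum is Mollusca."""
    return [name for name, phylum in candidates if phylum == "Mollusca"]


def union_class_turtle(class_name, members):
    """Return Turtle declaring class_name as the owl:unionOf the member classes."""
    member_lines = "\n".join("        " + class_uri(m) for m in members)
    return (
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n\n"
        f"{class_uri(class_name)} a owl:Class ;\n"
        "    owl:unionOf (\n"
        f"{member_lines}\n"
        "    ) .\n"
    )


if __name__ == "__main__":
    print(union_class_turtle("Mussel", mollusks_only(CANDIDATES)))
```

The antibody record would then point at the single `Mussel` class rather than at any one species. Whether a union is the right modeling choice is exactly the question this message raises, so the sketch only shows what that option would look like.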
Received on Thursday, 4 May 2006 05:29:55 UTC