From: Peter Ansell <ansell.peter@gmail.com>
Date: Wed, 13 Feb 2008 06:37:15 +1000
To: public-semweb-lifesci@w3.org
Cc: "Colin Batchelor" <BatchelorC@rsc.org>, "samwald@gmx.at" <samwald@gmx.at>, holger.stenzhorn@deri.org, p.roe@qut.edu.au, j.hogan@qut.edu.au
On 12/02/2008, samwald@gmx.at <samwald@gmx.at> wrote:
>
> > Good point. What I was sort of driving at (and failing) was the context
> > in which the facts are mentioned---are they the aim of the paper,
> > background information, mentioned as results and so forth?
>
> In what I see as the ideal scenario, each text/database entry would only
> be annotated with the results and not with background information or the
> citation of secondary sources. The annotations should only consist of
> what the creators of the annotated text/database entry considered to be
> 'true', and what has not already been stated in other texts/database
> entries.

I don't think there is any harm in placing a list of references in with
each entry. It gives each publication a context within the overall body of
knowledge. I agree that it is not necessary to put every detail of
knowledge into publication annotations, but if text-mining software is
going to help with the annotation process, it is likely to attempt to
annotate everything. It seems unavoidable that the end product will be
annotations which contain all of the knowledge, in anticipation of being
able to find out which part of it is unique through another method, or to
do so dynamically when one wants to view the information.

> For example, article pmid:123 contains the text
> "We found that bananas are yellow. This is in conflict with article
> pmid:456, which states that bananas are pink".
>
> Article pmid:123 should only be annotated with
> "banana has_quality yellow .
> pmid:123 in_conflict_with pmid:456 ."
> Article pmid:456 should be annotated with
> "banana has_quality pink ."
>
> "banana has_quality pink" should not be re-iterated in the annotation of
> pmid:123.
>
> For widely-accepted and deployed biomedical annotation it really does
> not need to be more complicated than that. Trying to capture the whole
> narrative of each publication in detail in such a scenario is very hard
> and might actually make the annotation as a whole less usable, since it
> becomes harder to understand, query, integrate and maintain.

On a broader brainstorming note, it would also be nice to have a way of
specifying that a certain dc:Agent considers one annotation to be better
than another, with users deciding to trust certain Agents to give them
useful knowledge, or conversely, to distrust specific Agents whose
annotations they do not find useful. I think the probabilistic ontology
concept (links below) is a useful starting point for determining trust at
the annotation level, by pooling metadata from comments about annotations
to provide a somewhat democratic distribution. Hopefully this would not
conflict with the "simple is best" principle, but rather allow for
multiple opinions about the validity of a given scientific statement.

http://www.pr-owl.org/pr-owl.owl
http://ite.gmu.edu/~klaskey/papers/FOIS2006_CostaLaskey.pdf

Peter
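P.S. Purely as a rough sketch of the dc:Agent idea above (the ex: and
pmid: namespaces and every property name here are made up for
illustration, not taken from any existing vocabulary), an agent-level
preference between two competing annotations might look something like
this in Turtle:

    @prefix dcterms: <http://purl.org/dc/terms/> .
    @prefix pmid:    <http://www.ncbi.nlm.nih.gov/pubmed/> .
    @prefix ex:      <http://example.org/annot#> .

    # Treat the two competing annotations as first-class resources,
    # so that further statements can be made about them.
    ex:annot1 ex:annotates pmid:123 ;
              ex:states "banana has_quality yellow" .

    ex:annot2 ex:annotates pmid:456 ;
              ex:states "banana has_quality pink" .

    # An agent records its opinion that annot1 is the better annotation.
    # Consumers could then weight annotations by the agents they trust.
    ex:curatorX a dcterms:Agent ;
                ex:prefers ex:annot1 ;
                ex:prefersOver ex:annot2 .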
Received on Tuesday, 12 February 2008 20:37:36 UTC