RE: Notes from informal Demo F2F from Nigam Shah on 2007-03-07 (public-semweb-lifesci@w3.org from March 2007)

From: Nigam Shah <nigam@stanford.edu>
Date: Tue, 6 Mar 2007 21:47:01 -0800
To: "'Tim Clark'" <twclark@nmr.mgh.harvard.edu>, "'Alan Ruttenberg'" <alanruttenberg@gmail.com>
Cc: "'William Bug'" <William.Bug@DrexelMed.edu>, "'Eric Neumann'" <eneumann@teranode.com>, <public-semweb-lifesci@w3.org>
Message-ID: <000d01c7607c$0823e5d0$6901a8c0@stanford.edu>
Hi Alan,
 
Tim and Steve (the other HyBrow guy) have discussed this in some
detail, I believe. It's great to have formally structured
representations at the level of biological entities created by the
scientists themselves. However, right now (A) the ontologies and
relations for doing so might or might not exist, (B) most people fall
asleep when we start talking about BFO-driven formal representations,
(C) there are no incentives for someone to take that much trouble, and
(D) the UIs for creating the formal descriptions are quite bad.
 
Overall, I think it is more important to get people to curate (however
informal or disconnected from biological entities it might be) and to
get them to "donate" a few keystrokes and clicks towards curation. If
this works, we can, in the next step, think about acquiring deeper,
structured, formal descriptions of hypotheses.
 
Regards,
Nigam.


Alan, 

Thanks for that summary. 

SWAN is not a software project - it is a project for a
technology-mediated social infrastructure. Those fancy words mean we
are trying to get the working scientists themselves to curate the
material for their own benefit, self-understanding & "karma points".
We think it is the only sustainable & scalable way to represent
hypotheses. 

Nothing in principle - and I hope we can test this out - should
prevent people who want to link very deep annotation of various kinds
to SWAN hypotheses from doing so. But that is not our project. We are
"just" providing the framework. An essential criterion is that
whatever we do is doable by working scientists and helps them to work.


Best

Tim

On Tuesday, Mar 6, 2007, at 9:59 PM, Alan Ruttenberg wrote:


Here's my analysis:

I think Bill wants (1). Bill is trying to convince the SWAN team to
curate to a level deeper than you are going now, by naming and
collecting relationships between the biological entities that are
being talked about in the paper SWAN curates. The page he references
details his own effort to simultaneously learn for himself and
engage us (HCLS) in trying to explicitly represent more of the
biological facts/hypotheses at the level of the participating
biological entities. PATO, the ontology he is trying to learn how to
use, is just one of the OBO ontologies - its domain is phenotypes (or
qualities in BFO speak) of organisms and properties of some
sub-organismal biological material. He thinks that there is a lot of
work in the rest of the OBO ontologies and other community efforts,
and that this should be taken advantage of as much as possible in our
efforts.

He's also coming from the perspective, which I think originates with Barry
Smith, that many of the problems we have had in managing and making
productive use of information related to biology and health in recent
history originate from not trying to represent reality, and instead
representing "concepts" which, because they don't connect clearly
enough to reality, don't land up having enough shared meaning to make
good use of when integrated or scaled. 

My understanding from discussions we (Tim and I) have had is that the
SWAN project has, as a matter of setting scope, and addressing its use
cases, decided against doing this, at least for the moment. This is an
understandable choice - it's pretty hard work to learn and do this
kind of representation, as practices and the ontologies themselves are
just developing. I think this is a reasonable decision, given that
what Bill is suggesting is by no means easy. I'm curious to see what
happens with SWAN. I suspect that Bill is too!

Still, Bill is hopeful that the SWAN team will change its mind,
because he thinks, as does some subset of the people on this list, that
his kind of representation will be what leads to the most substantial
payback in what it enables for science and medicine.

Hope this helps,

Alan


On Mar 6, 2007, at 9:10 PM, Tim Clark wrote:


Bill:

(1) or (2) or none of the above is good enough for right now. I am
finding your proposal difficult to follow.

Tim

On Tuesday, Mar 6, 2007, at 7:43 PM, William Bug wrote:


Sorry, Tim. 

Can't really go into more detail right now. I have a lot of planning
still to do on an all day meeting I must lead tomorrow.

I lay out this proposal in considerable detail on the page I cite
below:


http://esw.w3.org/topic/HCLS/OntologyTaskForce/OboPhenotypeSyntaxExperiment


It is just a suggestion. As I said a few weeks ago when I put it out
there, I welcome any feedback. Please amend, append, or correct as you
see fit.

As I mentioned to you a few weeks ago, I'd see this as a way of
providing much more structure to back up the "Concepts" and "Claims"
that are represented in SWAN. In fact, the "Concepts" (as represented
in RDF using community shared ontologies/terminologies) provide a link
into this more structured "bridge" I'm describing and the wealth of
detail contained in RDF-converted versions of BioPAX, SenseLab
(BioPharm), ABA, MPO-based annotations from MGI & RGD, etc.

I hope this helps a little.

Cheers,
Bill


On Mar 6, 2007, at 4:33 PM, Tim Clark wrote:


Bill, 

I am trying to understand your proposal. Which are you suggesting:

(1) we curate into SWAN some existing published work hypothesizing
a connection of, for example, the MPTP/MPP+ mechanism to some forms of
PD; or
(2) we build "our own" hypothesis of the MPTP/MPP+ mechanism
relationship etc., not existing in the literature, and curate it into
SWAN?

or something else?

Tim

On Tuesday, Mar 6, 2007, at 7:25 PM, William Bug wrote:


Hi All, 

Looks like a lot of substantive work was done at the F2F. Kudos to all
who participated!

I'd like to highlight one of the issues EricN mentioned. 

On Mar 6, 2007, at 8:29 AM, Eric Neumann wrote:

As part of the scenario using the known aggregate of facts, add a few
*select* hypotheses (triple graphs) that would make major connections
with the rest of the graph and function as a "bridge" across
the data and models; show the new insights from this merged
composite by re-applying queries that now retrieve more connections.
One example Karen had was around the MPTP/MPP+ mechanism for some
forms of PD.


This suggestion, which came from the off-line discussion amongst
several call-in participants, is EXACTLY the point I've been trying to
make since September with the proposal to use the OBO Foundry PATO +
Phenotype assertion syntax.
http://esw.w3.org/topic/HCLS/OntologyTaskForce/OboPhenotypeSyntaxExperiment

I think this is critical to bringing together the various resources
around complex concepts such as LTP/LTD - which, as I've mentioned
before, is a MODEL, not a fact per se.

The advantage to using this approach is your assertions are based on
reported evidence from the literature - not on a high-level
encapsulation of an abstraction in the form of a complex model.

The strategy I'm proposing is only contrived in the sense you focus in
specifically on a collection of articles covering a particular micro
domain within the general use case. I've even proposed a way in which
one could determine a metric to decide exactly how much of this sort
of highly structured curation is required. The amount will likely be a
function of the complexity and abstraction in the underlying
hypothesis and the extent to which the underlying RDF sources are
already inter-linked via shared semantic frameworks such as MeSH, GO,
BioCyc, etc.

I would note the article I chose as an example was appropriate given
the PD use case as of September 2006. It was mainly put out there to
illustrate how to approach this task. We'd now want to focus
specifically on articles that cover the specific micro domains in the
most recent, narrowed version of the use case.

I have been working on how to use tools such as SWOOP to greatly
reduce the effort required to construct these phenotype assertions.

I'm afraid I'm busy for the next week with BIRN meetings - some of
which I need to lead - so I don't expect to be able to provide much
help on this until late next week.

Best of luck!

Cheers,
Bill



Bill Bug
Senior Research Analyst/Ontological Engineer

Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu








Received on Wednesday, 7 March 2007 05:47:42 UTC