DRAFT minutes -- Re: HCLS Scientific Discourse Call Monday, 11 April 2011 10 am EST, 3 PM BST: talk by Jodi Schneider

I wrote up draft "minutes" from our HCLS Scientific Discourse call yesterday. Please update & edit them on the wiki:
http://www.w3.org/wiki/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/alignment/mediumgrain#2011-04-14_HCLS_meeting

Thanks for a fruitful discussion!
-Jodi
=====

Howard: It seems like either a paragraph or a figure could serve a variety of semantic or rhetorical roles.

Tim: Blocks and [clauses?] play different rhetorical roles in the document, so you may want to annotate them differently.


Anita: I thought that "fine-grained" was what SWAN annotated: recitation, evidence, claims.

You can't always point to a single place where a claim is stated. It may be stated in several places, and in a narrative-focused way.

So is the challenge: how to connect fine-grained *process* markup to fine-grained *document* markup?


Gully: There's a difference between the scientific process and the construction of argument across documents.

In Slide 11 [on SWAN] -- a lot of elements could be distinguished into categories.
It could be useful to categorize the elements as:
* Intra-document
* Intra-experiment
* Extra-document
Even within a document, there may be cross-experiment argumentation.

Science draws from an external set of knowledge -- to provide motivation, interpretation, experimental design.

There are also intra-document relationships -- like pointing to figures, or constructing an experimental design that is specifically geared towards a certain question.

Further, a document may contain multiple experiments.


Tim: Science is "warranted true belief". There are two kinds of warrant: either a reference to someone else's papers (external evidence) or a reference to the experiment itself - details, reasoning, data, etc. (internal evidence relying on the provenance and logical procedures of the paper).


Gully: Tim - I think my point was just that the slides have these different types of relations in one place. I'd like to categorize these types, separating extra-document knowledge from intra-document evidence and process.

At ISI, the machine-reading group works on this problem; they distinguish between the "reading machine" and the "reasoning machine": what do we annotate the text with, and how do we use those annotations to reason?

There are process elements and semantic representations. For discourse (on the process side) -- what are the general aspects of what we're trying to annotate, in order to serve the purpose of reasoning? We want to find the underlying reasoning and do computation over it.

Anita: The question "what are we modelling?" needs to be preceded by "why are we modelling?". Do we need new use cases?

The existing use cases are to:

Create a claim-evidence network spanning documents (see the sketch after this list)
http://www.w3.org/wiki/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/UseCases/1

Provide templates for authoring documents (Use Case #2: Publication authoring)
http://www.w3.org/wiki/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/UseCases/2

Link publications and enhance search:
http://www.w3.org/wiki/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/UseCases/3
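
For concreteness, here is a minimal, purely illustrative sketch (in Python, using rdflib) of what Use Case #1's claim-evidence network spanning two documents might look like as RDF. The vocabulary and identifiers below (ex:Claim, ex:Evidence, ex:supportedBy, the paper URIs) are hypothetical placeholders, not the actual SWAN/SIOC terms.

# Hypothetical sketch: a claim stated in one paper, supported by evidence
# reported in another. The ex: vocabulary is a placeholder, not SWAN.
from rdflib import Graph, Literal, Namespace, RDF

EX = Namespace("http://example.org/discourse#")  # placeholder namespace
g = Graph()
g.bind("ex", EX)

claim = EX["claim-1"]        # a claim stated in paper A
evidence = EX["figure-3"]    # evidence reported in paper B

g.add((claim, RDF.type, EX.Claim))
g.add((claim, EX.statedIn, EX["paper-A"]))
g.add((claim, EX.text, Literal("Protein X regulates pathway Y")))

g.add((evidence, RDF.type, EX.Evidence))
g.add((evidence, EX.reportedIn, EX["paper-B"]))

# The cross-document link that makes this a network spanning documents:
g.add((claim, EX.supportedBy, evidence))

print(g.serialize(format="turtle"))

Serializing as Turtle shows the claim in paper A linked to its supporting evidence in paper B; chaining such links across many papers is what Use Case #1 calls the claim-evidence network.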

Jodi: Integration between the paper and the science is the main point here - so how do you get claims out of a single paper?
Anita: and link them to the underlying experimental model?


Gully: The use cases we are talking about are informatics use cases, but we should be focusing on biologists' use cases. Can we suggest to a biologist end-user what a new field or direction would be? To make a claim network useful in practice, by taking them to a place where they haven't worked before?

Tim: Great point, Gully - also point them to experiments and tools to do that work?

Alex: When biologists go to libraries, they have a particular problem: they are following a protocol and want to find papers where the same protocol is used with slight modifications - not so much claims.

Gully: The ultimate goal is for a scientist to be able to design a new experiment based on the knowledge the model is built on. We want machines that make a prediction for the scientist and propose a new experiment. So pull out all claims about observations and interpretations, munge them together into a model that you can reason over, and generate an experimental design to test the new hypothesis.

Gully: Hanalyzer, from UC Denver, is an interesting system to consider. http://hanalyzer.sourceforge.net/
http://www.youtube.com/watch?v=jAegU3aZbWI

They work in a well-defined domain which they understand well, and besides the tools, they have a human analyst to "think about the science". This is a good recipe for success: solve problems for a specific domain.

There are two main differences from what we are doing: first, it is a complete system (with user interfaces), NOT just an ontology. Second, it is specific, whereas what we want is very general.

Joanne: So how do you bridge between a domain perspective and external [general?] perspectives? You need to know what role you are in. Different types of evidence are used in different fields.

Next steps from Anita:
1) investigate workshop
2) write new use case
3) Joanne to invite someone from the Hunter group 

Other possible next steps: a presentation from the ISI machine-reading group.

Other thoughts to take up:
Howard: Not sure stories or narratives always strictly follow a beginning-middle-end sequence: roles often shift about, with lots of non-linear reflection and out-of-sequence impact.
