
Re: Aligning DoCO and the middle grained document structure

From: Tim Clark <tim_clark@harvard.edu>
Date: Wed, 24 Nov 2010 16:06:13 -0500
Cc: HCLS IG <public-semweb-lifesci@w3.org>, David Shotton <david.shotton@zoo.ox.ac.uk>, Paolo Ciccarese <paolo.ciccarese@gmail.com>, "M. Scott Scott Marshall" <mscottmarshall@gmail.com>, "John F. Madden" <john.madden@duke.edu>, Alberto Accomazzi <aaccomazzi@cfa.harvard.edu>, Sophia Ananiadou <Sophia.Ananiadou@manchester.ac.uk>, Gully Burns <gully@usc.edu>, "Ronald (ELS-SDG) Daniel" <R.Daniel@elsevier.com>, Rahul Dave <rahuldave@gmail.com>, Anita de Waard <A.dewaard@elsevier.com>, Alf Eaton <A.Eaton@nature.com>, Alyssa Goodman <agoodman@cfa.harvard.edu>, Paul Groth <pgroth@gmail.com>, Tudor Groza <tudor.groza@deri.org>, ellen hays <E.Hays@elsevier.com>, "Antony (ELS-CAM) Scerri" <A.scerri@elsevier.com>, Jack Park <jackpark@gmail.com>, Silvio Peroni <speroni@cs.unibo.it>, Philippe Rocca-Serra <proccaserra@googlemail.com>, Karin Verspoor <Karin.Verspoor@ucdenver.edu>, Lynette Hirschman <lynette@mitre.org>, Susanna-Assunta Sansone <sansone@ebi.ac.uk>, Jun Zhao <jun.zhao@zoo.ox.ac.uk>, "Joanne Luciano (gmail)" <jluciano@gmail.com>, Sudeshna Das <sudeshna_das@harvard.edu>, David R Newman <drn05r@ecs.soton.ac.uk>, Alexander Garcia Castro <alexgarciac@gmail.com>
Message-Id: <BE57E9AF-A3F3-4E86-82E5-CC9B93920217@harvard.edu>
To: Jodi Schneider <jodi.schneider@deri.org>
Hi Jodi,

Thanks very much for this material; you did a great job of sorting out these issues. Sorry I couldn't join you on the call.

I want to chime in with my views on the *order of alignment*, which I think is important.  

I believe we want to align the simplest and most firmly grounded ontologies first (grounded in both use cases and inter-group discussions), because they are more foundational and straightforward. That creates a base, and the remaining ontologies can then be aligned to it step by step.

On that principle I would proceed as follows:

1 - ORB + DRO first (ORB + DRO = DRO'). I think this is important to do first because it is so simple. ORB has also had the most discussion and is very well grounded in use cases. And there is another reason to do ORB + DRO first...

2 - DRO' + Data-Experiment (DEXI) next (DRO' + DEXI = DRO''). Why? Because
(a) DEXI has already been under construction for a year and includes a lot of grounding and cross-group discussion, so it is quite well grounded; and
(b) DRO and DEXI have a definite and strong overlap, which can affect details further on, when you bring in DOCO.

3 - DRO'' + DOCO last.
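
To make the ordering concrete, here is a rough Python sketch of the idea: start from DRO as the base and fold in ORB, then DEXI, then DOCO, noting the overlap at each stage so it can be reconciled before the next ontology comes in. Everything in it is an assumption for illustration only. The term sets are placeholders, not real ontology terms, the `align` and `align_in_order` helpers are hypothetical, and real alignment is of course far more than a set union.

```python
def align(base_terms, incoming_terms):
    """Merge two term sets and report the terms they share.

    A stand-in for real ontology alignment: the overlap is what
    would need manual reconciliation at this stage.
    """
    overlap = base_terms & incoming_terms
    merged = base_terms | incoming_terms
    return merged, overlap


def align_in_order(base, steps):
    """Fold each ontology in `steps` into the base, one stage at a time."""
    name, terms = base
    log = []
    for step_name, step_terms in steps:
        terms, overlap = align(terms, step_terms)
        new_name = name + "'"  # DRO -> DRO' -> DRO'' ...
        log.append((f"{step_name} + {name} = {new_name}", sorted(overlap)))
        name = new_name
    return name, terms, log


# Placeholder term sets (hypothetical, not taken from the ontologies).
DRO = ("DRO", {"placeholder_a", "placeholder_b"})
STEPS = [
    ("ORB", {"placeholder_a", "placeholder_c"}),   # simplest, most discussed
    ("DEXI", {"placeholder_b", "placeholder_d"}),  # strong overlap with DRO
    ("DOCO", {"placeholder_c", "placeholder_e"}),  # aligned last
]

final_name, final_terms, log = align_in_order(DRO, STEPS)
for stage, overlap in log:
    print(stage, "| overlap to reconcile:", overlap)
```

The point of the staging is that each overlap is resolved against an already settled base, so the DRO/DEXI overlap is sorted out before DOCO is brought in.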

Best

Tim

On Nov 24, 2010, at 2:10 PM, Jodi Schneider wrote:

> Here's what we discussed in our call yesterday. Overall we're looking to discuss and align DoCO, ORB, DRO, and the Middle-grained document structure, in the context of life sciences research papers.
> 
> We plan to talk again on Tuesday 7th Dec at 10 EST / 3 PM GMT (phone number TBA). If you're interested, could you please let me know (off-list, jodi.schneider@deri.org)? We may be able to adjust the time in the future.
> 
> -Jodi
> 
> ========
> Yesterday Anita, Alex Garcia and I discussed the possibilities for alignment between DoCO [1] and Medium-grained document structure [2]. DoCO is currently being developed as part of SPAR [3].
> 
> Our general conclusion was that David Shotton's proposal (PDF attached) was on target. However, we want to:
> (1) use existing ontologies for references (BIBO? ...?)
> (2) use existing ontologies for the header (PRISM? DC?...?) 
> (3) check the use of fabio: (e.g. for Experimental Protocol)
> (4) check the use of dro:
> (5) check the use of sro:
> 
> A few questions regarding DoCO came up. The combination of document components, rhetorical components, rhetorical blocks, and structural patterns confused us. We expected these to be several smaller ontologies. Another question (David, perhaps you can answer this) is why you prefer imports into a larger ontology, as opposed to building an application profile? Rhetorical components, for instance, may already be handled adequately by SALT, SWAN, and ScholOnto.
> 
> We also discussed whether we wanted to get beyond ontologies to also address authoring and/or text mining (with a schema or DTDs drawing from ontologies). Anita pointed out that of our 3 use cases, 1 involves authoring. Further, for authoring we're limited to contiguous sections (as opposed to post-hoc rhetorical component detection).
> 
> [1] http://purl.org/spar/doco/
> [2] http://esw.w3.org/HCLSIG/SWANSIOC/Actions/RhetoricalStructure/models/medium
> [3] http://opencitations.wordpress.com/2010/10/14/introducing-the-semantic-publishing-and-referencing-spar-ontologies/
> 
> 
> 
> <Shotton discussion paper on DRO.pdf>
> <doco architecture.png>
Received on Wednesday, 24 November 2010 21:06:48 GMT
