
Re: HCLS Scientific Discourse ConCall on Monday Nov 15: DoCO, the Document Components Ontology

From: David Shotton <david.shotton@zoo.ox.ac.uk>
Date: Fri, 12 Nov 2010 18:42:00 +0000
Message-ID: <4CDD8A78.2070001@zoo.ox.ac.uk>
To: Tim Clark <tim_clark@harvard.edu>, HCLS IG <public-semweb-lifesci@w3.org>, Paolo Ciccarese <paolo.ciccarese@gmail.com>, "M. Scott Marshall" <mscottmarshall@gmail.com>, "John F. Madden" <john.madden@duke.edu>, Alberto Accomazzi <aaccomazzi@cfa.harvard.edu>, Sophia Ananiadou <Sophia.Ananiadou@manchester.ac.uk>, Gully Burns <gully@usc.edu>, "Ronald (ELS-SDG) Daniel" <R.Daniel@elsevier.com>, Rahul Dave <rahuldave@gmail.com>, Anita de Waard <A.dewaard@elsevier.com>, Alf Eaton <A.Eaton@nature.com>, Alyssa Goodman <agoodman@cfa.harvard.edu>, Paul Groth <pgroth@gmail.com>, Tudor Groza <tudor.groza@deri.org>, ellen hays <E.Hays@elsevier.com>, "Antony (ELS-CAM) Scerri" <A.scerri@elsevier.com>, Jack Park <jackpark@gmail.com>, Silvio Peroni <speroni@cs.unibo.it>, Philippe Rocca-Serra <proccaserra@googlemail.com>, Karin Verspoor <Karin.Verspoor@ucdenver.edu>, "lynette@mitre.org" <lynette@mitre.org>, Jun Zhao <jun.zhao@zoo.ox.ac.uk>
CC: Susanna-Assunta Sansone <sansone@ebi.ac.uk>

Dear Colleagues,

Silvio and I have been working hard to complete the first release of 
*DoCO, the Document Components Ontology*, a new member of SPAR, the 
Semantic Publishing and Referencing Ontologies (http://bit.ly/9d8qAi), 
in time for Monday's teleconference.  I apologise that we did not 
achieve this sooner, to give you more time to study it.

We have based our choice of DoCO classes on the following sources:

    * Anita's excellent work analysing the varying document structures
      used in journal articles from a variety of STM publishers;
    * Tudor and Sigi's SALT Rhetorical Ontology, SRO;
    * terms in the National Library of Medicine DTD, widely used by STM
      publishers and by PubMed Central for XML markup of documents; and
    * our own common sense and experience as authors and readers of
      research articles and books.

DoCO v1.0, which is now available at http://purl.org/spar/doco/, 
attempts to provide a comprehensive description of the main structural 
and rhetorical components used in scholarly documents, employing as 
class names terms that are familiar to scholars.  It uses rhetorical 
terms from the SALT Rhetorical Ontology, to which it adds its own 
rhetorical and structural terms, and imports Silvio's Pattern Ontology 
(http://www.essepuntato.it/2008/12/pattern) to define formal structural 
patterns for segmenting a document into its component parts.

Opening DoCO in a web browser will display a human-readable version of 
the ontology, while opening it in an ontology editor such as Protégé 
will display the tree structure of the OWL 2 DL ontology itself.
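
As a rough illustration of the two views described above, and assuming 
(this is not stated in this message) that the PURL chooses between them 
by HTTP content negotiation on the Accept header, the following minimal 
Python sketch builds the two kinds of request without sending them.  The 
helper name build_request and the media types chosen are illustrative 
assumptions only.

```python
from urllib.request import Request

# URL taken from this message; everything else is an assumption.
DOCO_URL = "http://purl.org/spar/doco/"


def build_request(accept: str) -> Request:
    """Build (but do not send) a GET request asking for one media type."""
    return Request(DOCO_URL, headers={"Accept": accept})


# A web browser typically asks for HTML (the human-readable version).
browser_request = build_request("text/html")

# An ontology editor typically asks for an RDF serialisation (assumed here
# to be RDF/XML) so that it can load the OWL ontology itself.
editor_request = build_request("application/rdf+xml")

for req in (browser_request, editor_request):
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing either request to urllib.request.urlopen would perform the 
actual fetch; whether the server honours the Accept header in this way 
would need to be checked against the live service.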

Additionally, to aid your understanding, we have published an 
architectural diagram of DoCO at http://bit.ly/b383VR, and a set of 
explanatory figures, showing how DoCO terms relate to different parts of 
a recent article from the Journal of Cell Biology, as a PDF document 
available at http://bit.ly/d0VhJr.

We look forward to discussing this with you on Monday.

Kind regards,



Dr David Shotton david.shotton@zoo.ox.ac.uk 
Reader in Image Bioinformatics

Image Bioinformatics Research Group http://ibrg.zoo.ox.ac.uk
Department of Zoology, University of Oxford
South Parks Road, Oxford OX1 3PS, UK
Received on Friday, 12 November 2010 18:42:34 UTC
