- From: Colin Maudry <colin@maudry.com>
- Date: Thu, 02 Oct 2014 16:02:17 +0200
- To: John Walker <john.walker@semaku.com>, Norman Gray <norman@astro.gla.ac.uk>, Luca Matteis <lmatteis@gmail.com>
- CC: Linked Data community <public-lod@w3.org>
- Message-ID: <542D5AE9.9070708@maudry.com>
Hi all,

Thanks John for the references to my project.

It seems that here you need a solution that pleases both those who want a PDF to comply with existing processes and those who want a machine-readable format for better Web accessibility.

The DITA <https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=dita> standard is an OASIS standard, like Open Document. It's an XML framework dedicated to the creation of documents by assembling content components, called topics. See it as an evolved DocBook. The Wikipedia page <https://en.wikipedia.org/wiki/Darwin_Information_Typing_Architecture> is a good introduction.

In the DITA ecosystem, the community has developed a processing engine, the DITA Open Toolkit <http://dita-ot.github.io/>. Through its plugin system, it enables the publication of DITA content to a myriad of output formats:

* PDF
* Simple HTML
* HTML WebHelp (fancy example <http://purl.org/dita/ditardf-project>)
* ePub and Kindle (through the DITA for Publishers plugin <http://dita4publishers.sourceforge.net/>)
* ...and RDF/XML through the plugin that is part of the DITA RDF project <http://purl.org/dita/ditardf-project>.

The plugin extracts the metadata of the documentation (author, title, creation date, links, variables), not the meaning of the content (output example <https://github.com/ColinMaudry/dita-rdf/blob/ditaot-plugin/dita2rdf/demo/out/ditaot-userguide.rdf>). It could be extended to extract certain facts from the content.

DITA has a nice feature: its core vocabulary can be extended via "specialization", so that it can support specific purposes: learning content, troubleshooting documents, etc.

Those who want a PDF would make a PDF rendition, and those who want machine-readable formats would use a flavour of HTML or give me a hand with the RDF output.

What do you think?
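For illustration only, the kind of metadata extraction described above (title, author, creation date) can be sketched in a few lines of Python. This is not the DITA RDF plugin's code, which runs inside the DITA Open Toolkit. The sample topic, the example.org base URI, the function names, and the choice of Dublin Core terms as properties are all assumptions made for this sketch, and the output is N-Triples rather than RDF/XML to keep it short.

```python
import xml.etree.ElementTree as ET

# Assumption: Dublin Core terms are used as properties for this sketch;
# the DITA RDF project may use a different vocabulary.
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical DITA topic, invented for this example.
SAMPLE_TOPIC = """<topic id="install">
  <title>Installing the toolkit</title>
  <prolog>
    <author>Jane Doe</author>
    <critdates><created date="2014-10-02"/></critdates>
  </prolog>
  <body><p>Unzip the archive and run the installer.</p></body>
</topic>"""


def escape_literal(text):
    """Escape a string for use as an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def topic_metadata_to_ntriples(dita_xml, base_uri):
    """Return N-Triples lines for a topic's title, authors and creation date.

    Only prolog-level metadata is read; the body of the topic is ignored.
    """
    root = ET.fromstring(dita_xml)
    subject = f"<{base_uri}{root.get('id')}>"
    triples = []

    title = root.findtext("title")
    if title:
        triples.append(
            f'{subject} <{DCTERMS}title> "{escape_literal(title.strip())}" .')

    for author in root.findall("prolog/author"):
        if author.text:
            triples.append(
                f'{subject} <{DCTERMS}creator> '
                f'"{escape_literal(author.text.strip())}" .')

    created = root.find("prolog/critdates/created")
    if created is not None and created.get("date"):
        triples.append(
            f'{subject} <{DCTERMS}created> '
            f'"{escape_literal(created.get("date"))}" .')

    return triples


if __name__ == "__main__":
    for line in topic_metadata_to_ntriples(
            SAMPLE_TOPIC, "http://example.org/topic/"):
        print(line)
```

The sketch reads only the title and the prolog, which mirrors the point above: the metadata of the documentation is extracted, not the meaning of the content.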
Colin

On 02/10/2014 11:08, John Walker wrote:
> Hi All,
>
> I know Latex is the norm in academic circles, but the DITA XML
> standard is widely used in industry and gaining traction in publishing.
>
> Colin Maudry ( @CMaudry) has a project for extracting RDF metadata
> from DITA content [1].
> Seems to be attracting interest from Marklogic and HarperCollins [2]
> and others [3].
>
> Cheers,
> John
>
> [1] http://purl.org/dita/ditardf-project
> [2] http://files.meetup.com/1645603/meetup-2014-08-12.pptx
> [3] http://de.slideshare.net/TheresaGrotendorst/towards-dynamic-and-smart-content-semantic-technologies-for-adaptive-technical-documentation
>
> On October 2, 2014 at 12:03 AM Norman Gray <norman@astro.gla.ac.uk> wrote:
> >
> > Greetings.
> >
> > On 2014 Oct 1, at 22:36, Luca Matteis <lmatteis@gmail.com> wrote:
> >
> > > So forget PDF. Perhaps we can add markup to Latex documents and make
> > > them linked data friendly? That would be cool. A Latex RDF
> > > serialization :)
> >
> > There exists <http://www.siegfried-handschuh.net/pub/2007/salt_eswc2007.pdf>:
> >
> > > SALT: Semantically Annotated LATEX
> > > Tudor Groza, Siegfried Handschuh, Hak Lae Kim
> > >
> > > Digital Enterprise Research Institute
> > > IDA Business Park, Lower Dangan
> > > Galway, Ireland
> > > {tudor.groza, siegfried.handschuh, haklae.kim}@deri.org
> > >
> > > ABSTRACT
> > >
> > > Machine-understandable data constitutes the basis for the Semantic
> > > Desktop. We provide in this paper means to author and annotate
> > > Semantic Documents on the Desktop. In our approach, the PDF file
> > > format is the basis for semantic documents, which store both a
> > > document and the related metadata in a single file. To achieve this we
> > > provide a framework, SALT that extends the Latex writing environment
> > > and supports the creation of metadata for scientific publications.
> > > SALT lets the scientific author create metadata while putting together
> > > the content of a research paper. We discuss some of the requirements
> > > one has to meet when developing such an ontology-based writing
> > > environment and we describe a usage scenario.
> >
> > That describes a very thorough approach to embedding some semantics
> > within LaTeX documents.
> >
> > Yes, 'thorough'; very thorough; verging on the intimidating.
> >
> > I dimly recall that there was a rather more lightweight approach
> > which was used for proceedings in ISWC or ESWC -- I remember marking
> > up a LaTeX document in something less comprehensive than SALT -- but I
> > can't remember enough to be able to re-find it.
> >
> > All the best,
> >
> > Norman
> >
> > --
> > Norman Gray : http://nxg.me.uk
> > SUPA School of Physics and Astronomy, University of Glasgow, UK
Received on Friday, 3 October 2014 07:52:56 UTC