
Re: scientific publishing process (was Re: Cost and access)

From: Paul Houle <ontology2@gmail.com>
Date: Mon, 6 Oct 2014 10:25:55 -0400
Message-ID: <CAE__kdTULbtz4CJi_P2s8k+kh=X6Kh3uBSsPi+TAJy9tvYmxdA@mail.gmail.com>
To: Mark Diggory <mdiggory@atmire.com>
Cc: Luca Matteis <lmatteis@gmail.com>, Ivan Herman <ivan@w3.org>, Daniel Schwabe <dschwabe@inf.puc-rio.br>, W3C Semantic Web IG <semantic-web@w3.org>, W3C LOD Mailing List <public-lod@w3.org>, Phillip Lord <phillip.lord@newcastle.ac.uk>, "Eric Prud'hommeaux" <eric@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Bernadette Hyland <bhyland@3roundstones.com>
Frankly I don't see the reason for the hate on PDF files.

I do a lot of reading on a tablet these days because I can take it to the
gym or on a walk or in the car.  Network reliability is not universal when
I leave the house (even if I had a $10 a GB LTE plan) so downloaded PDFs
are my document format of choice.

There might be a lot of hypothetical problems with PDFs, and I am sure
there is a better way to view files on a small screen, but in practice I
have no trouble reading papers from arXiv.org or books from oreilly.com,
whether these are produced by TeX-derived or Word-derived toolchains, or by
a toolchain that involves a real page layout tool for that matter.

On Sun, Oct 5, 2014 at 5:43 PM, Mark Diggory <mdiggory@atmire.com> wrote:

> On Sun, Oct 5, 2014 at 2:39 PM, Mark Diggory <mdiggory@atmire.com> wrote:
>> Hello Community,
>> On Sun, Oct 5, 2014 at 1:19 PM, Luca Matteis <lmatteis@gmail.com> wrote:
>>> On Sun, Oct 5, 2014 at 4:34 PM, Ivan Herman <ivan@w3.org> wrote:
>>> > The real problem is still the missing tooling. Authors, even if
>>> > technically savvy like this community, want to do what they set out to do:
>>> > write their papers as quickly as possible. They do not want to spend their
>>> > time going through some esoteric CSS massaging, for example. Let us face
>>> > it: we are not yet there. The tools for authoring are still very poor.
>>> But are they still very poor? I mean, I think there are more tools for
>>> rendering HTML than there are for rendering Latex. In fact there are
>>> probably more tools for rendering HTML than anything else out there,
>>> because HTML is used more than anything else. Because HTML powers the
>>> Web!
>>> You can write in Word, and export in HTML. You can write in Markdown
>>> and export in HTML. You can probably write in Latex and export in HTML
>>> as well :)
>>> The tools are not the problem. The problem to me is the printing
>>> afterwards. Conferences/workshops need to print the publications.
>>> Printing consistent Latex/PDF templates is a lot easier than printing
>>> inconsistent (layout wise) HTML pages.
>> There are tools; for example, there's already a bit of work to provide a
>> plugin for semantic markup in Microsoft Word (
>> https://ucsdbiolit.codeplex.com/) and similar efforts on the LaTeX side (
>> https://trac.kwarc.info/sTeX/)
>> But, this is not a question of technology available to authors, but of
>> requirements defined by publishers. If authors are too busy for this
>> effort, then publishers facilitate that added value when it is in their
>> best interest.
>> For example, PLoS has published format guidelines using Word and LaTeX (
>> http://www.plosone.org/static/guidelines) and a workflow for semantically
>> structuring their resulting output. Their final output is well
>> structured and available in XML based on a known standard (
>> http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd), PDF and
>> the published HTML on their website (
>> http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0011233
>> ).
>> This results in semantically meaningful XML that is transformed to HTML:
>> http://www.plosone.org/article/fetchObjectAttachment.action?uri=info%3Adoi%2F10.1371%2Fjournal.pone.0011233&representation=XML
>> Clearly the publication process can support solutions, and when it's in the
>> best interest of the publisher, they will adopt and drive their own markup
>> processes to meet external demand.
>> Providing tools that both the publisher and the author may use
>> independently could simplify such an effort, but is not a main driver in
>> achieving that final result you see in PLoS. This is especially the case
>> given that both file formats and efforts to produce the "ideal solution"
>> are inherently localized, competitive and diverse, not collaborative in
>> nature. For PLoS, the solution that is currently successful is the one that
>> worked to solve today's immediate local need with today's tools, not the one
>> that was perfectly designed to meet all of tomorrow's hypothetical requirements.
>> Cheers,
>> Mark Diggory
>> p.s. Finally, on the reference to moving repositories such as EPrints and
>> DSpace towards supporting semantic markup of their contents: being somewhat
>> of a participant in LoD on the DSpace side, I note that these efforts are
>> inherently just "Repository Centric", describing the structure of the
>> repository (i.e. collections of files), not the semantic structure contained
>> within those files (ideas, citations, formulas, data tables, figures). In
>> both cases, these capabilities are in their infancy and without any strict
>> format and content driven publication workflow, and lacking any rendering
>> other than to offer the file for download, they ultimately suffer from the
>> same need for a common Semantic Document format that can be leveraged for
>> rendering, referencing and indexing.
>> --
>> *Mark Diggory*
>> *2888 Loker Avenue East, Suite 315, Carlsbad, CA. 92010*
>> *Esperantolaan 4, Heverlee 3001, Belgium*
>> http://www.atmire.com

Paul Houle
Expert on Freebase, DBpedia, Hadoop and RDF
(607) 539 6254    paul.houle on Skype   ontology2@gmail.com
Received on Monday, 6 October 2014 14:26:26 UTC
