Re: scientific publishing process (was Re: Cost and access) from Kingsley Idehen on 2014-10-06 (semantic-web@w3.org from October 2014)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Mon, 06 Oct 2014 11:17:33 -0400
To: W3C LOD Mailing List <public-lod@w3.org>
CC: W3C Semantic Web IG <semantic-web@w3.org>
Message-ID: <5432B28D.1020508@openlinksw.com>
On 10/6/14 10:25 AM, Paul Houle wrote:
> Frankly I don't see the reason for the hate on PDF files.
>
> I do a lot of reading on a tablet these days because I can take it to 
> the gym or on a walk or in the car.  Network reliability is not 
> universal when I leave the house (even if I had a $10 a GB LTE plan) 
> so downloaded PDFs are my document format of choice.
>
> There might be a lot of hypothetical problems with PDFs,  and I am 
> sure there is a better way to view files on a small screen,  but 
> practically I have no trouble reading papers from arXiv.org or books 
> from oreilly.com <http://oreilly.com>, whether they are produced by a 
> TeX-derived or Word-derived toolchain, or by a toolchain that involves 
> a real page layout tool for that matter.

Paul,

As I see it, the issue here has more to do with PDF being the only 
option than with having no PDFs at all. Put differently, we are not using 
our "horses for courses" technology (the Web that emerges from AWWW 
exploitation) to produce "horses for courses" conference artifacts. 
Instead, we continue to impose (overtly or covertly) specific options 
that are contradictory, and of diminishing value.

Conferences (associated with themes like Semantic Web and Linked Open 
Data) should accept submissions that provide open access to relevant 
research data. By analogy, imagine if PDFs were submitted without 
bibliographic references. Basically, that's what's happening here with 
research data circa 2014, even though we have a functioning Web of 
Linked (Open) Data, which is based on AWWW.

Loosely coupling the print-friendly documents (PDF, LaTeX, etc.), 
HTTP-browser-friendly documents (HTML), and actual raw data references 
(which take the form of 5-Star Linked Open Data) is a practical starting 
point. Adding experiment workflows (which are also becoming the norm in 
the bioinformatics realm) is a nice bonus, as already demonstrated by 
the examples provided by Hugh Glaser (see this weekend's thread).
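
To make the loose coupling concrete, here is a minimal sketch in Python, 
not a definitive recipe. It emits a small Turtle description that links 
one paper to its PDF and HTML renditions and to the research data it 
draws on. Every example.org IRI and the describe_paper helper are 
hypothetical; the use of dcterms:hasFormat and dcterms:references from 
Dublin Core Terms is one assumed vocabulary choice among several.

```python
# Sketch only: all example.org IRIs are hypothetical placeholders.
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",  # Dublin Core Terms namespace
}


def describe_paper(paper, renditions, datasets):
    """Return Turtle that links a paper IRI to its renditions and datasets."""
    # One predicate-object pair per rendition (PDF, HTML, ...) and per dataset.
    statements = [f"dcterms:hasFormat <{r}>" for r in renditions]
    statements += [f"dcterms:references <{d}>" for d in datasets]
    if not statements:
        raise ValueError("need at least one rendition or dataset")

    lines = [f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{paper}>")
    # Pairs sharing a subject are joined with ';' and closed with '.'.
    lines.append("    " + " ;\n    ".join(statements) + " .")
    return "\n".join(lines)


turtle = describe_paper(
    "http://example.org/paper/42#this",
    ["http://example.org/paper/42.pdf", "http://example.org/paper/42.html"],
    ["http://example.org/data/42/observations#dataset"],
)
print(turtle)
```

The point of the sketch is that the PDF stays exactly as it is; the 
description simply sits alongside it, so a reader on a tablet and a 
program following data links are both served.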

Kingsley



>
>
>
> On Sun, Oct 5, 2014 at 5:43 PM, Mark Diggory <mdiggory@atmire.com 
> <mailto:mdiggory@atmire.com>> wrote:
>
>
>     On Sun, Oct 5, 2014 at 2:39 PM, Mark Diggory <mdiggory@atmire.com
>     <mailto:mdiggory@atmire.com>> wrote:
>
>         Hello Community,
>
>         On Sun, Oct 5, 2014 at 1:19 PM, Luca Matteis
>         <lmatteis@gmail.com <mailto:lmatteis@gmail.com>> wrote:
>
>             On Sun, Oct 5, 2014 at 4:34 PM, Ivan Herman <ivan@w3.org
>             <mailto:ivan@w3.org>> wrote:
>             > The real problem is still the missing tooling. Authors,
>             even if technically savvy like this community, want to do
>             what they set out to do: write their papers as quickly as
>             possible. They do not want to spend their time going
>             through some esoteric CSS massaging, for example. Let us
>             face it: we are not yet there. The tools for authoring are
>             still very poor.
>
>             But are they still very poor? I mean, I think there are
>             more tools for rendering HTML than there are for rendering
>             LaTeX. In fact there are probably more tools for rendering
>             HTML than anything else out there, because HTML is used
>             more than anything else. Because HTML powers the Web!
>
>
>             You can write in Word, and export in HTML. You can write
>             in Markdown and export in HTML. You can probably write in
>             LaTeX and export in HTML as well :)
>
>
>             The tools are not the problem. The problem to me is the
>             printing afterwards. Conferences/workshops need to print
>             the publications. Printing consistent LaTeX/PDF templates
>             is a lot easier than printing inconsistent (layout-wise)
>             HTML pages.
>
>
>         There are tools; for example, there's already a bit of work to
>         provide a plugin for semantic markup in Microsoft Word
>         (https://ucsdbiolit.codeplex.com/) and similar efforts on the
>         LaTeX side (https://trac.kwarc.info/sTeX/).
>
>         But, this is not a question of technology available to
>         authors, but of requirements defined by publishers. If authors
>         are too busy for this effort, then publishers facilitate that
>         added value when it is in their best interest.
>
>         For example, PLoS has published format guidelines using Word
>         and LaTeX (http://www.plosone.org/static/guidelines), a
>         workflow for semantically structuring their resulting output
>         and their final output is well structured and available in XML
>         based on a known standard
>         (http://dtd.nlm.nih.gov/publishing/3.0/journalpublishing3.dtd), PDF
>         and the published HTML on their website
>         (http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0011233).
>
>         This results in semantically meaningful XML that is
>         transformed to HTML
>
>         http://www.plosone.org/article/fetchObjectAttachment.action?uri=info%3Adoi%2F10.1371%2Fjournal.pone.0011233&representation=XML
>
>         Clearly the publication process can support such solutions,
>         and when it's in the best interest of the publisher, they will
>         adopt and drive their own markup processes to meet external
>         demand.
>
>         Providing tools that both the publisher and the author may use
>         independently could simplify such an effort, but is not a main
>         driver in achieving that final result you see in PLoS. This is
>         especially the case given that both file formats and efforts
>         to produce the "ideal solution" are inherently localized,
>         competitive and diverse, not collaborative in nature. For
>         PLoS, the solution that is currently successful is the one
>         that worked to solve today's immediate local need with today's
>         tools, not the one that was perfectly designed to meet all
>         tomorrow's hypothetical requirements.
>
>         Cheers,
>         Mark Diggory
>
>         p.s. Finally, on the reference of moving repositories such as
>         EPrints and DSpace towards supporting semantic markup of their
>         contents. Being somewhat of a participant in LoD on the DSpace
>         side, I note that these efforts are inherently just
>         "Repository Centric", describing the structure of the
>         repository (i.e. collections of files), not the semantic
>         structure contained within those files (ideas, citations,
>         formulas, data tables, figures). In both cases, these
>         capabilities are in their infancy and without any strict
>         format and content driven publication workflow, and lacking
>         any rendering other than to offer the file for download, they
>         ultimately suffer from the same need for a common Semantic
>         Document format that can be leveraged for rendering,
>         referencing and indexing.
>
>
>         -- 
>         @mire Inc.
>          *Mark Diggory*
>         /2888 Loker Avenue East, Suite 315, Carlsbad, CA. 92010/
>         /Esperantolaan 4, Heverlee 3001, Belgium/
>         http://www.atmire.com <http://www.atmire.com/>
>
>
>
>
> -- 
> Paul Houle
> Expert on Freebase, DBpedia, Hadoop and RDF
> (607) 539 6254    paul.houle on Skype ontology2@gmail.com 
> <mailto:ontology2@gmail.com>
> http://legalentityidentifier.info/lei/lookup


-- 
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: http://kingsley.idehen.net/dataspace/person/kidehen#this
Received on Monday, 6 October 2014 15:17:56 UTC