Re: scientific publishing process (was Re: Cost and access)

From: Laura Dawson <Laura.Dawson@bowker.com>
Date: Sun, 5 Oct 2014 14:47:39 +0000
To: Ivan Herman <ivan@w3.org>, Daniel Schwabe <dschwabe@inf.puc-rio.br>
CC: W3C Semantic Web IG <semantic-web@w3.org>, W3C LOD Mailing List <public-lod@w3.org>, Phillip Lord <phillip.lord@newcastle.ac.uk>, "Eric Prud'hommeaux" <eric@w3.org>, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Bernadette Hyland <bhyland@3roundstones.com>
Message-ID: <D056D1EA.82727%laura.dawson@bowker.com>
I think I mentioned this previously, Ivan, but perhaps not on this thread:
Hugh McGuire has developed a WordPress tool called PressBooks, which allows
you to write a book in HTML and export it as an EPUB file. He even
supports schema.org markup in a separate plugin.

On 10/5/14, 10:34 AM, "Ivan Herman" <ivan@w3.org> wrote:

>This is not a direct answer to Daniel, but rather expanding on what he
>said. Actually, he and I were (and still are) in the same IW3C2
>committee, i.e., we share the experience; and I was one of those who tried
>to come up with a proper XHTML template (although the credit really goes
>to Bob Hopgood, who was pushing for that the most).
>The real problem is still the missing tooling. Authors, even if
>technically savvy like this community, want to do what they set out to do:
>write their papers as quickly as possible. They do not want to spend
>their time going through some esoteric CSS massaging, for example. Let us
>face it: we are not yet there. The tools for authoring are still very
>poor. This in spite of the fact that many realize that PDF is really not
>the format for our age; we need much more than a reproduction of a
>printed page digitally (as someone referred to in the thread I really
>suffer when I have to read, let alone review, an article in PDF on my
>But I do see an evolution that might change this in the coming years. Laura
>dropped the magic word in the early phases of this thread: ePub. ePub is
>a packaged (zip archived) HTML site, with some additional information. It
>is the format that most of the ebook readers understand (hey, it can even
>be converted into a Kindle format:-). Both Firefox and Chrome have ePub
>reader extensions available and Mac OS comes with a free ebook reader
>(iBooks) that is based on it. I expect (hope) that the convergence between
>ePub and browsers will bring these even closer in the coming years.
>Because ePub is a packaged web site, with the core content in HTML5 (or
>SVG), metadata can be added to the content in RDFa, microdata, embedded
>JSON-LD; in fact, metadata can also be added to the archive as a separate
>file so if you are crazy enough you can even add RDF data in RDF/XML (no,
>please, don't do it:-). And, of course, it can be as much of a hypertext
>as you can master:-)
>Tooling? No, not yet:-( Well, not yet for lambda users. But there, too,
>there is an evolution. The fact is that publishers are working on "XML
>first" (or "HTML first") workflows. O'Reilly's Atlas tool[1] means that
>authors prepare their documents in, essentially, HTML (well, a restricted
>profile thereof), and the output is then produced in EPUB, PDF, or pure
>HTML at the end. Companies are being created that do similar things, where
>small(er) publishers can develop full projects (Metrodigi, Inkling,
>Hachette, ...; but I do not think it is possible to use these for a big
>conference, although, who knows?). Importantly to this community, these
>tools also include annotation facilities, akin to MS Word's commenting
>Where does it take us _now_? Much against my instinct and with a bleeding
>heart I have to accept that conferences of the size of WWW, but even ISWC
>or ESWC, cannot reasonably ask their submitters to submit in ePub (or
>HTML). Yet. Not today. It is a chicken and egg problem, and change may
>come only with events, as well as more progressive scholarly publishers,
>experimenting with this. Just like Daniel (and Bernadette) I would love
>to see that happening for smaller workshops (if budget allows, I could
>imagine a workshop teaming up with, say, Metrodigi to produce the
>workshop's proceedings). But I am optimistic that the change will happen
>within a foreseeable time and our community (as any scholarly community,
>I believe) will have to prepare itself for a change in this area.
>Adding my 2¢ to Daniel's:-)
>P.S. For LaTeX users: I guess the main advantage of LaTeX is the math
>part. And this is the saddest story of all: MathML has been around for a
>long time, and it is, actually, part of ePUB as well, but authoring
>proper mathematics is the toughest with the tools out there. Sigh...
>P.S.2 B.t.w., W3C has just started work on Web Annotations. Watch that
>[1] https://atlas.oreilly.com
>[2] http://metrodigi.com
>[3] https://www.inkling.com
>On 04 Oct 2014, at 04:14 , Daniel Schwabe <dschwabe@inf.puc-rio.br> wrote:
>> As is often the case on the Internet, this discussion gives me a
>>terrible sense of déjà vu. We've had this discussion many times before.
>> Some years back the IW3C2 (the steering committee for the WWW
>>conference series, of which I am part) first tried to require HTML for
>>the WWW conference paper submissions, then was forced to make it
>>optional because authors simply refused to write in HTML, and eventually
>>dropped it because NO ONE (ok, very very few hardy souls) actually sent
>>in HTML submissions.
>> Our conclusion at the time was that the tools simply were not there,
>>and it was too much of a PITA for people to produce HTML instead of
>>using the text editors they are used to. Things don't seem to have
>>changed much since.
>> And this is simply looking at formatting the pages, never mind the
>>whole issue of actually producing hypertext (i.e., turning the article's
>>text into linked hypertext), beyond the easily automated ones (e.g.,
>>links to authors, references to papers, etc.). Producing good
>>hypertext, and consuming it, is much harder than writing plain text. And
>>most authors are not trained in producing this kind of content. Making
>>this actually "semantic" in some sense is still, in my view, a research
>>topic, not a routine reality.
>> Until we have robust tools that make it as easy for authors to write
>>papers with the advantages afforded by PDF, without its shortcomings, I
>>do not see this changing.
>> I would love to see experiments (e.g., certain workshops) to try it out
>>before making this a requirement for whole conferences.
>> Bernadette's suggestions are a good step in this direction, although I
>>suspect it is going to be harder than it looks (again, I'd love to be
>>proven wrong ;-)).
>> Just my personal 2c
>> Daniel
>> On Oct 3, 2014, at 12:50  - 03/10/14, Peter F. Patel-Schneider
>><pfpschneider@gmail.com> wrote:
>>> In my opinion PDF is currently the clear winner over HTML in both the
>>>ability to produce readable documents and the ability to display
>>>readable documents in the way that the author wants them to display.
>>>In the past I have tried various means to produce good-looking HTML and
>>>I've always gone back to a setup that produces PDF.  If a document is
>>>available in both HTML and PDF I almost always choose to view it in
>>>PDF.  This is the case even though I have particular preferences in how
>>>I view documents.
>>> If someone wants to change the format of conference submissions, then
>>>they are going to have to cater to the preferences of authors, like me,
>>>and reviewers, like me.  If someone wants to change the format of
>>>conference papers, then they are going to have to cater to the
>>>preferences of authors, like me, attendees, like me, and readers, like
>>> I'm all for *better* methods for preparing, submitting, reviewing, and
>>>publishing conference (and journal) papers.  So go ahead, create one.
>>>But just saying that HTML is better than PDF in some dimension, even if
>>>it were true, doesn't mean that HTML is better than PDF for this
>>> So I would say that the semantic web community is saying that there
>>>are better formats and tools for creating, reviewing, and publishing
>>>scientific papers than HTML and tools that create and view HTML.  If
>>>there weren't these better ways then an HTML-based solution might be
>>>tenable, but why use a worse solution when a better one is available?
>>> peter
>>> On 10/03/2014 08:02 AM, Phillip Lord wrote:
>>> [...]
>>>> As it stands, the only statement that the semantic web community are
>>>> making is that web formats are too poor for scientific usage.
>>> [...]
>>>> Phil
>> Daniel Schwabe                      Dept. de Informatica, PUC-Rio
>> Tel:+55-21-3527 1500 r. 4356        R. M. de S. Vicente, 225
>> Fax: +55-21-3527 1530               Rio de Janeiro, RJ 22453-900, Brasil
>> http://www.inf.puc-rio.br/~dschwabe
>Ivan Herman, W3C 
>Digital Publishing Activity Lead
>Home: http://www.w3.org/People/Ivan/
>mobile: +31-641044153
>GPG: 0x343F1A3D
>WebID: http://www.ivan-herman.net/foaf#me
Received on Sunday, 5 October 2014 14:48:11 UTC
