Re: Some Design Principles

Hi Ivan, Robin, all,

> […] at least in 2015, most of those journals are still bound by formatting rules that are, though antiquated, nevertheless prevalent (the ACM or Springer formats are probably the best known examples in computer science or mathematics). This means that any environment based on SH can be successful only if it is possible to produce, through some clever software, HTML *as well as* PDF formats that abide to those rules. Similarly, authors still use Microsoft Word, mostly, to author their articles and tools must exist to convert those into SH. 

I totally agree with Ivan here.

> […] But I think keeping this in our back of our mind all the time is important. Silvio's RASH format is a good example for such a full(er) environment, but I let him comment on the details.

RASH is indeed just the core building block (i.e., the HTML interchange format, let’s say) of a larger framework, which mainly includes tools for converting from existing formats (those typically used in the publishing workflow) into RASH, and from RASH back into those formats. In our view, this was needed to make RASH acceptable as an HTML-based format for scholarly papers to the various actors in the publishing process, mainly authors and publishers (but also reviewers, software agents, and the like).

The main motivation behind this is that you cannot ask the actors involved to give up their usual ways of producing and processing scholarly papers and force them to use an SH-ish format instead. As Robin already highlighted, word processors (mainly Microsoft Word) are the main tools authors use for writing papers (with some exceptions, of course). Publishers (at least those I usually talk with) have pipelines based on LaTeX and/or XML sources. Thus, the only way to convince them to adopt HTML-based formats for scholarly submissions is to help them in their job by proposing approaches that work well with already-established pipelines.

To this end, we have put effort into developing mechanisms (mainly based on XSLT documents) for automatising such conversions from and into RASH (for instance, see http://dasplab.cs.unibo.it/rocs for ODT-to-RASH and RASH-to-LaTeX conversions; we are currently working on a Word-to-RASH converter). These conversion mechanisms, I believe, have been the real reason we were able to concretely experiment with the use of RASH in scholarly venues – for instance, see the SAVE-SD workshop (http://cs.unibo.it/save-sd/2016/index.html, held during WWW) and the coming ESWC 2016 Conference (http://2016.eswc-conferences.org/call-papers).
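
To give a rough idea of what a rule-based conversion of this kind looks like, here is a minimal Python sketch. It is not the actual XSLT used for RASH-to-LaTeX: the element-to-LaTeX mapping, the function name `to_latex`, and the sample markup are all assumptions made up for illustration, and LaTeX special characters are not escaped.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from a few HTML elements to LaTeX wrappers.
# A real converter covers far more elements and escapes special characters.
WRAPPERS = {
    "h1": ("\\section{", "}\n"),
    "p": ("", "\n\n"),
    "em": ("\\emph{", "}"),
    "strong": ("\\textbf{", "}"),
    "code": ("\\texttt{", "}"),
}


def to_latex(node):
    """Recursively render an element, loosely like one XSLT template per tag."""
    before, after = WRAPPERS.get(node.tag, ("", ""))
    parts = [before, node.text or ""]
    for child in node:
        parts.append(to_latex(child))
        # Text that follows the child inside the same parent element.
        parts.append(child.tail or "")
    parts.append(after)
    return "".join(parts)


# Made-up, well-formed input; unknown tags such as <section> pass through unwrapped.
sample = "<section><h1>Intro</h1><p>HTML is <em>enough</em>.</p></section>"
latex = to_latex(ET.fromstring(sample))
print(latex)
```

The point of the sketch is only that a small, regular HTML vocabulary keeps such a mapping short, which is what makes conversions into existing pipelines feasible.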

The idea is that having SH is not enough for it to be used and quickly integrated into existing pipelines. We should also work, in some way, on proposing tools (e.g., XSLT and CSS stylesheets) that simplify its adoption in a very broad context and according to the needs of all the actors involved. Of course, developing such tools is easier if SH is (a) expressive enough to model what we consider important, (b) simple and (c) clear – as PeterMR suggested in https://lists.w3.org/Archives/Public/public-scholarlyhtml/2015Dec/0017.html.

> (A good example from my own experience: I am part of the steering committee for the WWW201X conferences. Our 'proceedings' has also been published by the ACM, in their digital library, for many years, although we also maintain the proceedings, free of charge for everybody, on the Web. We would like to have a purely HTML based proceedings eventually, and we are seriously considering doing that for WWW2017. But it is clear for us that having a copy of the articles in the ACM DL is important for our constituency.)

On this point, I think that the Semantic Web community is very keen to experiment with SH-ish formats, given the recent discussions on HTML submissions related to the two main conferences of the community, i.e., ESWC and ISWC.

> 3. The issue of archiving came up. I think we should also seriously consider, from the start, that an SH, more exactly the SH plus the surrounding information, should also be storable, for offline usage *and* archiving, in EPUB of some version. Doing that means we can provide a proper offline usage for the paper (and forget about PDF in that respect) as well as ensure a certain level of defense against link-rot.

This is something we are currently studying for RASH, and I hope to share the results of this experimentation with all of you soon.

> EPUB 3 definitely has some restrictions on what can or cannot be included and what format can be used; we should know that. Note also that the W3C DPUB IG is working on a more general vision, a *draft* called Portable Web Publications[2] that may be the right environment to consider in the future. Again: we should keep that approach for archiving in mind…

I totally agree with this vision, and thanks for sharing [2].

Have a nice day :-)

S.

----------------------------------------------------------------------------
Silvio Peroni, Ph.D.
Department of Computer Science and Engineering
University of Bologna, Bologna (Italy)
Tel: +39 051 2094871
E-mail: silvio.peroni@unibo.it
Web: http://www.essepuntato.it
Twitter: essepuntato

Received on Wednesday, 2 December 2015 10:35:45 UTC