
RE: html for scholarly communication: RASH, Scholarly HTML or Dokieli?

From: Gareth Oakes <goakes@gpsl.co>
Date: Sun, 10 Sep 2017 11:20:52 +0000
To: Scholarly HTML community group <public-scholarlyhtml@w3.org>
Message-ID: <DM3PR17MB08892CE0E01862E32536D762C46B0@DM3PR17MB0889.namprd17.prod.outlook.com>
Hi all,

Apologies for the intrusion on this fascinating discussion. I’d almost forgotten I’d subscribed to this list :) I just wanted to make some observations in the hope they drive the conversation forwards.

My perspective is that of a technology specialist/consultant who deals with many publishers, with a long history in scholarly publishing (in particular, STM books/journals). I’m historically an XML guy.

I will quote from Johannes, Peter, Sarven, and Silvio below as they raise the points I feel are salient.

PMR> So what do we want SH-CG for?
This is the crux of it. A “Scholarly HTML” format can mean so many things to so many people. What is the essence of what is to be achieved, and how can that be distilled into a mission statement and direction? I don’t have an answer. Peter’s view:
PMR> My requirement is simple to state. There are (my own figures from analysing CrossRef over several months) ca 7000 "articles" published a day from 500 publishers and "most" have "HTML". So what I want is to be able to read this into my machine without having to have 200+ formats and 200+ semantics.
That is valid. My own view is that this all ultimately has to be tied to the researchers who are the primary participants in this industry. Requirements should be stated from their point of view, IMHO, covering their roles as authors, reviewers, readers, etc. E.g. “As a Researcher, I would like a convenient and simple way of storing research notes, results, and data and sharing this with others”. “As a Researcher, I strive to produce and communicate novel, reproducible science.”

I digress. My point is that a range of valid views exist, and how are these to be accounted for? The final nuance on all of this is that the effort will necessarily be a hobby unless commercial interests are considered.

JW> … like Zotero and Mendeley …

These comments resonate with my view from the Researcher’s perspective. Any successful initiative should “seamlessly” integrate into the world they live in. Reference managers are often a cornerstone of that world. Other random things that spring to mind: ORCID, FundRef, MS Word, TeX/LaTeX, ChemDraw, etc.

SP> … plausible standard HTML format for describing scholarly articles …
SP> … have this SH-CG sorted out in a way that all the choices done are convincing and justified …

Fully agree with these statements, but describing for whom, and for what purpose? The needs related to research, authoring, review, publishing, and reuse/referencing of research are sometimes at odds with each other.

I’m being too general, but I feel that a solid answer or consensus on the initial question (“what do we want SH-CG for?”) could help drive the overall and technical direction of a solution.

JW> … in-text citations rather than footnotes …

A great example of a perennial problem area in markup of research articles. The simplest solution is to ignore references, or tag only key pieces, but what is the correct approach? (Depends on the answer to “what do we want SH-CG for?”)

There are many other problem areas, some of which matter more in certain fields than others. E.g. markup of complex equations is not usually a problem in Humanities papers, but can pose serious challenges for Chemistry or Physics papers. Geographical papers would want special consideration for 2D/3D map data. Chemistry papers ideally need a way of tagging reactions or even chemical names. And so on.

SC> How do you make the distinction between a) single element b) "32 elements" (or "25 elements", circa 2015), c) any number of applicable elements at the discretion of the author since they precisely know what to encapsulate?

I’ve considered this a number of times in the past and my personal view is that a layered solution is needed. If you are familiar with DITA, there is an analogy with base topics and specializations. I landed on the idea of a generic “Scholarly HTML” format that has extensions specific to each field of research or branch of science. I haven’t thought past that point.
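
To make the layering idea a little more concrete, here is a minimal sketch in Python. It assumes a shared base vocabulary plus optional per-field extension sets. Every name in it (the role names, `BASE_ROLES`, `FIELD_EXTENSIONS`, `allowed_roles`, `unknown_roles`) is hypothetical and chosen only for illustration; none of it comes from an SH-CG draft, RASH, or DITA.

```python
# Hypothetical base vocabulary shared by every field (illustrative names only).
BASE_ROLES = frozenset(
    {"article", "abstract", "section", "figure", "citation", "reference"}
)

# Hypothetical field-specific extensions layered on top of the base.
FIELD_EXTENSIONS = {
    "chemistry": frozenset({"reaction", "chemical-name"}),
    "geography": frozenset({"map-data"}),
    "physics": frozenset({"equation"}),
}


def allowed_roles(fields=()):
    """Return the base vocabulary plus the extensions for the given fields."""
    roles = set(BASE_ROLES)
    for field in fields:
        roles |= FIELD_EXTENSIONS[field]
    return roles


def unknown_roles(used_roles, fields=()):
    """Return, sorted, the roles a document uses that its declared layers do not define."""
    return sorted(set(used_roles) - allowed_roles(fields))
```

Under these assumptions, a document that declares no field can use only the base vocabulary, while one that declares "chemistry" may additionally tag reactions and chemical names. A generic tool could still process the base layer of either document and ignore the rest.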

Gareth Oakes

From: Silvio Peroni [mailto:silvio.peroni@unibo.it]
Sent: Sunday, 10 September 2017 8:50 AM
To: Johannes Wilm <johanneswilm@gmail.com>
Cc: Scholarly HTML community group <public-scholarlyhtml@w3.org>
Subject: Re: html for scholarly communication: RASH, Scholarly HTML or Dokieli?

Hi Johannes,

It would be different if it were a standard backed by a standardization organization and extensively discussed between different parties. As Robin pointed out, a lot of choices will be arbitrary, and the reasoning behind everything is not always immediately visible. So had this been a standard coming out of such a process, most would likely follow it anyway, regardless of whether they agree with the logic, disagree with it, or do not care.

That’s why we are all here. I’m not part of this CG to say that we all need to use RASH. Quite the opposite. RASH is my bag of experience, but it is not a standard in the true sense of the term; it is a format within a project, if you prefer, that brings with it additional tools for visualising / converting it.

In case it is not clear, what I would like to reach as part of this CG is a plausible standard HTML format for describing scholarly articles, which should be developed by hearing the opinions of multiple actors and deciding democratically as a community. This doesn’t mean RASH or any other existing HTML format, as far as I can see from the discussion. It will be a new “spec” that will surely inherit several parts from several existing HTML-based formats / patterns.

To me, the goal of this CG is to have this SH-CG sorted out in a way that all the choices done are convincing and justified. Then we will talk about extending existing tools according to this new “spec”, including the RASH Framework, as far as I can say.

A side note: I used the word “spec”, but SH-CG would not be a real W3C Recommendation, rather a CG outcome.

Have a nice day :-)


Silvio Peroni, Ph.D.
Department of Computer Science and Engineering
University of Bologna, Bologna (Italy)
Tel: +39 051 2095393
E-mail: silvio.peroni@unibo.it
Web: https://www.unibo.it/sitoweb/silvio.peroni/en
Twitter: essepuntato

Received on Sunday, 10 September 2017 11:21:20 UTC
