Fwd: html for scholarly communication: RASH, Scholarly HTML or Dokieli?

On Sun, Sep 10, 2017 at 12:04 AM, Sarven Capadisli <info@csarven.ca> wrote:

> On 2017-09-09 22:48, Johannes Wilm wrote:
> > The formats that focus on a limited tag-set and have been developed
> > already (RASH and Scholarly HTML) may have just about everything we
> > need already.
> It certainly does not, and that's part of the issue here.
> Scholarly HTML doesn't set that constraint. RASH has the following 32
> elements:
> a, blockquote, body, code, em, figcaption, figure, h1, head, html, img,
> li, link, math, meta, ol, p, pre, q, script, section, span, strong, sub,
> sup, svg, table, td, th, title, tr, ul

These look very similar to what we produce currently.
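
To make the comparison concrete, here is a rough sketch in Python of how one could check a document against the 32-element list quoted above. It is not taken from RASH, Scholarly HTML or Dokieli; the names RASH_ELEMENTS, ElementCollector and elements_outside_subset are made up for illustration. It also treats children of math and svg (for example path) like any other element, so a real checker would need to handle those separately.

```python
from html.parser import HTMLParser

# The 32 element names quoted in the message above.
RASH_ELEMENTS = frozenset("""
a blockquote body code em figcaption figure h1 head html img
li link math meta ol p pre q script section span strong sub
sup svg table td th title tr ul
""".split())


class ElementCollector(HTMLParser):
    """Records the name of every element that is opened in the document."""

    def __init__(self):
        super().__init__()
        self.seen = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        self.seen.add(tag)


def elements_outside_subset(markup, allowed=RASH_ELEMENTS):
    """Return a sorted list of element names used in markup but not allowed."""
    collector = ElementCollector()
    collector.feed(markup)
    collector.close()
    return sorted(collector.seen - allowed)


if __name__ == "__main__":
    sample = "<section><h1>Title</h1><p>Text</p><video src='a.mp4'></video></section>"
    print(elements_outside_subset(sample))
```

An empty result means the document stays inside the list; anything else names the elements a converter or editor would have to deal with.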

> Looking at that list, it seems predominantly a *print first* approach,
> not "Web first"! In 2015 it was about 25 elements, and that was
> certainly all one needed. So much for that.
> The last thing SH would want to respond to the scholarly community is
> something like "`video`? Sorry that's not allowed. Please align your
> perception of scholarly information on the Web with ours (circa 2017)."

How about a base format that can be used for presentation on the web and
also be converted to EPUB and print/PDF formats, plus a "multimedia
extension" that targets only the web? I think that under all circumstances
we need a format that can be converted to other output formats, such as
print or lightweight ebooks that cannot contain entire movies, and that
therefore comes without multimedia beyond static images/SVGs. Those use
cases continue to exist in the age of the web.
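
A minimal sketch of that split, in Python, might look like the following. Both element sets are hypothetical and deliberately short; nothing here has been agreed on, and the function name output_targets is made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical base profile: a handful of print-safe elements, for illustration.
BASE = frozenset({
    "html", "head", "title", "body", "section", "h1", "p",
    "figure", "figcaption", "img", "svg",
})
# Hypothetical web-only "multimedia extension".
MULTIMEDIA = frozenset({"video", "audio", "source", "track"})


class TagCollector(HTMLParser):
    """Collects the names of all elements opened in the document."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def output_targets(markup):
    """Return the output formats a document could be converted to."""
    parser = TagCollector()
    parser.feed(markup)
    parser.close()
    extra = parser.tags - BASE
    if not extra:
        return ["web", "epub", "print"]  # base profile only
    if extra <= MULTIMEDIA:
        return ["web"]  # uses the multimedia extension
    return []  # outside both profiles


if __name__ == "__main__":
    print(output_targets("<section><p>Text</p></section>"))
    print(output_targets("<section><video src='talk.webm'></video></section>"))
```

The point of the split is that a converter only has to inspect which profile a file claims, or actually uses, to know whether print and ebook output are possible.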

As for allowing arbitrary elements: I tried copying an article from the
Guardian into a Dokieli editor. Like any other JS editor, it did some
basic cleaning of the soup of incoming HTML: it stripped the CSS classes,
and a video on Twitter turned into just a black image. Still, it left a
lot of attributes that probably had a meaning for the JavaScript on the
Guardian webpage but made little sense on the Dokieli page. Multiple
stacked DIVs also meant that my caret moved around strangely, and just by
looking at the on-screen tools I could not find any way to, for example,
remove or merge some of these DIVs.

These are the kinds of problems one aims to remove by using a standardized
subset for scientific articles.
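
As an illustration of what a standardized subset buys you when pasting, here is a rough Python sketch of a paste cleaner. It is not how Dokieli or any other editor works; the allowed tags and attributes are an arbitrary small selection and all names are hypothetical. Unknown elements such as stacked DIVs are unwrapped (the tag is dropped, the content kept), unknown attributes are removed, and script/style content is discarded.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical subset, chosen only for this example.
ALLOWED_TAGS = frozenset({
    "p", "a", "em", "strong", "h1", "ul", "ol", "li", "blockquote", "img",
})
ALLOWED_ATTRS = {"a": {"href"}, "img": {"src", "alt"}}
VOID_TAGS = frozenset({"img"})
DROPPED_WITH_CONTENT = frozenset({"script", "style"})


class PasteCleaner(HTMLParser):
    """Rebuilds pasted HTML using only the allowed tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROPPED_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unwrap: drop the tag itself, keep its children
        keep = ALLOWED_ATTRS.get(tag, set())
        parts = []
        for name, value in attrs:
            if name in keep:
                safe = escape(value or "", quote=True)
                parts.append(f' {name}="{safe}"')
        self.out.append(f"<{tag}{''.join(parts)}>")

    def handle_endtag(self, tag):
        if tag in DROPPED_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth:
            return
        if tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean_paste(markup):
    """Return markup reduced to the allowed subset."""
    cleaner = PasteCleaner()
    cleaner.feed(markup)
    cleaner.close()
    return "".join(cleaner.out)


if __name__ == "__main__":
    pasted = ('<div class="a"><div data-x="1"><p data-component="text">'
              'Hi <a href="/x" onclick="f()">there</a></p></div></div>')
    print(clean_paste(pasted))
```

With a fixed subset the editor never has to guess what an unknown DIV or data attribute was for, which is exactly where the caret problems came from.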

> That exact line of reasoning holds true for any given element or
> arbitrary constraint on top of the *living* HTML spec.
> Again, authors will want to do things beyond what SH could possibly
> capture, or the CG can plan for. Plenty of skills in this CG, but let's
> not forget that we are only a vocal minority. I suggest that we do not
> prematurely think we got scholarly information covered by way of x
> elements or whatever.

One will never cover everything. But one could cover a large number of
use cases and provide a set of common tools to deal with files in that
format. For those who don't want to use it and don't need the tools --
great, HTML5 is always there.

On Sun, Sep 10, 2017 at 12:50 AM, Silvio Peroni <silvio.peroni@unibo.it>
wrote:

> Hi Johannes,
> It would be different if it was a standard backed by a standardization
> organization and extensively discussed between different parties. As Robin
> pointed out, a lot of choices will be arbitrary, and the reasoning behind
> everything is not always immediately visible. So had this been a standard
> coming out of such a process, most would likely follow it anyway, no matter
> whether they agree with the logic, disagree with it, or do not care.
> That’s why we are all here.

That is what I thought, and I am glad you guys are willing to contribute
all the experience you have already gathered with RASH.

Johannes Wilm
Fidus Writer

Received on Sunday, 10 September 2017 07:32:09 UTC