
Re: html for scholarly communication: RASH, Scholarly HTML or Dokieli?

From: Johannes Wilm <johanneswilm@gmail.com>
Date: Sat, 9 Sep 2017 22:48:36 +0200
Message-ID: <CABkgm-TarxPvzuQW5cCorZus=aSW9O-z3mMd8GPturF8tThdaQ@mail.gmail.com>
To: Sarven Capadisli <info@csarven.ca>
Cc: Scholarly HTML community group <public-scholarlyhtml@w3.org>
The problem with having full HTML5 is that there are way too many tags,
attributes, CSS ways of doing things and other ways of adding meaning
that may seem clear to a human reader but are very difficult for a
computer to understand. Anyone who wants to reuse the data has to fish
in a soup of HTML to try to find out what is going on.

All the wysiwyg-ish editors I have come across had to accept only a small
number of tag/attribute combinations for the entire thing to be viable.
Apparently Dokieli is able to handle any kind of HTML, but I wouldn't be
able to create an editor that is able to do that, and I cannot see how that
HTML can be converted into other formats without more human intervention.
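
A minimal sketch of what "only accept a small number of tag/attribute
combinations" could look like in practice. The allowlist below is made up
for illustration; it is not the actual tag set of Fidus Writer, RASH or
Scholarly HTML, and the names ALLOWED, AllowlistChecker and check are
hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical allowlist: tag name -> set of permitted attribute names.
ALLOWED = {
    "section": {"id"},
    "h1": set(),
    "p": set(),
    "em": set(),
    "strong": set(),
    "a": {"href"},
    "figure": {"id"},
    "img": {"src", "alt"},
    "figcaption": set(),
}


class AllowlistChecker(HTMLParser):
    """Collects every tag or attribute that falls outside ALLOWED."""

    def __init__(self):
        super().__init__()
        self.violations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag not in ALLOWED:
            self.violations.append(f"tag <{tag}> not allowed")
            return
        for name, _value in attrs:
            if name not in ALLOWED[tag]:
                self.violations.append(
                    f"attribute {name!r} not allowed on <{tag}>"
                )


def check(html):
    """Return a list of violations; an empty list means the input conforms."""
    checker = AllowlistChecker()
    checker.feed(html)
    checker.close()
    return checker.violations


if __name__ == "__main__":
    for problem in check('<div class="x"><p style="color:red">text</p></div>'):
        print(problem)
```

A converter to other formats then only needs one rule per entry in the
allowlist, which is the point of keeping the set small.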

I really don't think creating a common format is impossible. Over the past
two years we have worked with mainly IT-focused scientists on Fidus Writer.
Even though FW was made with social sciences/humanities in mind, the
IT-academics had very few problems writing their papers on it, and several
of them have been published.

Yes, there were occasional minor "cultural" issues (such as IT-people not
getting why one would reference several works in the same citation), and of
course there are always more things that would be nice to have, but we're
far from needing 500 new tags just to start.

I don't know this, but maybe part of the reason it is easier now to find a
solution than 30 years ago is that publishers are more willing to make
compromises in terms of layout as they are no longer just dealing with
print and they are facing academics shopping around for lower APCs?

I think the proposed way forward of creating something that generally works
for the sciences and then allowing further specification for very
subject-specific things via RDFa or the like sounds good.

The formats that focus on a limited tag set and have already been developed
(RASH and Scholarly HTML) may have just about everything we need.
The reason why it would seem to me to be a good thing to look at combining
efforts is that RASH seems so far to have been developed by just one
organization (and some not entirely intuitive decisions have been made,
for example banning h2-h6 and adding author information to the head and not
the body), while Scholarly HTML sounds like it's lacking the manpower and
openly available tooling around it.

It sounds to me like Dokieli is heading in a different direction, where it's
less important to reuse the text in other applications and the web is the
last place where the document is landing. That is probably working for a
lot of people, but not for everyone.

We already have students working on the dokieli export filter, and we'll
support them with that, but in addition I think it would be good to have an
exporter to a limited HTML that can be processed and transformed into other
formats. That's what I hope comes out of this process.

On 9 Sep 2017 9:45 pm, "Sarven Capadisli" <info@csarven.ca> wrote:

On 2017-09-09 21:08, Bruce Miller wrote:
> It would seem more productive to focus on the metadata magic
> (RDFa, Aria,...) to indicate the data's meaning & purpose,
> rather than to try to make HTML actually semantic after-the-fact.

I agree completely.

It is precisely the approach that dokieli takes with RDF(a) meanwhile
making the best of what HTML5 has to offer.

Received on Saturday, 9 September 2017 20:49:02 UTC
