Re: html for scholarly communication: RASH, Scholarly HTML or Dokieli? from Sarven Capadisli on 2017-09-10 (public-scholarlyhtml@w3.org from September 2017)

From: Sarven Capadisli <info@csarven.ca>
Date: Sun, 10 Sep 2017 11:35:26 +0200
To: public-scholarlyhtml@w3.org
Message-ID: <28257375-74f8-9783-dfed-1ee7011fc72f@csarven.ca>
On 2017-09-10 09:49, Ivan Herman wrote:
> I am afraid we are engaging in some sort of theoretical discussion here
> which will never end: do we want to use all of HTML5 or do we want
> to define a smaller structure by restricting to a subset of HTML5? I
> would think that we would be a bit ahead of this after the experiment
> Benjamin proposed: let us take a few real articles from various fields
> and see how they score with RASH and SH; it will become easier to
> have an idea.
> 
> Maybe one more step would be, for each of those to also see how easy it
> would be for some of these articles to be formatted via CSS (or maybe
> CSS+JavaScript) to the formats that are in use (ACM, IEEE, etc). I am
> particularly worried about the incredible differences in article
> reference formats out there, and how could one author a paper so that
> the content could be adapted to any existing requirements (there is a
> reason why BibTeX is a separate engine to LaTeX…)
Hi Ivan,

Have you come across dokieli?

Allow me to introduce it to you in the context of CSS. First, see some of
dokieli's HTML patterns:

https://dokie.li/docs#html-patterns

It is used for the following different kinds of documents, all with
different primary stylesheets (including print). Alternative stylesheets
can be triggered from the dokieli menu or through supporting
user-agents. There is *no* JavaScript requirement for the user-agent to
get a hold of the "data".
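As a rough illustration of the stylesheet point, here is a minimal Python sketch that lists the stylesheets a static HTML head declares, without running any JavaScript. The `SAMPLE_HEAD` markup and its file names are hypothetical and are not taken from dokieli; the classification into persistent, preferred and alternate follows the usual HTML convention for `rel` and `title` on `link` elements.

```python
from html.parser import HTMLParser

# Hypothetical document head; file names and titles are illustrative only.
SAMPLE_HEAD = """
<head>
  <link rel="stylesheet" href="base.css">
  <link rel="stylesheet" href="article.css" title="Article">
  <link rel="alternate stylesheet" href="acm.css" title="ACM">
  <link rel="alternate stylesheet" href="lncs.css" title="LNCS">
</head>
"""


class StylesheetLister(HTMLParser):
    """Collects (kind, title, href) for every stylesheet link seen."""

    def __init__(self):
        super().__init__()
        self.sheets = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel = (a.get("rel") or "").lower().split()
        if "stylesheet" not in rel:
            return
        title = a.get("title")
        if "alternate" in rel and title:
            kind = "alternate"    # offered as a choice, off by default
        elif title:
            kind = "preferred"    # titled, applied by default
        else:
            kind = "persistent"   # untitled, always applied
        self.sheets.append((kind, title, a.get("href")))


def list_stylesheets(html):
    parser = StylesheetLister()
    parser.feed(html)
    parser.close()
    return parser.sheets


for kind, title, href in list_stylesheets(SAMPLE_HEAD):
    print(kind, title, href)
```

A user-agent or a menu can offer the titled entries as switchable views of the same markup, which is the mechanism the paragraph above relies on.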

Articles:
* http://csarven.ca/dokieli-rww (scholarly article with dynamic annotations)
* http://csarven.ca/cooling-down-web-science (a pretty blog post)
* https://dokie.li/ (a webpage)
* https://linkedresearch.org/ (another webpage)
* https://www.w3.org/TR/ldn/ (a W3C specification)
* https://rhiaro.github.io/thesis/chapter1 (a thesis chapter)
* http://ceur-ws.org/Vol-1549/ (a workshop proceeding)
* http://semstats.org/2016/call-for-contributions (call for "papers")
* https://dokie.li/acm-sigproc-sp (ACM Authoring guidelines)
* https://dokie.li/lncs-splnproc (Springer/LNCS)
* https://data.gov.ie/strategy (Ireland's open data strategy)

Annotations:
* As you well know, the examples in https://www.w3.org/TR/annotation-html/
are derived from dokieli's patterns.

Notifications:
* e.g.
https://linkedresearch.org/annotation/csarven.ca/dokieli-rww/b6738766-3ce5-4054-96a9-ced7f05b439f

Plenty more at:

https://github.com/linkeddata/dokieli/wiki#examples-in-the-wild

with different scholarly articles containing various scholarly information.

Happy to report that we've covered a wider range of "scholarly
information" than anything else on the table here. If that's an
incorrect assumption, people can come forward with URLs to existing work.

So, are the HTML patterns documented flexible enough to handle different
cases, including scholarly information? The evidence suggests that they
are. Happy to improve where necessary, as always. They are not
bulletproof. The patterns have come to a point (certainly not the end) where a
range of things can be expressed without arbitrary or artificial
constraints set. So, I'm having a hard time buying the argument for any
subsetting unless one has the intention for the information to work
*only* under certain 1) tools and 2) versions - let's face it, the
minute we draw the line on what's allowed and not allowed, that has to be
dealt with head-on.

For dokieli (in case /docs is a boring read, and I don't expect anyone
to read it anyway, ... at the risk of repeating myself):

* Information is human and machine-readable to the greatest extent possible.
* Consuming core information does not require JavaScript and gives the
lowest barrier for any consuming agent. Heck, try it out with links/lynx
and compare it with whatever is brought to the table on this mailing list.
* dokieli's intention is to allow expressing HTML as accurately as
possible (either by hand or as much as the UI allows), and put focus on
RDF(a) for data/information exchange.
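To make the "no JavaScript needed" point concrete, the following Python sketch pulls `property` values out of static markup using only the standard library. It is a simplification under stated assumptions: the `SAMPLE` fragment and its vocabulary terms are hypothetical rather than dokieli's actual patterns, and the code only collects `property` attributes with their text or `content` value. It is not a conformant RDFa processor.

```python
from html.parser import HTMLParser

# Hypothetical article fragment; attribute values are illustrative only.
SAMPLE = """
<article about="" typeof="schema:ScholarlyArticle">
  <h1 property="schema:name">An Example Article</h1>
  <p>By <span property="schema:creator">A. Author</span>.</p>
  <meta property="schema:datePublished" content="2017-09-10">
</article>
"""

# Elements that never have a closing tag.
VOID = {"meta", "link", "img", "br", "hr", "input"}


class PropertyCollector(HTMLParser):
    """Collects (property, value) pairs from static HTML."""

    def __init__(self):
        super().__init__()
        self.pairs = []   # finished (property, value) pairs
        self.open = []    # stack of [tag, property or None, text parts]

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("property")
        if tag in VOID:
            if prop:
                value = a.get("content") or a.get("href") or ""
                self.pairs.append((prop, value))
            return
        self.open.append([tag, prop, []])

    def handle_data(self, data):
        # Text belongs to every open element that carries a property.
        for entry in self.open:
            if entry[1]:
                entry[2].append(data)

    def handle_endtag(self, tag):
        if not any(entry[0] == tag for entry in self.open):
            return  # stray or void end tag; ignore it
        while self.open:
            name, prop, parts = self.open.pop()
            if prop:
                text = " ".join("".join(parts).split())
                self.pairs.append((prop, text))
            if name == tag:
                break


def extract_properties(html):
    parser = PropertyCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs


for prop, value in extract_properties(SAMPLE):
    print(prop, "=", value)
```

The same markup that a text browser renders for a human reader is what this script reads, so any agent that can fetch and parse HTML can get at the data.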

What do the alternative "formats" do beyond *only* working with the
frameworks they are capable of working within? They are not as human- and
machine-readable as they could be, that's for certain, in my opinion.

If you are all stuck on having a "formal" "format" or whatever, I'll
write a grammar for it and we can discuss that. How about that?

PS: I sincerely apologise for the repetition (and probably the tone),
but I feel that I'm probably not making my points clear enough. So, I
guess I'll back off the mailing list "for a bit" :)

Bon weekend,

-Sarven
http://csarven.ca/#i
Received on Sunday, 10 September 2017 09:35:53 UTC