Re: html for scholarly communication: RASH, Scholarly HTML or Dokieli?

Hey,

not limiting HTML within an editor sounds fascinating.

For many here this will be obvious, but the main reason why editors like
Fidus Writer have so far not liked the idea of arbitrary HTML is that
programming an editor that can deal with any HTML/CSS combination seems
close to impossible. There are so many different ways to express the same
visual output, and we cannot be sure that our editor works with any
particular combination until we have tried out what the browser supports
natively and then added a lot of our own code to work around the bugs we
find. For example, a few years ago I spent a lot of time trying to get the
caret to move around inline canvas elements. Inline canvas elements may
not be terribly common, but they do exist, and we used them for something
at the time. We have since given up on that and only use more common
elements (and we have switched to a library rather than trying to tackle
the problem of caret movement ourselves). But still, we continue to have a
lot of code that tries to standardize HTML. This starts with simple tasks
such as handling paste data coming from Google Docs or Microsoft Word: all
the import filters I have seen so far attempt to find as much useful
information in the HTML as possible, and then throw the rest away.
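
To make the "keep what is useful, throw the rest away" idea concrete,
here is a minimal sketch of such a paste filter in Python, using only the
standard library. It is not Fidus Writer's actual code. The allowlists and
the names PasteFilter and clean_paste are made up for illustration; a real
editor would derive them from its own document schema.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlists, chosen only for this illustration.
ALLOWED_TAGS = {"p", "em", "strong", "a", "h1", "h2", "ul", "ol", "li"}
ALLOWED_ATTRS = {"a": {"href"}}
# Elements whose text content is dropped along with the tags.
DROP_CONTENT = {"style", "script"}


class PasteFilter(HTMLParser):
    """Keeps allowlisted tags/attributes and text; drops everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown wrapper: drop the tag, keep its text
        allowed = ALLOWED_ATTRS.get(tag, set())
        kept = "".join(
            f' {name}="{escape(value or "", quote=True)}"'
            for name, value in attrs
            if name in allowed
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def clean_paste(html: str) -> str:
    """Return the pasted HTML reduced to the allowlisted subset."""
    parser = PasteFilter()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Fed a Word-style fragment such as
<p class="MsoNormal"><span style="font-weight:700">Hello</span></p>,
this keeps the paragraph and its text but loses the span and all styling,
including the bold that was expressed only through CSS. That loss is
exactly the problem described above: the same visual output can be
expressed in many ways, and a filter only understands the ones it was
written for.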

Of course, we can try to write an editor that is as forgiving as possible,
but without a limit on what HTML/CSS we allow, I don't think we can really
"guarantee" to any of our users that they can use the output with the next
tool in the pipeline.

It would be really interesting to hear how you guys have overcome this
issue.

On Tue, Oct 17, 2017 at 1:11 PM, Stian Soiland-Reyes <
soiland-reyes@manchester.ac.uk> wrote:

> This looks really good, Sebastien! I agree that using structured
> RDFa/JSON-LD in free HTML is much preferable to trying to limit ourselves
> to a subset of HTML; as we see in this thread, it is hard to reach
> agreement without also limiting future publication styles. We should not
> be aiming to replicate 1960s-style computer science papers with the odd
> hyperlink as the only enhancement.
>
> I like how you have given full yet clear examples for each concept,
> and reused JSON-LD and schema.org. This should be quite compatible with
> the effort of http://bioschemas.org/, which has a lot of traction in the
> biology/bioinformatics community (though many of their standards apply to
> academics in general). Perhaps Publication could be added there based on
> your effort and then propagate into schema.org? I recommend getting in
> touch; see http://bioschemas.org/howtojoin/
>
> I think the science.ai approach has a lot of overlap not just with
> Scholarly HTML, but also with our work on http://www.researchobject.org/,
> in particular our Research Object Bundle https://w3id.org/bundle/, which
> also has a JSON-LD-based manifest:
> https://researchobject.github.io/specifications/bundle/#manifest
> There we didn’t attempt to “deconstruct” the publication, but focused
> more on the supporting data and software sources that go along with the
> black-box publication in the RO. Combining this with your approach would
> allow embedding rich structured metadata that can then easily be
> extracted (say, into separate annotations) using off-the-shelf
> RDFa/JSON-LD tools.
>
> There’s also concurrent work such as eLife’s Reproducible Document Stack:
> https://elifesciences.org/labs/7dbeb390/reproducible-document-stack-supporting-the-next-generation-research-article
> Although that works with JATS XML as the base format, it has similar
> archiving considerations, and I’ve been pushing for them to add some kind
> of Scholarly HTML as an embedded format.
>
> One challenge, as usual, is how to squeeze the structured metadata out of
> the authors. eLife are working on interactive editors for this. A similar
> HTML-based approach is of course the previously mentioned
> https://dokie.li, whose WYSIWYG editor allows you to add microdata
> anywhere (as well as generating structural microdata for paragraphs etc.).
>
> Side-note for manifest people:
>
> I see in https://nightly.science.ai/documentation/archive#graph-content
> that you have quite a minimal manifest (good!) as a @graph, but without
> relating the contained resources to the (implied) aggregation. This can
> make it hard to understand what is part of the aggregation (e.g. what you
> directly list under @graph) and what is just a sub-resource (like your
> DataDownload example). Is there a reason why you didn’t use a property to
> list these? We reused OAI-ORE ore:aggregates for this purpose (mapped
> through our JSON-LD context). I think your archive is also in effect
> making an ore:Aggregation or even an ro:ResearchObject, so perhaps reusing
> those would be beneficial.
>
> Happy to set up a call if you would like to discuss further!
>
> --
> Stian Soiland-Reyes, eScience Lab
> School of Computer Science, The University of Manchester
> http://orcid.org/0000-0001-9842-9718
>
> *From: *sebastien <sebastien.ballesteros@gmail.com>
> *Sent: *16 October 2017 10:15
> *To: *public-scholarlyhtml@w3.org
> *Subject: *Re: html for scholarly communication: RASH, Scholarly HTML or
> Dokieli?
>
>
> Hello,
>
> A quick update on the science.ai documentation effort.
>
> As Robin mentioned we have been iterating quite a lot on scholarly
> HTML internally. What we learned along the way (working with several
> established players in the field) is that trying to standardize or
> define constraints at the HTML level is somewhat too constraining (we
> are planning to provide more context on that soon).
>
> In our case, agreeing on a vocabulary and using RDFa and / or JSON-LD
> to express it (without additional constraints) has proven to be more
> productive.  For us, schema.org (and the process in place to extend
> it) provides enough basis to make that work. For that reason we are
> now mostly focused on exposing and documenting schema.org patterns
> that are useful in the context of scholarly publishing.
>
> I will post an updated link when our documentation hits our production
> website but in the meantime feel free to check out
> https://nightly.science.ai/documentation/archive if you are curious
> about what we have been doing since the days of
> http://scholarly.vernacular.io/.  If you look, don't pay too much
> attention to the archive stuff, but the JSON-LD / RDFa examples should
> provide a good idea of the schema.org patterns that we have found
> useful in the context of scholarly publishing.
>
> Sebastien
>
>
>


-- 
Johannes Wilm
http://www.johanneswilm.org
tel: +1 (520) 399 8880

Received on Thursday, 19 October 2017 11:27:09 UTC