Re: html for scholarly communication: RASH, Scholarly HTML or Dokieli? from Johannes Wilm on 2017-10-19 (public-scholarlyhtml@w3.org from October 2017)

From: Johannes Wilm <mail@johanneswilm.org>
Date: Thu, 19 Oct 2017 17:20:06 +0200
To: Sarven Capadisli <info@csarven.ca>
Cc: Scholarly HTML community group <public-scholarlyhtml@w3.org>
Message-ID: <CABkgm-S8_aLqFFXaEJ2T65mvHsZS4=XWiqKDX4WRg=FROCjBkw@mail.gmail.com>

On Thu, Oct 19, 2017 at 4:59 PM, Sarven Capadisli <info@csarven.ca> wrote:

> On 2017-10-19 07:13, Johannes Wilm wrote:
> > In the two cases mentioned here: Dokieli and Substance.io/eLife, Dokieli
> > seems to not filter the HTML (much?) so if I take arbitrary content for
> > example copying the guardian frontpage and pasting into Dokieli gives a
> > lot of garbage + margins I cannot control, etc. In the case of
> > Substance, it filters the HTML down to what that application can handle.
>
>
> You are pasting "garbage", so you are seeing "garbage". What's the use
> case for pasting "garbage"? dokieli is not intended to handle "garbage"
> pasting.
>

Sorry, this was not meant to say that Dokieli is garbage. When I pasted
the HTML from the Guardian frontpage, what ended up in the document was
content such as

"'); hiddenDoc.close(); })(); {"uid":1,"hostPeerName":"
https://www.theguardian.com
","initialGeometry":"{\"windowCoords_t\":0,\"windowCoords_r\":1920,\"windowCoords_b\":1053,\"windowCoords_l\":0,\"frameCoords_t\":3277,\"frameCoords_r\":1905,\"frameCoords_b\":3277,\"frameCoords_l\":0,\"styleZIndex\":\"auto\",\"allowedExpansion_t\":0,\"allowedExpansion_r\":0,\"allowedExpansion_b\":0,\"allowedExpansion_l\":0,\"xInView\":0,\"yInView\":0}"
"

(Basically code). This is what I tried to describe as "garbage". There is
nothing wrong with Dokieli, but like all the other editors we have seen so
far, it cannot handle arbitrary HTML smoothly without first filtering it
down to a restricted subset of HTML of some kind.
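
To make concrete what I mean by filtering down to a restricted subset, here
is a minimal sketch in Python using only the standard library. The tag list
and all names in it (ALLOWED_TAGS, AllowlistFilter, filter_html) are
assumptions made up for illustration; this is not how Substance, Dokieli or
any other editor mentioned here actually does it.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist; a real editor would define its own subset.
ALLOWED_TAGS = {"p", "h1", "h2", "h3", "ul", "ol", "li", "em", "strong"}
# Elements whose text is code or styling rather than document content.
DROP_CONTENT = {"script", "style"}


class AllowlistFilter(HTMLParser):
    """Keep allowed tags (without attributes) and plain text, drop the rest."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            # Attributes (style, class, ...) are discarded on purpose,
            # which is what removes the pasted margins.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.out.append(escape(data, quote=False))


def filter_html(source):
    """Return source reduced to the allowlisted subset."""
    parser = AllowlistFilter()
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


pasted = (
    '<div style="margin:40px"><h1 class="x">Title</h1>'
    "<script>hiddenDoc.close();</script><p>Text &amp; more</p></div>"
)
print(filter_html(pasted))
```

The unknown wrapper element and the inline style disappear, the script
content never reaches the document, and only the heading and the paragraph
survive. A filter like this does not repair mismatched nesting, so it is
only a starting point.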

If I try a less absurd paste, and instead just copy the contents of an
article and then paste it into Dokieli, the margins are all strange, and
the controls for headlines, etc. don't have any effect.

About 10 years ago, the filtering in the editors of CMSes like Joomla
was not very strong. For less technically inclined users who did not
know how to get rid of complex HTML structures, that meant that their
posts often looked strange, and if they pasted things with margins into
the editor they often had no way of adjusting them.

>
> > The conventional logic is that unless you clearly define what restricted
> > version of HTML you permit, you cannot really create an editor that is
> > able to handle it all. But it sounds like the science.ai
> > <http://science.ai> people have been able to go beyond this. Is that
> > correctly understood?
> The HTML(+RDFa) patterns in Scholarly HTML, dokieli, and science.ai are
> very similar. The focus is mostly on RDFa for data reuse/exchange, as
> opposed to HTML. The observed HTML patterns just happen to be best
> practices. The CSS and JavaScript try to make the best of what's
> available in their respective ways. This doesn't mean that this approach
> is infinitely flexible or flawless. It just means that the constraints
> and the handling are elsewhere.
>

Ok, so the question here is: how do we avoid the issues mentioned above if
we do not restrict the HTML? If each of our different tools has its own
idea about what tags to allow, and each creates slightly differently
structured markup, I cannot quite see how we are going to build integrated
systems in which a document can move from, say, an editing app to a
conversion app, to a printed version and to a commented online version.

>
> -Sarven
> http://csarven.ca/#i
>
>

-- 
Johannes Wilm
http://www.johanneswilm.org
tel: +1 (520) 399 8880

Received on Thursday, 19 October 2017 15:20:33 UTC