Re: Authoring versus Interchange from Johannes Wilm on 2015-12-02 (public-scholarlyhtml@w3.org from December 2015)

From: Johannes Wilm <johanneswilm@vivliostyle.com>
Date: Wed, 2 Dec 2015 18:55:07 +0100
To: Robin Berjon <robin@berjon.com>
Cc: Florian Rivoal <florian@rivoal.net>, public-scholarlyhtml@w3.org
Message-ID: <CABkgm-RhJx8eT3wEH6fPR3YQG39XZb9FDOSuV27huO_BmA9PJw@mail.gmail.com>

On Wed, Dec 2, 2015 at 4:05 PM, Robin Berjon <robin@berjon.com> wrote:

> On 02/12/2015 03:12 , Johannes Wilm wrote:
> > Could you think of an example of when this would happen? If you write
> > the document by hand, and it needs to be written with the same level of
> > specificity that is later needed for interchange, wouldn't it need to
> > be just as complex?
>
> In practice, no.
>
> I think that the big difference between an authoring and an interchange
> format is whether a transformation step is required before the content
> can be consumed by a relatively generic processor (say, something that
> understands HTML + DPUB ARIA + RDFa + schema.org) so as to obtain the
> same semantics.
>
> One example from our SH is the markup used for authors:
> http://scholarly.vernacular.io/#authors. The use of schema.org roles,
> the indirection for affiliations, and the high markup-to-text ratio don't
> make this friendly to type by any metric. If I expected this to be
> hand-authored, I would not wish this on anyone.
>
> The semantics are, however, correct even without knowing that this is
> SH. A general purpose crawler is able to look at that and fully
> understand what you're talking about without ever knowing that this is SH.
>
> By contrast, were I designing an authoring format for this I would have
> gone with something more like:
>
> <script type=json/authors>
> [
>   {
>     "given": "Robin",
>     "family": "Berjon",
>     "url": "http://berjon.com/",
>     "org": { "name": "SA", "url": "http://science.ai/" },
>     "corresponding": true
>   },
>   ...
> ]
> </script>
>
> And indeed: http://www.w3.org/respec/guide.html#editors-authors.
>
> That's a lot easier to remember; in fact I know I've used the ReSpec
> syntax for that a million times without having to go back to the docs.
> But to general-purpose processors, it is meaningless.
>

I see. Yes, if you use it a million times, you can probably remember most
things.

I still think that most authors who don't use it on a daily basis won't
remember it. When I last needed to update the respecConfig of some of the
editing TF documents, I think I spent about 2-3 hours trying to figure out
how to specify the links correctly: first by looking at the documentation,
then by looking at other spec documents. If I need to change them again, I
will probably have to go through that same process again. It doesn't help
that some W3C documents use Bikeshed and others ReSpec. :)

That things will be difficult to remember is a problem that comes with the
complexity, and no one will be able to make it go away.

However, I am fine with an "HTML + DPUB ARIA + RDFa + schema.org" based
solution if that means that we can get a larger number of tools developed
for this format.
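
To make the transformation step concrete, here is a minimal sketch in
Python of what such a tool would do: take the JSON authoring form quoted
above and emit HTML+RDFa with schema.org terms. The function names and the
element layout are assumptions of mine for illustration only. It is
deliberately simpler than the actual SH author markup (no roles, no
affiliation indirection), and it ignores the "corresponding" flag.

```python
import json
from html import escape


def _link(url: str, inner: str) -> str:
    """Wrap already-escaped markup in a link carrying the schema.org url property."""
    return f'<a property="url" href="{escape(url, quote=True)}">{inner}</a>'


def author_to_rdfa(author: dict) -> str:
    """Render one authoring-format entry as an HTML+RDFa list item.

    Hypothetical layout: one <li> per schema.org Person, with an optional
    nested Organization for the affiliation.
    """
    name = " ".join([
        f'<span property="givenName">{escape(author["given"])}</span>',
        f'<span property="familyName">{escape(author["family"])}</span>',
    ])
    html = _link(author["url"], name) if "url" in author else name

    org = author.get("org")
    if org:
        org_html = f'<span property="name">{escape(org["name"])}</span>'
        if "url" in org:
            org_html = _link(org["url"], org_html)
        html += f' (<span property="affiliation" typeof="Organization">{org_html}</span>)'

    # The "corresponding" flag is not mapped in this sketch.
    return f'<li property="author" typeof="Person">{html}</li>'


def authors_to_rdfa(source: str) -> str:
    """Convert the JSON author list into an ordered list annotated with RDFa."""
    authors = json.loads(source)
    items = "\n".join("  " + author_to_rdfa(a) for a in authors)
    return f'<ol vocab="http://schema.org/">\n{items}\n</ol>'


if __name__ == "__main__":
    example = """
    [
      {
        "given": "Robin",
        "family": "Berjon",
        "url": "http://berjon.com/",
        "org": { "name": "SA", "url": "http://science.ai/" },
        "corresponding": true
      }
    ]
    """
    print(authors_to_rdfa(example))
```

The input is something only a dedicated tool understands; the output is
what a general-purpose processor would get to see.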



> We get better interoperability by reusing what exists rather than
> reinventing more convenient syntaxes. That's what makes it an
> interchange format more than an authoring format.
>
> From the perspective of tools, it does not make a huge difference. The
> HTML+RDFa version is a little bit harder (seriously harder if you want
> to support round-tripping, but that's not required) but overall you can
> have the same form-based UI in which authors can enter a list of people
> defined by straightforward fields.
>
> > Some of the markup will have to be somewhat complex -- for example
> > citations that have text both before and after them and that need to be
> > able to specify something other than pages as the reference. Every few
> > months someone seems to try to invent a new dialect of Markdown for
> > academics to get away from the difficulty of writing LaTeX, but once
> > they run into citations they end up either not being able to support
> > most of the required features or defining something that is as complex
> > as LaTeX. So users who choose to write it by hand will have to look up
> > the variable names when using them.
>
> It is a law of wikitext syntaxes that they will grow in complexity until
> they have the full flexibility of HTML, only much, much uglier. The same
> applies to Markdown.
>

:) exactly.
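
For what it's worth, the citation case I mentioned needs roughly the
following pieces of data, whatever the syntax ends up looking like. This
is only a sketch in Python: the field names, the abbreviation table, the
output style and the "Doe 2015" key are all hypothetical and not taken
from any existing format.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical abbreviations for locator types other than pages.
LABEL_ABBREVIATIONS = {"page": "p.", "chapter": "ch.", "section": "sec."}


@dataclass
class Citation:
    key: str                       # identifies the cited work
    prefix: str = ""               # text before the reference, e.g. "see"
    suffix: str = ""               # text after the reference
    locator: Optional[str] = None  # where in the work, e.g. "3" or "12-14"
    locator_label: str = "page"    # what kind of thing the locator counts


def render(citation: Citation) -> str:
    """Format one citation in a simple parenthetical style."""
    reference = citation.key
    if citation.locator is not None:
        label = LABEL_ABBREVIATIONS.get(citation.locator_label, citation.locator_label)
        reference += f", {label} {citation.locator}"
    text = f"{citation.prefix} {reference}" if citation.prefix else reference
    if citation.suffix:
        text += f", {citation.suffix}"
    return f"({text})"


if __name__ == "__main__":
    print(render(Citation("Doe 2015", prefix="see", locator="3",
                          locator_label="chapter", suffix="for details")))
    # prints "(see Doe 2015, ch. 3, for details)"
```

Five fields for a single citation is already more than most lightweight
syntaxes can express without getting ugly.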


>
> --
> • Robin Berjon - http://berjon.com/ - @robinberjon
> • http://science.ai/ — intelligent science publishing
>
Received on Wednesday, 2 December 2015 17:55:44 UTC