Re: "Scholarly HTML" and science.ai

Bill – I have to disagree with you on a big premise that you have made here.

Accessibility of content is _NOT_ the same as having semantically well-structured content. You can have rich semantics that are inaccessible, and very accessible content without any semantic meaning. Sure, there are overlaps, but at some point they diverge, and following either one too far down its own path will take you away from the goal…

That said, I do agree with you that a “perfect” portable document would incorporate both aspects so that it can be consumed by both (all) humans and (all) machines.  But as you said, mandating that is not possible or reasonable.

Leonard

From: Bill McCoy <bmccoy@idpf.org>
Date: Tuesday, December 15, 2015 at 9:41 AM
To: Ivan Herman <ivan@w3.org>
Cc: Dave Cramer <Dave.Cramer@hbgusa.com>, Charles LaPierre <charlesl@benetech.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Subject: Re: "Scholarly HTML" and science.ai
Resent-From: <public-digipub-ig@w3.org>
Resent-Date: Tuesday, December 15, 2015 at 9:42 AM

Ivan,

Your perspective about minimizing normative requirements to only those that are technically absolutely necessary is of course totally valid and correct from a pure engineering perspective. But, taken to the extreme, this would imply that no "SHOULDs" should be in any specifications, because if something is only a SHOULD then ipso facto it's not necessary, it's really just stating a preference, and it can't be counted on from an IOP perspective. Nevertheless most specifications have many "SHOULDs". Of course, this is an evergreen debate in standards groups.

With publications, the big fly in the ointment is accessibility. We cannot mandate that all the things that make content fully accessible are "musts", because in some instances there may be good reasons for particular publications not to do them. Yet we need content to be maximally accessible, and a key value proposition for a next-generation portable document format is that it is more naturally accessible than a sequence of "frozen" page images. Accessibility is also a "mine canary" for content that is more reliably machine-processable, supporting a variety of other use cases (summarization, remixing,

In EPUB we have danced around this for a long time. There are many SHOULDs relating to accessibility and even some MUSTs that are arguably overly specific. For example, in EPUB 3.0 we made it a MUST that the Navigation Document's table of contents be complete and ordered per reading order. This was not technically necessary but was deemed a very critical piece of enabling accessibility.

In the big picture, arbitrary HTML+JS is legal in EPUB (as I think it should remain in PWP) but at the same time a tangled spaghetti of HTML+JS, that one can't do anything with except "execute" in a browser's VM, doesn't fully deliver on the expected value proposition of portable publications. Hence the need for additional (even if softer) normative requirements.

And for the in-process EPUB 3.1 update, the IDPF Board has asked the WG to be even more forceful about adding normative requirements to support the notion of an accessibility baseline, even though we know that not every valid EPUB publication will meet that baseline. That work is under way.

EPUB 3.x is in the process of being adopted as part of a number of accessibility mandates, so it would seem counter-productive for PWP to reverse course and be less promotive of accessibility.

So I recommend that this group look carefully at what's going on in EPUB 3.1, including the level of normative specification of accessibility/structure, and if anything there looks out of line with what folks think should be in a PWP, that should be elevated as a coordination issue. If that means we all have to work together to resolve the tension about how to strongly encourage accessibility (& the concomitant of well-structured content) while keeping the specs as pristine as possible, let's tackle it now rather than wait until EPUB 3.1 is baked in six months' time.

--Bill

On Tue, Dec 15, 2015 at 2:44 AM, Ivan Herman <ivan@w3.org> wrote:

On 14 Dec 2015, at 20:35, Bill McCoy <bmccoy@idpf.org> wrote:

"vernacular" may not be quite the right term but... while some of the proposed Scholarly HTML is arguably specific to that domain, much of it, such as the "hunks" stuff, seems to have nothing to do with scholarly-publishing-specific requirements. There also seems to be significant overlap with the Structural Semantics Profile [1] that has been developed by IDPF and the EDUPUB Alliance as part of the EPUB for Education initiative (aka EDUPUB) as well as the Content Structure section of the EDUPUB profile itself [2]. These have also been recognized as not necessarily learning-content-specific but valuable for any content that wants to be well-structured (in particular that can be assured to be accessible) so it's now in the process of being generalized as part of EPUB 3.1 [3]. The realization was that stuff that is really about making well-structured content - not about the vertical of education content - should really be part of EPUB itself (even if something is not normatively required to be legal EPUB it may be required to be certifiably accessible EPUB and thus we are thinking that for EPUB 3.1 such things should be SHOULDs where sensible). I don't see why it would make sense for PWP to reverse course on that.

I do not think PWP would 'reverse' that, but I am not sure it should address that.

At least… there is a difference, if one thinks in terms of standards, between normative and non-normative aspects. Providing a sound set of 'advice' on well-structured documents makes of course a lot of sense. I.e., having such a document is good, but what I do not see is why this would be normative. When defining something like PWP, I believe that we should be as open-ended as possible, and include normative requirements only when it is technically absolutely necessary.

Of course, 'profiles', or 'vernaculars' may be different because the idea is to address a specific community, a specific market.


So I hope that we can both harmonize any redundancy between the new Scholarly HTML initiative and work that's gone on as part of EDUPUB, as well as appropriately pull out broadly useful features from both efforts into base specs, in the interests of maximizing accessibility and interoperability and minimizing bikeshedding.


Sure. But, for the time being, the Scholarly HTML work is 'just' a community group, i.e., it is not a formal standardization activity. EDUPUB is much more formal than that. It is certainly a good idea, though, to draw the Scholarly HTML group's attention to the EDUPUB document; I will do that.

Ivan



--Bill

[1] http://www.idpf.org/epub/profiles/edu/structure/

[2] http://www.idpf.org/epub/profiles/edu/spec/#h.selsibtnscc8

[3] http://www.idpf.org/workplans/2015/epub/



On Mon, Dec 14, 2015 at 8:31 AM, Ivan Herman <ivan@w3.org> wrote:

On 14 Dec 2015, at 17:01, Cramer, Dave <Dave.Cramer@hbgusa.com> wrote:

On Dec 14, 2015, at 10:50 AM, Ivan Herman <ivan@w3.org> wrote:

On 14 Dec 2015, at 16:40, Charles LaPierre <charlesl@benetech.org> wrote:

Has anyone read about this before? Looks interesting; just trying to see how this fits in with our PWP and archiving.

http://scholarly.vernacular.io and https://science.ai


Robin Berjon (one of the co-authors of that paper) has started a W3C Community Group on scholarly HTML:

https://www.w3.org/community/scholarlyhtml/


It is still in its early days, but it may be very interesting in the long term.

Not sure yet how it will fit into PWP. It may be orthogonal to PWP, in the sense that what it tries to do is define an HTML profile for scholarly publishing, to be used for particular use cases. Such profiles would obviously fit PWP too, but I do not believe they would create new requirements for it.

I think the idea of a "vernacular" itself [1] is quite applicable. Our mission is to use HTML for publications. In order to make such publications more readable, more accessible, and more meaningful, we are likely to use HTML in specific ways. A good example is requiring a nav file. This idea of a vernacular has certainly helped me clarify my thinking on EPUB Zero as defined in the Readme [2].

I must admit I did not know the vernacular itself, only the scholarly HTML stuff.

Whether a vernacular is necessary for PWP as a whole, I am not sure; that remains to be seen. I fully agree that for specific areas (like scholarly HTML) defining a vernacular is probably a good idea (that is where the CG is going). And there may be similar issues for defining, say, legal publications. But all those are, or I believe should be, independent from the general approach on PWP, which should try to be as non-restrictive as possible…

But practice will tell. In any case, it *is* an interesting document, that is for sure!

Thanks

Ivan




Dave

[1] http://vernacular.io
[2] https://github.com/dauwhe/epub-zero/blob/gh-pages/readme.md



----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/

mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704








--

Bill McCoy
Executive Director
International Digital Publishing Forum (IDPF)
email: bmccoy@idpf.org
mobile: +1 206 353 0233



Received on Tuesday, 15 December 2015 15:24:12 UTC