Re: "Scholarly HTML" and science.ai

From: Ivan Herman <ivan@w3.org>
Date: Tue, 15 Dec 2015 18:07:59 +0100
Cc: Dave Cramer <Dave.Cramer@hbgusa.com>, Charles LaPierre <charlesl@benetech.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-Id: <D0A0267F-C9A7-452A-9FB3-E99585364283@w3.org>
To: Bill McCoy <bmccoy@idpf.org>
Bill,

I understand the intentions. However, for the time being, I am of a slightly different opinion as to how to get there.

One of the goals of the PWP is to rely, as much as possible, on OWP. Put differently, I think the PWP-specific normative requirements should be minimal, and we should, whenever possible, try to require that the OWP in general solve a particular problem in the first place. This can be painful, but it is a helpful approach in the long term: it means that browsers (or their core engines) would be (in theory at least) the ones implementing what it takes, rather than publication-specific engines.

If I take your accessibility example: yes, absolutely, this community requires the maximum possible accessibility. If vanilla HTML + the various WAI requirements and testing facilities do not do that by default then, but only then, we must analyze and decide what is required for a specific community. The same holds for HTML+JS vs. security issues, for example. But that discussion should, actually, happen on the Web in general, because, per se, it may not be PWP specific. Just as epub:type may gradually be replaced by doc-aria terms, thereby binding the structural semantics terms to accessibility APIs (a major win, eventually), we should look at every aspect similarly. In some cases it may be painful (some of us may still bear some scars that we got around ARIA :-) but it is worth it. (By the way, we realized, while doing the DPUB ARIA document, that the terms introduced there are useful beyond EPUB & Co, and can be used for the Web at large; hence the renaming of those terms to doc-XXXX.)

I.e., my approach would be that PWP, for now, should accept any valid OWP resource. *Then* we could/should look at additional requirements to see whether we need to do something PWP specific, i.e., take a step further away from this full generality. And we should *try* to avoid any of those, actually. I can also imagine that, many times, those extra restrictions would come from other communities that are not necessarily part of the publication-related work (e.g., that may very well be the case for HTML+JS+Security issues). Some of those restrictions may come from very different usage areas within publications; scholarly publications may have a different set of structures than, for example, texts for legal publications, and these communities may then define their own, possibly contradictory profiles. PWP should be open to all of those, though that would probably lead to various PWP profiles. Actually… this is what happened to EDUPUB, right? EDUPUB is a profile of EPUB, i.e., not all EPUB documents are required to be bound by the EDUPUB specifications...

Again, I do not think we fundamentally disagree on the goals; we just have different views on how to get there…

Cheers

Ivan

> On 15 Dec 2015, at 15:41, Bill McCoy <bmccoy@idpf.org> wrote:
> 
> Ivan,
> 
> Your perspective about minimizing normative requirements to only those that are technically absolutely necessary is of course totally valid and correct from a pure engineering perspective. But, taken to the extreme, this would imply that no "SHOULDs" should be in any specifications, because if something is only a SHOULD then ipso facto it's not necessary, it's really just stating a preference, and it can't be counted on from an IOP perspective. Nevertheless most specifications have many "SHOULDs". Of course, this is an evergreen debate in standards groups.
> 
> With publications, the big fly in the ointment is accessibility. We cannot mandate that all the things that make content fully accessible are "musts", because in some instances there may be good reasons for particular publications not to do them. Yet we need content to be maximally accessible, and a key value proposition for a next-generation portable document format is that it is more naturally accessible than a sequence of "frozen" page images. Accessibility is also a "mine canary" for content that is more reliably machine-processable, supporting a variety of other use cases (summarization, remixing,
> 
> We have danced around this in EPUB for a long time. There are many SHOULDs relating to accessibility and even some MUSTs that are arguably overly specific. For example, in EPUB 3.0 we made it a MUST requirement that the Navigation Document's table of contents be complete and ordered per reading order. This was not technically necessary but was deemed a very critical piece of enabling accessibility.
> 
> In the big picture, arbitrary HTML+JS is legal in EPUB (as I think it should remain in PWP) but at the same time a tangled spaghetti of HTML+JS, that one can't do anything with except "execute" in a browser's VM, doesn't fully deliver on the expected value proposition of portable publications. Hence the need for additional (even if softer) normative requirements.
> 
> And for the in-process EPUB 3.1 update, the IDPF Board has asked the WG to be even more forceful about adding normative requirements to support the notion of an accessibility baseline, even though we know that not every valid EPUB publication will meet the baseline, and this is in process.
> 
> EPUB 3.x is in the process of being adopted as part of a number of accessibility mandates, so it would seem counter-productive for PWP to reverse course and be less promotive of accessibility.
> 
> So I recommend that this group look carefully at what's going on in EPUB 3.1, including the level of normative specification of accessibility/structure, and if there is anything there that looks out of line with what folks think should be in a PWP, that should be elevated as a coordination issue. If that means we all have to work together to resolve the tension about how to strongly encourage accessibility (and the concomitant of well-structured content) while keeping the specs as pristine as possible, let's tackle it now rather than wait until EPUB 3.1 is baked in six months' time.
> 
> --Bill
> 
> On Tue, Dec 15, 2015 at 2:44 AM, Ivan Herman <ivan@w3.org> wrote:
> 
>> On 14 Dec 2015, at 20:35, Bill McCoy <bmccoy@idpf.org> wrote:
>> 
>> "vernacular" may not be quite the right term but... while some of the proposed Scholarly HTML is arguably specific to that domain, much of it, such as the "hunks" stuff, seems to have nothing to do with scholarly-publishing-specific requirements. There also seems to be significant overlap with the Structural Semantics Profile [1] that has been developed by IDPF and the EDUPUB Alliance as part of the EPUB for Education initiative (aka EDUPUB) as well as the Content Structure section of the EDUPUB profile itself [2]. These have also been recognized as not necessarily learning-content-specific but valuable for any content that wants to be well-structured (in particular that can be assured to be accessible) so it's now in the process of being generalized as part of EPUB 3.1 [3]. The realization was that stuff that is really about making well-structured content - not about the vertical of education content - should really be part of EPUB itself (even if something is not normatively required to be legal EPUB it may be required to be certifiably accessible EPUB and thus we are thinking that for EPUB 3.1 such things should be SHOULDs where sensible). I don't see why it would make sense for PWP to reverse course on that.
> 
> I do not think PWP would 'reverse' that, but I am not sure it should address that.
> 
> At least… there is a difference, if one thinks in terms of standards, between normative and non-normative aspects. Providing sound advice on well-structured documents of course makes a lot of sense. I.e., having such a document is good, but what I do not see is why it would be normative. When defining something like PWP, I believe that we should be as open-ended as possible, and include normative requirements only when it is technically absolutely necessary.
> 
> Of course, 'profiles', or 'vernaculars' may be different because the idea is to address a specific community, a specific market.
> 
>> 
>> So I hope that we can both harmonize any redundancy between the new Scholarly HTML initiative and work that's gone on as part of EDUPUB, as well as appropriately pull out broadly useful features from both efforts into base specs, in the interests of maximizing accessibility and interoperability and minimizing bikeshedding.
>> 
> 
> Sure. But, for the time being, the Scholarly HTML work is 'just' a community group, i.e., it is not a formal standardization activity. EDUPUB is much more formal than that. It is certainly a good idea, though, to draw the Scholarly HTML group's attention to the EDUPUB document; I will do that.
> 
> Ivan
> 
> 
> 
>> --Bill
>> 
>> [1] http://www.idpf.org/epub/profiles/edu/structure/
>> [2] http://www.idpf.org/epub/profiles/edu/spec/#h.selsibtnscc8
>> [3] http://www.idpf.org/workplans/2015/epub/
>> 
>> 
>> On Mon, Dec 14, 2015 at 8:31 AM, Ivan Herman <ivan@w3.org> wrote:
>> 
>>> On 14 Dec 2015, at 17:01, Cramer, Dave <Dave.Cramer@hbgusa.com> wrote:
>>> 
>>> On Dec 14, 2015, at 10:50 AM, Ivan Herman <ivan@w3.org> wrote:
>>> 
>>>>> On 14 Dec 2015, at 16:40, Charles LaPierre <charlesl@benetech.org> wrote:
>>>>> 
>>>>> Has anyone read about this before? It looks interesting; I am just trying to see how it fits in with our PWP and archiving.
>>>>> 
>>>>>> http://scholarly.vernacular.io and https://science.ai
>>> 
>>> 
>>>> Robin Berjon (one of the co-authors of that paper) has started a W3C Community Group on scholarly HTML:
>>>> 
>>>> https://www.w3.org/community/scholarlyhtml/
>>>> 
>>>> It is still in its early days, but it may be very interesting in the long term.
>>>> 
>>>> Not sure yet how it will fit into PWP. In some sense, it may be orthogonal to PWP, in that what it tries to do is define an HTML profile for scholarly publishing, to be used for particular use cases. These profiles, obviously, would fit PWP, too, but I do not believe it would create new requirements for it.
>>> 
>>> I think the idea of a "vernacular" itself [1] is quite applicable. Our mission is to use HTML for publications. In order to make such publications more readable, more accessible, and more meaningful, we are likely to use HTML in specific ways. A good example is requiring a nav file. This idea of a vernacular has certainly helped me clarify my thinking on EPUB Zero as defined in the Readme [2].
>> 
>> I must admit I did not know the vernacular itself, only the scholarly HTML stuff.
>> 
>> Whether vernacular is necessary for PWP as a whole: I am not sure, that is to be seen. I fully agree that for specific areas (like scholarly HTML) defining a vernacular is probably a good idea (that is where the CG is going). And there may be similar issues for defining, say, legal publications. But all those are, or I believe should be, independent from the general approach on PWP which should try to be as non-restrictive as possible…
>> 
>> But practice will tell. In any case, it *is* an interesting document, that is for sure!
>> 
>> Thanks
>> 
>> Ivan
>> 
>> 
>> 
>>> 
>>> Dave
>>> 
>>> [1] http://vernacular.io
>>> [2] https://github.com/dauwhe/epub-zero/blob/gh-pages/readme.md
>> 
>> 
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> --
>> 
>> Bill McCoy
>> Executive Director
>> International Digital Publishing Forum (IDPF)
>> email: bmccoy@idpf.org
>> mobile: +1 206 353 0233
>> 


----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704





Received on Tuesday, 15 December 2015 17:08:31 UTC
