
RE: [DPUB][PWP][EPUB31]RE: PWP and EPUB [was "Scholarly HTML" and science.ai]

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Wed, 16 Dec 2015 15:50:58 +0000
To: Ivan Herman <ivan@w3.org>
CC: Bill McCoy <bmccoy@idpf.org>, Dave Cramer <Dave.Cramer@hbgusa.com>, Charles LaPierre <charlesl@benetech.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <CY1PR0601MB14227DED8D5C8DB3A65C7B0ADFEF0@CY1PR0601MB1422.namprd06.prod.outlook.com>

I meant this in the most general sense, and was making a statement about where I see this leading (not about where we are yet).

That is, in the same sense that EDUPUB is a profile of EPUB for Education (adding stricter requirements to a more general spec in order to optimize it for a particular purpose, while still being in every way conformant to that more general spec), I was speculating that where _we may wind up_ is that EPUB will be a specification (profile) of _how to implement PWP_ in a specific way that makes it more predictable and consistent (while still in every way conforming to PWP, which must be a very broadly accommodating model that basically allows pretty much anything in the OWP spectrum to be contained in it, with minimal or no "but you must do it this way" aspects).

I was also not implying any limitation or restriction to the development of PWP. In fact just the opposite: I am quite sure you and I agree that it needs to be as open, flexible, agnostic, and accommodating as possible within the context of the OWP, so that basically _any_ online content can be a PWP. That is the fundamental vision of PWP. But as articulated by Bill M, EPUB wants to be more specific and restrictive than that, in order to establish a predictable, consistent model.

I guess another point is that there is an inherent tension between "accommodating any OWP content, expressed in any OWP-sanctioned way" and "but let's agree to do it this way so authors know what to do in order to call their publication an EPUB, and the ecosystem basically knows what it is going to get when it gets a publication called an EPUB." Trying to do both with a single spec will require one or the other or both to be compromised. But we still want EPUBs to align in all respects with the PWP vision (at least I do). I don't want to see competing specs come out of this.

So where I think we will wind up (time will tell), and what I think makes sense, is that eventually all EPUBs are PWPs but not all PWPs are EPUBs. In the long term.

This is consistent with our intent all along that these should be in harmony, not in conflict. That doesn't mean that they have to be identical.

That's really all I was pointing out at this stage.

My generic language is deliberate.

--Bill

From: Ivan Herman [mailto:ivan@w3.org]
Sent: Wednesday, December 16, 2015 4:16 AM
To: Bill Kasdorf
Cc: Bill McCoy; Dave Cramer; Charles LaPierre; W3C Digital Publishing IG
Subject: Re: [DPUB][PWP][EPUB31]RE: PWP and EPUB [was "Scholarly HTML" and science.ai]

I am not sure, Bill. (And I say this literally, and not as a generic English phrase that, as far as I know, in educated British circles would mean "I do not have a clue":-)

We do not really have a clear idea what a 'profile' means for HTML nor for PWP, for that matter. Is it a validation mechanism, is it a best practice, is it…. I am also not sure how the exact relationships of PWP vs. EPUB will evolve. Would a statement like yours apply to the profiles of HTML/CSS used in EPUB being inherited by PWP, or also to other documents such as navigation documents?

I would prefer the HTML and CSS crowd to formalize what a 'profile' means, with PWP just a recipient for the outcome. The separate discussion on vernaculars/profiles is great because it brings these issues to the fore. Let us worry about the relationships with EPUB later…

Cheers

Ivan



On 16 Dec 2015, at 00:19, Bill Kasdorf <bkasdorf@apexcovantage.com> wrote:

Where this appears to be leading, imho, is that EPUB would be a profile of PWP. Which btw I think would be very useful. The essence of EPUB is to be predictable. The essence of PWP is to be accommodating. We need both.
--Bill K

From: Bill McCoy [mailto:bmccoy@idpf.org]
Sent: Tuesday, December 15, 2015 12:42 PM
To: Ivan Herman
Cc: Dave Cramer; Charles LaPierre; W3C Digital Publishing IG
Subject: Re: "Scholarly HTML" and science.ai

Hi Ivan, I'm not sure we fundamentally disagree. Maybe there is just a "layer cake" here.

I think that in the broadest possible conception of PWP as an integral aspect of OWP you are right - normative statements or really anything that constrains OWP should be minimized/avoided.

But when looking at PWP as in effect an EPUB 4 I think of it a bit differently.

I think many of us, myself included, sometimes are thinking about PWP as a basic mechanism to extend the Web architecture to cover offline publications and sometimes are thinking about PWP as the successor to today's EPUB. Which leads to confusion.

So maybe we just need to somehow tease apart the inherent tension between the "making OWP support publications natively" part of PWP and the "be the next major update to EPUB" part of PWP.

By way of example, in OWP there are a thousand ways one might devise Web content that contains pre-recorded audio synchronized with text rendering. In EPUB, we have one "blessed" way to do this: Media Overlays, which enables content to be created that doesn't have to implement the synchronization part by itself, so it is much simpler to author and is interoperable across reading systems.

"Should Media Overlays be part of PWP?" ... With PWP defined to be a very high-level goal around supporting seamless online/offline delivery of Web content, this seems to be a very open question and I might think the answer could be "no".

Should Media Overlays be part of an EPUB 4?... For that question, I think the answer is obviously "yes". EPUB is already a profile of OWP that more specifically constrains what it contains, in the interest of more interoperability of "content as data" (including accessibility).

So to me, folding EDUPUB structural semantics into EPUB 3.1 wouldn't change the situation; it would just add more fuel to the fire.

--Bill


On Tue, Dec 15, 2015 at 9:07 AM, Ivan Herman <ivan@w3.org> wrote:
Bill,

I understand the intentions. However, for the time being, I am of a slightly different opinion as to the way to get there.

One of the goals of the PWP is to rely, as much as possible, on OWP. Put differently, I think the PWP-specific normative requirements should be minimal, and we should, whenever possible, try to require that the OWP in general solve a particular problem in the first place. This can be a painful but helpful approach in the long term: it means browsers (or their core engines) would be (in theory at least) the ones implementing what it takes, rather than publication-specific engines.

If I take your accessibility example: yes, absolutely, this community requires the maximum possible accessibility. If vanilla HTML + the various WAI requirements and testing facilities do not do that by default then, but only then, we must analyze and decide what is required for a specific community. The same holds for HTML+JS vs. security issues, for example. But that discussion should, actually, happen on the Web in general, because, per se, it may not be PWP specific. Just as epub:type may gradually be exchanged for doc-aria terms, thereby binding the structural semantics terms to Accessibility APIs (a major win, eventually), we should look at every aspect similarly. In some cases it may be painful (some of us may still bear some scars that we got around ARIA:-) but it is worth it. (B.t.w. we realized, while doing the DPUB ARIA document, that the terms introduced there are useful beyond EPUB & Co, and can be used for the Web at large; hence the renaming of those terms to doc-XXXX.)

Ie: my approach would be that PWP, for now, should accept any valid OWP resource. *Then* we could/should look at additional requirements to see whether we need to do something PWP specific, ie, make a step further away from this full generality. And we should *try* to avoid any of those, actually. I can also imagine that, many times, those extra restrictions would come from other communities that are not necessarily part of the publication related work (eg, that may very well be the case for HTML+JS+Security issues). Some of those restrictions may come from very different usage areas within publications; scholarly publication may have a different set of structures than, for example, texts for legal publications, and these communities may then define their own, possibly contradicting profiles. PWP should be open to all of those, though probably leading to various PWP profiles. Actually… this is what happened to EDUPUB, right? EDUPUB is a profile of EPUB, ie, not all EPUB documents are required to be bound by the EDUPUB specifications...

Again, I do not think we fundamentally disagree in the goals, just have different views on how to get there…

Cheers

Ivan

On 15 Dec 2015, at 15:41, Bill McCoy <bmccoy@idpf.org> wrote:

Ivan,

Your perspective about minimizing normative requirements to only those that are technically absolutely necessary is of course totally valid and correct from a pure engineering perspective. But, taken to the extreme, this would imply that no "SHOULDs" should be in any specifications, because if something is only a SHOULD then ipso facto it's not necessary, it's really just stating a preference, and it can't be counted on from an IOP perspective. Nevertheless most specifications have many "SHOULDs". Of course, this is an evergreen debate in standards groups.

With publications, the big fly in the ointment is accessibility. We cannot mandate that all the things that make content fully accessible are "musts", because in some instances there may be good reasons for particular publications not to do them. Yet we need content to be maximally accessible, and a key value proposition for a next-generation portable document format is that it is more naturally accessible than a sequence of "frozen" page images. Accessibility is also a "mine canary" for content that is more reliably machine-processable, supporting a variety of other use cases (summarization, remixing,

We have in EPUB danced around this for a long time. There are many SHOULDs relating to accessibility and even some MUSTS that are arguably overly specific. For example in EPUB 3.0 we made a MUST requirement that the Navigation Document's table of contents be complete and ordered per reading order. This was not technically necessary but was deemed a very critical piece of enabling accessibility.

In the big picture, arbitrary HTML+JS is legal in EPUB (as I think it should remain in PWP) but at the same time a tangled spaghetti of HTML+JS, that one can't do anything with except "execute" in a browser's VM, doesn't fully deliver on the expected value proposition of portable publications. Hence the need for additional (even if softer) normative requirements.

And for the in-process EPUB 3.1 update, the IDPF Board has asked the WG to be even more forceful about adding normative requirements to support the notion of an accessibility baseline, even though we know that not every valid EPUB publication will meet the baseline, and this is in process.

EPUB 3.x is in the process of being adopted as part of a number of accessibility mandates, so it would seem counter-productive for PWP to reverse course and be less promotive of accessibility.

So I recommend that this group look carefully at what's going on in EPUB 3.1, including the level of normative specification of accessibility/structure, and if there is anything there that looks out of line for what folks think should be in a PWP, that should be elevated as a coordination issue. If that means we all have to work together to resolve the tension about how to strongly encourage accessibility (& the concomitant of well-structured content) while keeping the specs as pristine as possible, let's tackle it now, not wait until EPUB 3.1 is baked in 6 months' time.

--Bill

On Tue, Dec 15, 2015 at 2:44 AM, Ivan Herman <ivan@w3.org> wrote:

On 14 Dec 2015, at 20:35, Bill McCoy <bmccoy@idpf.org> wrote:

"vernacular" may not be quite the right term but... while some of the proposed Scholarly HTML is arguably specific to that domain, much of it, such as the "hunks" stuff, seems to have nothing to do with scholarly-publishing-specific requirements. There also seems to be significant overlap with the Structural Semantics Profile [1] that has been developed by IDPF and the EDUPUB Alliance as part of the EPUB for Education initiative (aka EDUPUB) as well as the Content Structure section of the EDUPUB profile itself [2]. These have also been recognized as not necessarily learning-content-specific but valuable for any content that wants to be well-structured (in particular that can be assured to be accessible) so it's now in the process of being generalized as part of EPUB 3.1 [3]. The realization was that stuff that is really about making well-structured content - not about the vertical of education content - should really be part of EPUB itself (even if something is not normatively required to be legal EPUB it may be required to be certifiably accessible EPUB and thus we are thinking that for EPUB 3.1 such things should be SHOULDs where sensible). I don't see why it would make sense for PWP to reverse course on that.

I do not think PWP would 'reverse' that, but I am not sure it should address that.

At least… there is a difference, if one thinks in terms of standards, between normative and non-normative aspects. Providing a sound set of 'advices' on well structured documents makes of course a lot of sense. Ie, having such a document is good but what I do not see is why this would be normative. When defining something like PWP, I believe that we should be as open ended as possible, and include normative requirements only when it is technically absolutely necessary.

Of course, 'profiles', or 'vernaculars' may be different because the idea is to address a specific community, a specific market.




So I hope that we can both harmonize any redundancy between the new Scholarly HTML initiative and work that's gone on as part of EDUPUB, as well as appropriately pull out broadly useful features from both efforts into base specs, in the interests of maximizing accessibility and interoperability and minimizing bikeshedding.


Sure. But, for the time being, the Scholarly HTML work is 'just' a community group, ie, it is not a formal standardization activity. EDUPUB is much more formal than that. It is certainly a good idea, though, to draw the Scholarly HTML group's attention to the EDUPUB document; I will do that.

Ivan





--Bill

[1] http://www.idpf.org/epub/profiles/edu/structure/

[2] http://www.idpf.org/epub/profiles/edu/spec/#h.selsibtnscc8

[3] http://www.idpf.org/workplans/2015/epub/



On Mon, Dec 14, 2015 at 8:31 AM, Ivan Herman <ivan@w3.org> wrote:

On 14 Dec 2015, at 17:01, Cramer, Dave <Dave.Cramer@hbgusa.com> wrote:

On Dec 14, 2015, at 10:50 AM, Ivan Herman <ivan@w3.org> wrote:

On 14 Dec 2015, at 16:40, Charles LaPierre <charlesl@benetech.org> wrote:

Has anyone read about this before? Looks interesting; just trying to see how this fits in with our PWP and archiving.



http://scholarly.vernacular.io/ and https://science.ai/




Robin Berjon (one of the co-authors of that paper) has started a W3C Community Group on scholarly HTML:

https://www.w3.org/community/scholarlyhtml/


It is still in its early days, but it may be very interesting in the long term.

Not sure yet how it will fit into PWP. It may be orthogonal to PWP in the sense that what it tries to do is define an HTML profile for scholarly publishing, to be used for particular use cases. These profiles, obviously, would fit PWP, too, but I do not believe it would create new requirements for it.

I think the idea of a "vernacular" itself [1] is quite applicable. Our mission is to use HTML for publications. In order to make such publications more readable, more accessible, and more meaningful, we are likely to use HTML in specific ways. A good example is requiring a nav file. This idea of a vernacular has certainly helped me clarify my thinking on EPUB Zero as defined in the Readme [2]

I must admit I did not know the vernacular itself, only the scholarly HTML stuff.

Whether vernacular is necessary for PWP as a whole: I am not sure, that is to be seen. I fully agree that for specific areas (like scholarly HTML) defining a vernacular is probably a good idea (that is where the CG is going). And there may be similar issues for defining, say, legal publications. But all those are, or I believe should be, independent from the general approach on PWP which should try to be as non-restrictive as possible…

But practice will tell. In any case, it *is* an interesting document, that is for sure!

Thanks

Ivan






Dave

[1] http://vernacular.io/
[2] https://github.com/dauwhe/epub-zero/blob/gh-pages/readme.md




----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704







--

Bill McCoy
Executive Director
International Digital Publishing Forum (IDPF)
email: bmccoy@idpf.org
mobile: +1 206 353 0233




Received on Wednesday, 16 December 2015 15:51:35 UTC
