Re: "Completeness" as a feature of a POW (aka EPUB+Web)??

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Thu, 13 Aug 2015 13:27:31 +0000
To: Bill McCoy <whmccoy@gmail.com>
CC: W3C Digital Publishing IG <public-digipub-ig@w3.org>, Bill McCoy <bmccoy@idpf.org>
Message-ID: <3DC50734-3961-4B5D-97CA-F1D2039B8B5E@adobe.com>
I absolutely agree, Bill. Idempotency is yet another axis of consideration in the differences between a document and a package of arbitrary (OWP or otherwise) content. Certainly there will be packages that require one side or the other and, in some unique cases, both types in a single package.

I also think that these distinctions will come into play very heavily as we look at the intermediate state of “cached”. Is a cached file always fully idempotent? Can you have a partial cache? Can the author specify what should (or should not) be cached, and then how does that impact its “completeness”? (etc.)


From: Bill McCoy
Date: Wednesday, August 12, 2015 at 9:06 PM
To: Leonard Rosenthol
Cc: W3C Digital Publishing IG, Bill McCoy
Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??

I would suggest people also consider and opine about a related attribute to completeness/self-containedness (that publication resources live inside an actual or virtual package file)... what we might call "idempotency" (that publication resources don't change dynamically). To me this is fundamental to what a "portable document" means vs. arbitrary OWP content, and it is orthogonal to whether content is packaged or not (which I don't see as fundamental to portable document-ness). In a cloud reader view of a publication (such as: http://idpf.org/sites/default/files/cloud-reader/index.html?epub=epub_content%2Faccessible_epub_3) a user may not know or care whether the content was packaged into a single file or not.

To me what fundamentally distinguishes portable documents from arbitrary websites is solely that portable documents "promise" a reliable consumption experience without regard to any particular server infrastructure and, especially, without such server infrastructure providing interactivity.

In terms of the REST architecture of the Web, portable documents always have resource=representation, vs. arbitrary websites in which resources (URIs/URLs) are free to return different representations at different times.  Interactivity is possible, but only via the code-on-demand feature of the REST architecture.

This property makes things like archival and annotations straightforward to do with portable documents, whereas these things are impossible, in the general case, with websites (how can I archive or annotate expedia.com, which is personalized to me and whose functionality is dynamic and embodied in a bunch of server infrastructure?).

I don't see idempotency as a black/white attribute, especially as we get into things like advertising in digital publications, but to me the more something has this property the more I'm inclined to think of it as a "portable document" in nature, and the less it has the property the more I'm inclined to think of it as being more general "Web stuff". It is related to packaging in that if content is fully packaged it ipso facto also has the idempotency attribute (but the converse does not hold). To evolve today's EPUB into being a full solution for "portable documents for the Web" it seems pretty clear to me that we'll need to both untangle these two attributes and better explore the "gray area" of content that is partially but not completely idempotent (which is not necessarily equivalent to being partially packaged).


On Wed, Aug 12, 2015 at 5:31 PM, Leonard Rosenthol <lrosenth@adobe.com> wrote:
In rewriting the document about Portable Documents for the web (thanks for the suggestion & link, Tzviya), I came across the following paragraph:

EPUB can be viewed as simply defining a specialization of Web content that assures that a collection of content items has the needed properties of completeness and logical structure, and does so in a standard way that other processing tools and services can reliably create, manipulate, and present such collections. This completeness constraint is key for bridging the current gap between an online and offline/portable view of the same content (see <a href="#whynow">section on usage patterns</a> below).

While not spelled out here or in the “section on usage patterns”, I am going to take the terminology of “completeness” to mean “fully self-contained” (aka no external references).  If it means something else, feel free to ignore what follows (but only after you correct me :).
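
To make the "no external references" reading concrete, here is a minimal sketch of how a tool might test one content document for completeness in that sense. Everything in it is an assumption for illustration: the names `find_external_references` and `is_complete`, the short list of attributes inspected, and the choice to treat ordinary `<a>` hyperlinks as navigation rather than as dependencies. It is not taken from EPUB or any existing checker.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Attributes that commonly carry resource references.
# Illustrative and incomplete (e.g. srcset and CSS url() are ignored).
REFERENCE_ATTRS = {"src", "href", "poster", "data"}


def is_external(url):
    """True if the URL names a network host (absolute or protocol-relative)."""
    parsed = urlparse(url)
    return bool(parsed.netloc)


class ExternalReferenceFinder(HTMLParser):
    """Collects (tag, url) pairs for resources that live outside the package."""

    def __init__(self):
        super().__init__()
        self.external = []

    def handle_starttag(self, tag, attrs):
        # Assumption: <a> links are navigation, not resources the document needs.
        if tag == "a":
            return
        for name, value in attrs:
            if name in REFERENCE_ATTRS and value and is_external(value):
                self.external.append((tag, value))


def find_external_references(html):
    finder = ExternalReferenceFinder()
    finder.feed(html)
    return finder.external


def is_complete(html):
    """'Complete' here means only: no resource is fetched from outside."""
    return not find_external_references(html)
```

Under these assumptions, a document that pulls a JS library from a CDN would be reported as incomplete, while one that only uses package-relative paths or `data:` URLs would pass.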

In the current use cases for EPUB (books, magazines, etc.), the desire by the publisher to have everything contained inside the package is clearly key, just as that same property has been a tenet of the various PDF subset standards (PDF/A, PDF/X, etc.). However, there also exist PDF use cases where external references are a key aspect of the workflow, for example external content or color profiles in a variable or transactional workflow (e.g. PDF/VT). As such, I would like to suggest that, as a portable document for OWP, this POW (Portable Open Web) format also needs a provision for external references.

I know that there have been discussions about this around EPUB in the past for large assets (e.g. video and audio), but I would put forth that the same principles could also be applied to other types of content as well, be it advertisements in a publication, current data sets in a STEM publication or even just a reference to the latest version of a common JS library used by the publication.

What do others think about this? Is completeness (being self-contained) a requirement in a POW?


Received on Thursday, 13 August 2015 13:28:13 UTC
