Re: "Completeness" as a feature of a POW (aka EPUB+Web)??

From: Bill McCoy <bmccoy@idpf.org>
Date: Thu, 13 Aug 2015 15:25:17 -0700
Message-ID: <CADMjS0bJGOuXW3u3Ed85_vorHWsPZ_MJca_dK5ZG62K+0jZH3A@mail.gmail.com>
To: Leonard Rosenthol <lrosenth@adobe.com>
Cc: Bill McCoy <whmccoy@gmail.com>, "Siegman, Tzviya - Hoboken" <tsiegman@wiley.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, Ivan Herman <ivan@w3.org>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Leonard, I thought you and I agreed that "portability" was a spectrum not
black-and-white. On that spectrum, a missing font to me obviously decreases
the portability of content, to a degree that is dependent on the nature of
the content and font, and how uniquely specified the required font is (if a
URI/URL to find it then that is not "missing" so much as "unpackaged").
This is critical because a missing Amharic font or, even more so, a missing
Japanese font containing user gaiji is highly likely to render a document
unreadable by most people on most systems, so could be considered a
significant "ding" on portability. This is one way in which today's EPUB
differs from arbitrary OWP content, in that we have conformance
requirements that define "Publication Resources" to include all
resources (including fonts) where "In the absence of this resource, the
EPUB Publication might not render as intended by the Author", and require
that "All Publication Resources must be listed in the Package Document (as
defined in manifest), adhere to the constraints for Core Media Types and
Fallback and be located as per Publication Resource Locations."

So to me your font example supports my proposed definition of "portable"
even though I agree with you that a missing font is not always a fatal blow
to portability.

As far as scripting, maybe I wasn't communicating clearly as I'm not sure
we disagree. I think it's consistent with the abstract definition of
portability to have an external reference to a script or data in a portable
publication, so long as there's a strong "promise" in some manner that an
HTTP request to the resource (referenced by URI/URL) will
always return the same representation. If OTOH the external reference is to
a Perl program on a server that returns different data/script content
depending on say the phase of the moon, then such an external reference is
not portable. It is this fundamental attribute of the REST architecture of
the Web (the distinction between resource and the infinity of different
representations that a request to that resource might possibly return) that
makes it possible for the online Web to change so quickly, but that same
attribute is antithetical to portability.
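
One way to picture the kind of "promise" described above, purely as an
illustrative sketch and not something EPUB or any proposal in this thread
specifies: record a digest of the external resource when the publication is
packaged, and compare later responses against it. The names digest,
keeps_promise and pinned are hypothetical, and the byte strings stand in
for HTTP response bodies.

```python
import hashlib


def digest(representation: bytes) -> str:
    """Return the SHA-256 hex digest of one representation of a resource."""
    return hashlib.sha256(representation).hexdigest()


def keeps_promise(pinned_digest: str, representation: bytes) -> bool:
    """True if the bytes returned for an external reference match the digest
    recorded at packaging time (a hypothetical manifest field)."""
    return digest(representation) == pinned_digest


# Assumption: the digest is recorded once, when the publication is packaged.
packaged = b"function quiz() { return ['Q1', 'Q2']; }"
pinned = digest(packaged)

# Later responses are compared against the recorded digest.
same_response = packaged
changed_response = b"function quiz() { return ['Q3']; }"

print(keeps_promise(pinned, same_response))     # True
print(keeps_promise(pinned, changed_response))  # False
```

A check like this only detects a broken promise after the fact; it does
not make the server keep returning the same representation.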

My mentioning of browser cache was just a side note, just to indicate my
concern that the kind of "promise" I refer to above can't be considered to
be reliably made via HTTP cache-control headers. BTW I believe here in W3C
;-) the right term to use is "User Agent" not "Reading System" and there is
also an argument to be made that EPUB should adjust its current terminology
as it evolves towards closer alignment with the overall OWP (unless W3C
prefers that different instantiations of Web Technologies should stick with
distinct terms, for example "User Agent" for browser instantiations,
"Reading System" for standalone publication instantiations, and
"Application" for installed applications based on Web technologies)... but
maybe it could be better to have "User Agent" mean any instantiation of
OWP, and pick a more descriptive term for the browser-specific


On Thu, Aug 13, 2015 at 2:46 PM, Leonard Rosenthol <lrosenth@adobe.com> wrote:

> Sorry but I have to disagree with these definitions, based on actual user
> requirements (at least users we’ve talked to).
> Portable means that all resources are defined. That I don’t need anything
> else to understand the structure of the content.
> However it does not mean that these resources are all available (or up to
> date) at any given time.   Let me use a common PDF case – fonts.  In PDF,
> all fonts must be declared BUT the actual font data need not be embedded
> into the document and instead is provided by the “RS” at runtime.   While
> we might prefer a document that has all fonts embedded, even w/o them the
> document is portable.
> Programmatic elements are a completely different ballgame, IMO, and have
> no bearing on the definition of portability.  To me it doesn’t matter if a
> script (that itself is embedded in the package) refers to embedded
> information or external information AS LONG AS the referencing of that
> information is via a pre-declared reference.
> Leonard
> From: Bill McCoy
> Date: Thursday, August 13, 2015 at 3:24 PM
> To: "Siegman, Tzviya - Hoboken"
> Cc: Bill Kasdorf, Leonard Rosenthol, Ivan Herman, W3C Digital Publishing
> IG, Bill McCoy
> Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??
> proposed concrete definition:
> *portable* (in Web context) means: does not require active server
> infrastructure.
> or to state it another way, *portable* means that all programmatic
> elements are part of and execute in the context of the content
> ("code-on-demand" in Roy Fielding terms).
> --Bill
> On Thu, Aug 13, 2015 at 12:16 PM, Siegman, Tzviya - Hoboken <
> tsiegman@wiley.com> wrote:
>> As for the semantics, we should probably focus on what we mean by
>> "portable," and not get quite so hung up on what we mean by "complete."
>> That is verging very close to the argument about what "is" is. ;-)
>> +100
>> Tzviya Siegman
>> Digital Book Standards & Capabilities Lead
>> Wiley
>> 201-748-6884
>> tsiegman@wiley.com
>> -----Original Message-----
>> From: Bill Kasdorf [mailto:bkasdorf@apexcovantage.com]
>> Sent: Thursday, August 13, 2015 3:00 PM
>> To: Leonard Rosenthol; Ivan Herman
>> Cc: W3C Digital Publishing IG; Bill McCoy
>> Subject: RE: "Completeness" as a feature of a POW (aka EPUB+Web)??
>> The example of the embedded quiz was not what I was considering
>> "dynamic," in the sense of something that is potentially different every
>> time it's accessed (and perhaps whose whole point is that _only_ the
>> current-when-accessed version is what is intended). For the example of the
>> quiz, I would argue that the _quiz itself_ doesn't change, but the answers
>> provided by the student obviously do (ditto any grading provided by a
>> widget). So I would consider the quiz, in that case, legitimately part of
>> the publication, but the answers not to be.
>> Nevertheless, you have a good point, the portable publication may in fact
>> go "fetch" the quiz, or something even simpler like a streaming video. So
>> in those cases I would agree that the quiz or the video, though external
>> resources, _should_ be considered part of the publication, and the
>> publication not to be "complete" without it. I deliberately use quote marks
>> because that appears to me to be the true edge case.
>> If the portable version of the publication contains the widget that
>> provides the quiz, or if the video is embedded, then there isn't really a
>> question.
>> What this is leading me to question is whether a publication needs to be
>> "complete" in order to meet the EPUB+WEB vision in the sense of "identical
>> in all three states (online, cached, portable)." This discussion is leading
>> me to lean toward entertaining the notion that it's perfectly reasonable
>> for "a publication" not to lose its identity just because some resources
>> might need to be retrieved from the network. I realize that may strike
>> folks as a major concession to the vision of EPUB+WEB, but I would point
>> out that the alternative seems to make less intuitive sense. Particularly
>> in education, there will be LOTS of cases where rich functionality is
>> fetched from the network. (Ditto for magazines and news.) It does not seem
>> reasonable to me not to still consider the portable version the same
>> publication, as long as it has the appropriate links and fallbacks.
>> And I guess that gets me at least most of the way back to where I
>> started, which would be to consider the publication complete as long as it
>> contains the links to external resources. But there is definitely a gray
>> area, which then comes down to when the resources are intrinsic to the
>> publication and when they are not.
>> It's going to be hard to draw a line here. . . . This is potentially one
>> of those issues where we can get wrapped around the axle on the semantics
>> without contributing anything very useful to the solution. I would not want
>> to draw such a strict line on "completeness" that it would undermine the
>> concept of a publication existing in online/cached/portable states, or that
>> it would leave out huge classes of publications like rich textbooks,
>> magazines, etc.
>> As for the semantics, we should probably focus on what we mean by
>> "portable," and not get quite so hung up on what we mean by "complete."
>> That is verging very close to the argument about what "is" is. ;-)
>> --Bill K
>> -----Original Message-----
>> From: Leonard Rosenthol [mailto:lrosenth@adobe.com]
>> Sent: Thursday, August 13, 2015 2:30 PM
>> To: Bill Kasdorf; Ivan Herman
>> Cc: W3C Digital Publishing IG; Bill McCoy
>> Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??
>> I think the second case - dynamic content - is not as clear cut as you
>> make it out to be, Bill.  It is very dependent on how that information is
>> presented - or how much of the publication’s “content” is actually derived
>> from that data.
>> Sure, if we are only talking about a single value on a page (such as a
>> stock price or the current weather), not having it probably wouldn’t impact
>> the understanding of the material or the consumption experience of the
>> publication.  BUT consider something such as an embedded quiz in a
>> textbook, where the questions themselves are coming in live…if they aren’t
>> available (either live or cached) then the student can’t continue.
>> Leonard
>> On 8/13/15, 11:38 AM, "Bill Kasdorf" <bkasdorf@apexcovantage.com> wrote:
>> >Just a quick observation wrt Ivan's example of citations to external
>> publications (my deliberate wording). The papers cited in a journal article
>> (often scores and sometimes hundreds of them) are NOT part of "the
>> publication," they are referenced by the publication. The citations are
>> part of the publication; the cited resources are not.
>> >
>> >We need to be careful, in considering the concept of completeness, to
>> distinguish between whether we are talking about "the publication itself"
>> vs. "the publication and everything else it references."
>> >
>> >This also applies to dynamic content, e.g., a link that fetches
>> up-to-date information (a stock price, the weather in Sydney, comments from
>> other students in my class on what we're studying, etc.).
>> >
>> >In both of those cases, imo, it is reasonable to consider the
>> >publication complete (and, put the other way around, inappropriate to
>> >consider it incomplete) if those links/citations are present, even if
>> >they are not actionable at a given time (e.g., when the portable
>> >version of the publication is consumed offline), and whether or not the
>> >external content has been cached. This is _very_ important for
>> >publications like magazines and news publications. (Be careful to avoid
>> >reflexively thinking "books.")
>> >
>> >--Bill K
>> >
>> >-----Original Message-----
>> >From: Ivan Herman [mailto:ivan@w3.org]
>> >Sent: Thursday, August 13, 2015 12:55 AM
>> >To: Leonard Rosenthol
>> >Cc: W3C Digital Publishing IG; Bill McCoy
>> >Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??
>> >
>> >Leonard,
>> >
>> >good catch, the formulation is indeed not clear. Obviously, there is a
>> need for external link to various things; to take the area of academic
>> publication as an example, such a publication may include references to
>> other papers, it may include references to research data (that may be too
>> large to be included in the document), etc, and it is essential to keep the
>> hyperlink nature of those references. In this sense, "completeness" is not
>> meant to be "fully self-contained".
>> >
>> >I think that Bill's answer[1]:
>> >
>> >"portable documents "promise" a reliable consumption experience without
>> respect of any particular server infrastructure and, especially, without
>> such server infrastructure providing interactivity."
>> >
>> >is what I believe we all mean. I am not sure "idempotence"[2], proposed by
>> >Bill, is really the right term, but I do not have a better one at this
>> >point either:-(
>> >
>> >Thanks
>> >
>> >Ivan
>> >
>> >[1] https://lists.w3.org/Archives/Public/public-digipub-ig/2015Aug/0056.html
>> >[2] https://en.wikipedia.org/wiki/Idempotence
>> >
>> >
>> >> On 13 Aug 2015, at 02:31 , Leonard Rosenthol <lrosenth@adobe.com>
>> wrote:
>> >>
>> >> In rewriting the document about Portable Documents for the web (thanks
>> for the suggestion & link, Tzviya), I came across the following paragraph:
>> >>
>> >>> EPUB can be viewed as simply defining a specialization of Web content
>> that assures that a collection of content items has the needed properties
>> of completeness and logical structure, and does so in a standard way that
>> other processing tools and services can reliably create, manipulate, and
>> present such collections. This completeness constraint is key for bridging
>> the current gap between an online and offline/portable view of the same
>> content (see <a href="#whynow">section on usage patterns</a> below).
>> >>>
>> >> While not spelled out here or in the “section on usage patterns”, I am
>> going to take the terminology of “completeness” to mean “fully
>> self-contained” (aka no external references).  If it means something else,
>> feel free to ignore what follows (but only after you correct me :).
>> >>
>> >> In the current use cases for EPUB (books, magazines, etc.), the desire
>> by the publisher to have everything contained inside the package is clearly
>> key – just as that same property has been a tenet of the various PDF
>> subset standards (PDF/A, PDF/X, etc.). However, there also exist PDF
>> use cases where external references are a key aspect of the workflow – for
>> example, external content or color profiles in a variable or transactional
>> workflow (e.g. PDF/VT). As such, I would like to suggest that, as a
>> portable document for OWP, there also needs to be a provision for
>> external references in this POW (Portable Open Web) format.
>> >>
>> >> I know that there have been discussions about this around EPUB in the
>> past for large assets (e.g. video and audio), but I would put forth that the
>> same principles could also be applied to other types of content as well,
>> be it advertisements in a publication, current data sets in a STEM
>> publication or even just a reference to the latest version of a common JS
>> library used by the publication.
>> >>
>> >> What do others think about this?   Is completeness/self-contained a
>> requirement in a POW?
>> >>
>> >> Thanks,
>> >> Leonard
>> >>
>> >
>> >
>> >----
>> >Ivan Herman, W3C
>> >Digital Publishing Activity Lead
>> >Home: http://www.w3.org/People/Ivan/
>> >mobile: +31-641044153
>> >ORCID ID: http://orcid.org/0000-0003-0782-2704
>> >
>> >
>> >
>> >
Received on Thursday, 13 August 2015 22:25:46 UTC
