Re: "Completeness" as a feature of a POW (aka EPUB+Web)??

You and I do agree that portability is a spectrum, but I'm not sure that all the others do. I wanted to put forth some examples along that spectrum. And I agree that for an EPUB, requiring embedded fonts is reasonable, but not so for arbitrary OWP content. And it adds to the problems that are going to arise in trying to use EPUB for both use cases and still have either existing or future UA/RS handle both (assuming that such is even a goal).

On the other hand, we do quite disagree on the scripting and referenced data example.  I would consider a document that could fetch “changing values” as portable – provided that the URI from which the data is fetched is declared inside the package (and therefore unchanged).  And to me, this is where caching comes in – to be able to store the last used value so that something is always displayed to the user.  I also see this as a place where service/web workers could come into play…
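
To make the caching idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the class name `DeclaredResourceCache`, the `fetch` callback, and the use of `OSError` to signal "network unavailable" are all hypothetical and not part of EPUB or any UA/RS. It shows a fetch that is limited to URIs declared in the package and that falls back to the last value seen.

```python
from typing import Callable, Dict, Iterable, Optional


class DeclaredResourceCache:
    """Fetch only from URIs declared in the package; remember the last good value.

    Hypothetical helper for illustration, not an existing EPUB or browser API.
    """

    def __init__(self, declared_uris: Iterable[str], fetch: Callable[[str], bytes]):
        self._declared = frozenset(declared_uris)  # fixed when the package is built
        self._fetch = fetch                        # assumed to raise OSError when offline
        self._last_good: Dict[str, bytes] = {}

    def get(self, uri: str) -> Optional[bytes]:
        # Undeclared references are rejected outright.
        if uri not in self._declared:
            raise ValueError(f"{uri} is not declared in the package")
        try:
            value = self._fetch(uri)
        except OSError:
            # Offline or server unavailable: show the last value we saw, if any.
            return self._last_good.get(uri)
        self._last_good[uri] = value
        return value
```

With something like this, `get()` returns `None` only when a declared resource has never been fetched successfully, which is the case where a packaged fallback would still be needed.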

Now, the above said, I would not expect to see such a thing in a “document”, but I would expect to have it in “packaged OWP”... And thus another split between the goals of EPUB?!?!

Leonard

From: Bill McCoy
Date: Thursday, August 13, 2015 at 6:25 PM
To: Leonard Rosenthol
Cc: Bill McCoy, "Siegman, Tzviya - Hoboken", Bill Kasdorf, Ivan Herman, W3C Digital Publishing IG
Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??

Leonard, I thought you and I agreed that "portability" was a spectrum, not black-and-white. On that spectrum, a missing font to me obviously decreases the portability of content, to a degree that depends on the nature of the content and font, and on how uniquely the required font is specified (if there is a URI/URL at which to find it, then it is not "missing" so much as "unpackaged"). This is critical because a missing Amharic font, or even more so a missing Japanese font containing user gaiji, is highly likely to render a document unreadable by most people on most systems, so it could be considered a significant "ding" on portability. This is one way in which today's EPUB differs from arbitrary OWP content, in that we have conformance requirements that define "Publication Resources" to include all resources (including fonts) where "In the absence of this resource, the EPUB Publication might not render as intended by the Author", and require that "All Publication Resources must be listed in the Package Document (as defined in manifest), adhere to the constraints for Core Media Types and Fallback and be located as per Publication Resource Locations."

So to me your font example supports my proposed definition of "portable" even though I agree with you that a missing font is not always a fatal blow to portability.

As far as scripting goes, maybe I wasn't communicating clearly, as I'm not sure we disagree. I think it's consistent with the abstract definition of portability to have an external reference to a script or data in a portable publication, so long as there's a strong "promise" in some manner that the result of an HTTP request to the resource (referenced by URI/URL) will always return the same representation. If OTOH the external reference is to a Perl program on a server that returns different data/script content depending on, say, the phase of the moon, then such an external reference is not portable. It is this fundamental attribute of the REST architecture of the Web (the distinction between a resource and the infinity of different representations that a request to that resource might possibly return) that makes it possible for the online Web to change so quickly, but that same attribute is antithetical to portability.
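
One hypothetical way to picture such a "promise" is to record a digest of the expected representation next to the reference, so a consumer can check that what came back is the representation that was promised. The sketch below assumes a SHA-256 digest stored in the package; the names `PinnedReference` and `is_promised_representation` are made up for illustration, and nothing in this thread or in EPUB specifies this mechanism.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PinnedReference:
    """An external reference plus the digest of the one representation it promises.

    Hypothetical structure for illustration only.
    """
    uri: str
    sha256_hex: str


def is_promised_representation(ref: PinnedReference, body: bytes) -> bool:
    # The promise holds only if the fetched bytes hash to the recorded digest.
    return hashlib.sha256(body).hexdigest() == ref.sha256_hex
```

A server-side program that returns different content per request would fail this check on most requests, which matches the intuition that such a reference is not portable.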

My mention of the browser cache was just a side note, to indicate my concern that the kind of "promise" I refer to above can't be considered to be reliably made via HTTP cache-control headers. BTW I believe here in W3C ;-) the right term to use is "User Agent" not "Reading System", and there is also an argument to be made that EPUB should adjust its current terminology as it evolves towards closer alignment with the overall OWP (unless W3C prefers that different instantiations of Web technologies stick with distinct terms, for example "User Agent" for browser instantiations, "Reading System" for standalone publication instantiations, and "Application" for installed applications based on Web technologies)... but maybe it would be better to have "User Agent" mean any instantiation of OWP, and pick a more descriptive term for the browser-specific instantiation.

--Bill

On Thu, Aug 13, 2015 at 2:46 PM, Leonard Rosenthol <lrosenth@adobe.com> wrote:
Sorry but I have to disagree with these definitions, based on actual user requirements (at least users we’ve talked to).

Portable means that all resources are defined. That I don’t need anything else to understand the structure of the content.
However it does not mean that these resources are all available (or up to date) at any given time.   Let me use a common PDF case – fonts.  In PDF, all fonts must be declared BUT the actual font data need not be embedded into the document and instead is provided by the “RS” at runtime.   While we might prefer a document that has all fonts embedded, even w/o them the document is portable.

Programmatic elements are a completely different ballgame, IMO, and have no bearing on the definition of portability.  To me it doesn’t matter if a script (that itself is embedded in the package) refers to embedded information or external information AS LONG AS the referencing of that information is via a pre-declared reference.

Leonard

From: Bill McCoy
Date: Thursday, August 13, 2015 at 3:24 PM
To: "Siegman, Tzviya - Hoboken"
Cc: Bill Kasdorf, Leonard Rosenthol, Ivan Herman, W3C Digital Publishing IG, Bill McCoy
Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??

proposed concrete definition:

portable (in Web context) means: does not require active server infrastructure.

or to state it another way, portable means that all programmatic elements are part of and execute in the context of the content ("code-on-demand" in Roy Fielding terms).

--Bill

On Thu, Aug 13, 2015 at 12:16 PM, Siegman, Tzviya - Hoboken <tsiegman@wiley.com> wrote:
As for the semantics, we should probably focus on what we mean by "portable," and not get quite so hung up on what we mean by "complete." That is verging very close to the argument about what "is" is. ;-)

+100

Tzviya Siegman
Digital Book Standards & Capabilities Lead
Wiley
201-748-6884
tsiegman@wiley.com


-----Original Message-----
From: Bill Kasdorf [mailto:bkasdorf@apexcovantage.com]
Sent: Thursday, August 13, 2015 3:00 PM
To: Leonard Rosenthol; Ivan Herman
Cc: W3C Digital Publishing IG; Bill McCoy
Subject: RE: "Completeness" as a feature of a POW (aka EPUB+Web)??

The example of the embedded quiz was not what I was considering "dynamic," in the sense of something that is potentially different every time it's accessed (and perhaps whose whole point is that _only_ the current-when-accessed version is what is intended). For the example of the quiz, I would argue that the _quiz itself_ doesn't change, but the answers provided by the student obviously do (ditto any grading provided by a widget). So I would consider the quiz, in that case, legitimately part of the publication, but the answers not to be.

Nevertheless, you have a good point, the portable publication may in fact go "fetch" the quiz, or something even simpler like a streaming video. So in those cases I would agree that the quiz or the video, though external resources, _should_ be considered part of the publication, and the publication not to be "complete" without it. I deliberately use quote marks because that appears to me to be the true edge case.

If the portable version of the publication contains the widget that provides the quiz, or if the video is embedded, then there isn't really a question.

What this is leading me to question is whether a publication needs to be "complete" in order to meet the EPUB+WEB vision in the sense of "identical in all three states (online, cached, portable)." This discussion is leading me to lean toward entertaining the notion that it's perfectly reasonable for "a publication" not to lose its identity just because some resources might need to be retrieved from the network. I realize that may strike folks as a major concession to the vision of EPUB+WEB, but I would point out that the alternative seems to make less intuitive sense. Particularly in education, there will be LOTS of cases where rich functionality is fetched from the network. (Ditto for magazines and news.) It does not seem reasonable to me not to still consider the portable version the same publication, as long as it has the appropriate links and fallbacks.

And I guess that gets me at least most of the way back to where I started, which would be to consider the publication complete as long as it contains the links to external resources. But there is definitely a gray area, which then comes down to when the resources are intrinsic to the publication and when they are not.

It's going to be hard to draw a line here. . . . This is potentially one of those issues where we can get wrapped around the axle on the semantics without contributing anything very useful to the solution. I would not want to draw such a strict line on "completeness" that it would undermine the concept of a publication existing in online/cached/portable states, or that it would leave out huge classes of publications like rich textbooks, magazines, etc.

As for the semantics, we should probably focus on what we mean by "portable," and not get quite so hung up on what we mean by "complete." That is verging very close to the argument about what "is" is. ;-)

--Bill K

-----Original Message-----
From: Leonard Rosenthol [mailto:lrosenth@adobe.com]
Sent: Thursday, August 13, 2015 2:30 PM
To: Bill Kasdorf; Ivan Herman
Cc: W3C Digital Publishing IG; Bill McCoy
Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??

I think the second case - dynamic content - is not as clear cut as you make it out to be, Bill.  It is very dependent on how that information is presented - or how much of the publication’s “content” is actually derived from that data.

Sure, if we are only talking about a single value on a page (such as a stock price or the current weather), not having it probably wouldn’t impact the understanding of the material or the consumption experience of the publication.  BUT consider something such as an embedded quiz in a textbook, where the questions themselves are coming in live…if they aren’t available (either live or cached) then the student can’t continue.

Leonard



On 8/13/15, 11:38 AM, "Bill Kasdorf" <bkasdorf@apexcovantage.com> wrote:

>Just a quick observation wrt Ivan's example of citations to external publications (my deliberate wording). The papers cited in a journal article (often scores and sometimes hundreds of them) are NOT part of "the publication," they are referenced by the publication. The citations are part of the publication; the cited resources are not.
>
>We need to be careful, in considering the concept of completeness, to distinguish between whether we are talking about "the publication itself" vs. "the publication and everything else it references."
>
>This also applies to dynamic content, e.g., a link that fetches up-to-date information (a stock price, the weather in Sydney, comments from other students in my class on what we're studying, etc.).
>
>In both of those cases, imo, it is reasonable to consider the
>publication complete (and, put the other way around, inappropriate to
>consider it incomplete) if those links/citations are present, even if
>they are not actionable at a given time (e.g., when the portable
>version of the publication is consumed offline), and whether or not the
>external content has been cached. This is _very_ important for
>publications like magazines and news publications. (Be careful to avoid
>reflexively thinking "books.")
>
>--Bill K
>
>-----Original Message-----
>From: Ivan Herman [mailto:ivan@w3.org]
>Sent: Thursday, August 13, 2015 12:55 AM
>To: Leonard Rosenthol
>Cc: W3C Digital Publishing IG; Bill McCoy
>Subject: Re: "Completeness" as a feature of a POW (aka EPUB+Web)??
>
>Leonard,
>
>good catch, the formulation is indeed not clear. Obviously, there is a need for external links to various things; to take the area of academic publication as an example, such a publication may include references to other papers, it may include references to research data (that may be too large to be included in the document), etc., and it is essential to keep the hyperlink nature of those references. In this sense, "completeness" is not meant to be "fully self-contained".
>
>I think that Bill's answer[1]:
>
>"portable documents "promise" a reliable consumption experience without respect of any particular server infrastructure and, especially, without such server infrastructure providing interactivity."
>
>is what I believe we all mean. I am not sure "idempotence"[2], proposed by
>Bill, is really the right term, but I do not have a better one at this
>point either:-(
>
>Thanks
>
>Ivan
>
>[1] https://lists.w3.org/Archives/Public/public-digipub-ig/2015Aug/0056.html
>[2] https://en.wikipedia.org/wiki/Idempotence
>
>
>> On 13 Aug 2015, at 02:31 , Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>
>> In rewriting the document about Portable Documents for the web (thanks for the suggestion & link, Tzviya), I came across the following paragraph:
>>
>>> EPUB can be viewed as simply defining a specialization of Web content that assures that a collection of content items has the needed properties of completeness and logical structure, and does so in a standard way that other processing tools and services can reliably create, manipulate, and present such collections. This completeness constraint is key for bridging the current gap between an online and offline/portable view of the same content (see <a href="#whynow">section on usage patterns</a> below).
>>>
>> While not spelled out here or in the “section on usage patterns”, I am going to take the terminology of “completeness” to mean “fully self-contained” (aka no external references).  If it means something else, feel free to ignore what follows (but only after you correct me :).
>>
>> In the current use cases for EPUB (books, magazines, etc.), the desire by the publisher to have everything contained inside the package is clearly key, just as that same property has been a tenet of the various PDF subset standards (PDF/A, PDF/X, etc.). However, there also exist PDF use cases where external references are a key aspect of the workflow, for example external content or color profiles in a variable or transactional workflow (e.g., PDF/VT). As such, I would like to suggest that, as a portable document for OWP, this POW (Portable Open Web) format also needs a provision for external references.
>>
>> I know that there have been discussions about this around EPUB in the past for large assets (e.g., video and audio), but I would put forth that the same principles could be applied to other types of content as well, be it advertisements in a publication, current data sets in a STEM publication, or even just a reference to the latest version of a common JS library used by the publication.
>>
>> What do others think about this?   Is completeness/self-contained a requirement in a POW?
>>
>> Thanks,
>> Leonard
>>
>
>
>----
>Ivan Herman, W3C
>Digital Publishing Activity Lead
>Home: http://www.w3.org/People/Ivan/
>mobile: +31-641044153
>ORCID ID: http://orcid.org/0000-0003-0782-2704
>
>
>
>

Received on Thursday, 13 August 2015 22:40:59 UTC