
Re: [Glossary] Definition of a portable document (and other things...)

From: Bill McCoy <bmccoy@idpf.org>
Date: Wed, 9 Sep 2015 09:35:59 -0700
Message-ID: <CADMjS0aV3TnSp9WwM=+sM2uX082MH3Zw_GUc+oCTQE72U8tn7w@mail.gmail.com>
To: Ivan Herman <ivan@w3.org>
Cc: W3C Digital Publishing IG <public-digipub-ig@w3.org>, Deborah Kaplan <dkaplan@safaribooksonline.com>, Liam Quin <liam@w3.org>, Leonard Rosenthol <lrosenth@adobe.com>, Ralph Swick <swick@w3.org>, Olaf Drümmer <olaf@druemmer.com>
Yes, I did not actually propose to use checksums in the definition; that was
a motivating thought experiment only. And I am not particularly attached to
my "formats are finite and enumerable" language either and would welcome
something better (I also thought of the term "knowable" but decided against
it lest we delve too far into epistemology).

But to be clear on what I meant I want to drill down on your example.

If said JavaScript program is to be run on the receiving system, aka user
agent aka reading system, then in REST architectural style terms, this is what
Roy Fielding defined as "code-on-demand". In this case the *format* of the
resource is the JS code itself not the side-effects of its execution - so
sure it can be portable per my definition and usually would be (unless the
code itself were dynamically generated by another program, or referenced in
a loose way by URI that is not specific as to version etc.). That there are
infinite potential results of executing that code on the user agent doesn't
impair portability and in fact to me it is precisely this that makes a
portable web document special with respect to interactivity vs. an
arbitrary Web app - a portable web document can be highly interactive, but
all the interactive elements must themselves be portable. OTOH if that
JavaScript code in your example was designed to run only on the server via
e.g. Node.js, then no, the resulting Web document would not (in my view) be
portable at all (well, since portability is not binary, the dependency on
such server-side logic would be a ding to that document's portability
rating, which might be more or less determinative depending on how
important that dynamic Fibonacci display is to the document as a whole).
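
To make the distinction concrete, here is a minimal sketch in Python. It is
only an illustration under stated assumptions, not a proposed definition: the
Resource class, the portability_rating function, the example.org URIs and the
"fraction of essential resources" metric are all hypothetical names and
choices of mine. It treats a resource as "finite and enumerable" when its
representation is a fixed byte string that can be copied and checksummed, and
shows that a client-side script has a fixed format even though its output is
unbounded.

```python
import hashlib
from dataclasses import dataclass
from itertools import islice
from typing import Iterator, Optional, Sequence


@dataclass(frozen=True)
class Resource:
    """Hypothetical model of a constituent Web Resource."""
    uri: str
    # Fixed bytes when the format is finite and enumerable;
    # None when the representation only exists via server-side logic.
    representation: Optional[bytes]
    essential: bool = True

    def checksum(self) -> Optional[str]:
        """SHA-256 of the representation, or None if there is nothing fixed to hash."""
        if self.representation is None:
            return None
        return hashlib.sha256(self.representation).hexdigest()


def portability_rating(resources: Sequence[Resource]) -> float:
    """Illustrative (assumed) metric: share of essential resources with fixed bytes."""
    essential = [r for r in resources if r.essential]
    if not essential:
        return 1.0
    fixed = sum(1 for r in essential if r.representation is not None)
    return fixed / len(essential)


def fibonacci() -> Iterator[int]:
    """Unbounded output from a finite piece of code."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# Code-on-demand: the script travels with the document as plain bytes.
FIB_JS = b"function* fib(){let a=0,b=1;for(;;){yield a;[a,b]=[b,a+b];}}"

client_doc = [
    Resource("https://example.org/doc/index.html", b"<html>...</html>"),
    Resource("https://example.org/doc/fib.js", FIB_JS),
]

# Same document, but the Fibonacci display depends on a server-side program.
server_doc = [
    Resource("https://example.org/doc/index.html", b"<html>...</html>"),
    Resource("https://example.org/doc/fib", None),
]

if __name__ == "__main__":
    print(portability_rating(client_doc), portability_rating(server_doc))
    print(list(islice(fibonacci(), 8)))
```

In this sketch the checksum of fib.js is stable no matter how many numbers the
script eventually produces, while the server-side variant has nothing fixed to
checksum and so lowers the rating. Marking the Fibonacci resource as
non-essential would restore the rating, which matches the point that the ding
depends on how important the display is to the document as a whole.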


On Wed, Sep 9, 2015 at 9:24 AM, Ivan Herman <ivan@w3.org> wrote:

> Bill,
> just a quick note before I call it a day…
> The reason I did not include checksums is because it seems very technology
> specific, and we avoided that until now. I hear you vs. the "*format(s)
> of all of its constituent Web Resources are finite and enumerable"*
> thing, and I would be happy to extend the Portable Web Document definition
> to include something except… I am not sure what, because I am not sure that
> this formulation is the right one. I guess we should try to come up with a
> better terminology to express this "semi-static" nature without
> compromising the usage of, say, a JavaScript program that, for example,
> generates a visual display of the Fibonacci series, which is potentially
> infinite:-).
> Maybe somebody will come up with a good terminology while I am sleeping:-)
> But we are getting there. If this is nailed down and we can consider these
> definitions fine, we have still other things to properly nail down, like
> what it means to be in a portable or cached state, or how do I identify a
> Portable Web Document with a URI using the FRBR model (or something else:-)
> Cheers
> Ivan
> On 09 Sep 2015, at 18:10 , Bill McCoy <bmccoy@idpf.org> wrote:
> Ivan,
> I support your revised definition.
> I think there is still something missing re: what Leonard and I were, it
> seems, agreeing on (!) wrt checksums ... that there is a very high degree
> of specificity of a particular Portable [Web] Document (which is implicitly
> provided for PDF and EPUB by the packaging enclosure).
> Basically from your latest definition I would only add to "Portable Web
> Document" definition that the "*format(s) of all of its constituent Web
> Resources are finite and enumerable*" or something to that effect.
> Basically if the formats of resources are dynamic, then the result (to me)
> cannot be considered "portable" , because to me "delivery of essential
> content and functionality" is squishy, and too low a bar for archival and
> multi-channel distribution use cases. Portability to me is essentially
> *mechanical* - it means you can move stuff from place to place, server to
> server, server to client, client to server, archive it offline, etc.
> And this arguably supports Deborah's suggestion to separately define
> "Portable Web Resource" - but in my model, that would be any Web Resource
> whose format(s) are finite and enumerable. I.e.  almost the same as
> "static" content (although the representations could be produced by a
> server-side program, for example retrieved from a CMS DB, since they are
> finite and enumerable that server-side program is by definition not
> essential). Then "Portable Web Document" is just a "Web Document" comprised
> of "Portable Web Resources", nothing else need be said (the graceful
> degradation part is just a logical consequence).
> I just don't see why a "Portable Web Document" should be less precisely
> specified than a "Portable Document" that is not Web-based, and clearly
> these (in both PDF and EPUB forms, really in any packaged format) offer
> that stronger guarantee.
> But maybe I'm in the minority on this, if so, again I support your revised
> definition as an improvement. If the group does want to bless a squishier
> definition of "Portable Web Document" then perhaps we could also choose a
> term to define what I am talking about - something more precisely nailed
> down. I'm just not sure why we need both.
> --Bill
> On Wed, Sep 9, 2015 at 5:43 AM, Ivan Herman <ivan@w3.org> wrote:
>> Hi everybody,
>> I again try to play the role of providing summaries:-) The fact that I am on
>> the other side of the pond compared to most of you means that I get a whole
>> lot of emails in the morning, so I can do it...
>> As before, I will try to come up with my synthesis for the next round of
>> discussions. I started with Deborah's proposal[1] which seems to summarize
>> many points up to that point. Let me give my slightly different version,
>> and give my comments below. I am a little bit bothered that this definition
>> becomes way longer than what I summarized last time[2], but maybe this is
>> just the nature of the beast...
>> Here it is:
>> [[[
>> * A **Web Resource** is a digital resource that can be uniquely addressed
>> by a Uniform Resource Identifier (URI), and whose content can be accessed
>> through standard protocols like HTTP, FTP, etc.
>> * **Essential content** of a Web Resource: if removed, would
>> fundamentally change the information or functionality of the content.
>> * **Functionality** related to a Web Resource: processes and outcomes
>> achievable through user action.
>> * A **Web Document** is a Web Resource which itself is a collated set of
>> interrelated Web Resources and which is intended to be seen as a single Web
>> Resource
>> * A Web Document *should* be constructed of resources whose formats
>> enable (individually or in conjunction with other resources in the same Web
>> Document) delivery of essential content and functionality when delivered
>> via a variety of technologies or delivery platforms.
>> * A Web Document *should* provide a gracefully degrading experience when
>> delivered via a variety of technologies or delivery platforms.
>> * A Web Document *should* provide accessible access to content.
>> * A Web Document is *not* an object with a precise technical meaning,
>> e.g., it is not equivalent to an HTML Document.
>> * A **Portable Web Document** is a Web Document which contains, within
>> its constituent set, the information necessary to provide delivery of
>> essential content and functionality, or a graceful degradation thereof,
>> without the presence of any other Web Resources.
>> ]]]
>> And here are my comments on a number of points, a bit in an unorderly
>> manner:
>> * I agree with Leonard's comment on [1] that an explicit reference to
>> WCAG is not appropriate in a definition. There may be resources that the
>> WCAG does not address, it may be a moving target with different versions,
>> and we try to keep away from specific technologies anyway.
>> * I also agree with Leonard that the 'graceful degradation' aspect at
>> delivery of a portable resource is essential and we should not remove it
>> from the definitions. In fact, it may be considered to be in [1] (looking
>> at the term of essential content and functionality) but it does no harm to
>> make it explicit.
>> * The reason why RDF1.1 (that Deborah referred to) has greatly reduced
>> the complexity of its definition of a resource was, if I remember well,
>> pure pragmatism. From an RDF point of view the fact that it has a unique
>> identifier in terms of a URI is all that counts. Any attempt to give a more
>> precise meaning may (ehem, does...) lead to an infinite amount of
>> discussions. I think this pragmatism is a good idea here, too. Actually, I
>> tried to restrict the terminology even further by referring to Web
>> resources; in the RDF model, *anything* can be a resource (including
>> natural persons like Ivan Herman), and we should not go there imho. On the
>> other hand, having a definition for what we mean by a resource sounded like
>> a good idea, so I added that.
>> * There is a major discussion coming up later: a URI is not necessarily a
>> URL. It may of course be a URL, but can also be a URN, which then
>> includes DOI-s, ISBN-s, etc. This will require a much finer set of
>> definitions (or find them in the literature) because, obviously, DOI-s or
>> ISBN-s should be usable to identify a Web Document. The list of terms on the
>> initial glossary [3] includes the document identifier as a term to be
>> defined. I would propose *not* to get into this particular discussion for
>> now; it is on the list of the terms to define, but let us take one step at
>> a time... (B.t.w., the checksum idea, raised by Bill & Leonard, may come
>> back at that point.)
>> * I was not sure about the choice among 'curate', 'collate', etc. I am
>> sensitive to Bill's arguments, so I have taken 'collate'.
>> * Maybe the biggest departure from Deborah's definition: I must admit I was
>> not convinced by the necessity of having a separate definition of a
>> 'Portable Resource'. I did not see what it brings us...
>> * I added "delivery platforms" to the should-s, to make it clear that
>> we also include, eg, different types of displays. I had the impression in
>> the thread that we were too focussed on accessibility issues and we did not
>> really consider other types of access problems. It may be unnecessary,
>> though, I am sure someone will tell me:-)
>> * The original text had: "identified as a single document by the
>> curator." I was not sure about this formulation, and we also dropped
>> 'curation'. I went back to one of Olaf's mails, which emphasized the
>> 'intention' of combining the resources into one Web Document instead. I
>> think the intention is fairly similar, it just sounded better:-)
>> * I hesitated between Deborah's proposal on "presence of any other
>> digital content" and Olaf's "independent of any specific infrastructure". I
>> stayed with the former because it seemed to be more generic...
>> * I tried to use only the term "Web Resource" everywhere and not use the
>> term digital content. Just to be consistent...
>> Gonggg... Third round! :-)
>> Ivan
>> [1]
>> http://www.w3.org/mid/alpine.WNT.2.00.1509081456590.5472@DKaplan.safarijv.com
>> [2] http://www.w3.org/mid/E51A8C8A-FD5B-4BB3-B7EA-38B94AC4736F@w3.org
>> [3] https://www.w3.org/dpub/IG/wiki/Glossary
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> ORCID ID: http://orcid.org/0000-0003-0782-2704
> --
> Bill McCoy
> Executive Director
> International Digital Publishing Forum (IDPF)
> email: bmccoy@idpf.org
> mobile: +1 206 353 0233


Bill McCoy
Executive Director
International Digital Publishing Forum (IDPF)
email: bmccoy@idpf.org
mobile: +1 206 353 0233
Received on Wednesday, 9 September 2015 16:36:34 UTC
