
Re: [Glossary] Definition of a portable document (and other things...)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 9 Sep 2015 18:24:35 +0200
Cc: W3C Digital Publishing IG <public-digipub-ig@w3.org>, Deborah Kaplan <dkaplan@safaribooksonline.com>, Liam Quin <liam@w3.org>, Leonard Rosenthol <lrosenth@adobe.com>, Ralph Swick <swick@w3.org>, Olaf Drümmer <olaf@druemmer.com>
Message-Id: <508DEECA-A5A1-4A66-BDDD-EF3CE673416D@w3.org>
To: Bill McCoy <bmccoy@idpf.org>
Bill,

just a quick note before I call it a day…

The reason I did not include checksums is that they seem very technology specific, and we have avoided that until now. I hear you on the "format(s) of all of its constituent Web Resources are finite and enumerable" point, and I would be happy to extend the Portable Web Document definition to include something, except… I am not sure what, because I am not sure that this formulation is the right one. I guess we should try to come up with better terminology to express this "semi-static" nature without ruling out the use of, say, a JavaScript program that, for example, generates a visual display of the Fibonacci series, which is potentially infinite:-).
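
To make the checksum idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not something proposed in this thread: the names `WebResource`, `manifest` and `document_fingerprint` are hypothetical, SHA-256 is an arbitrary choice of hash, and it presumes exactly what is under debate, namely that the constituent resources and their formats can be enumerated up front.

```python
import hashlib
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class WebResource:
    """Hypothetical model: one constituent resource with a fixed representation."""
    uri: str
    media_type: str
    content: bytes

    def checksum(self) -> str:
        # SHA-256 is an assumption; any stable digest would do for the sketch.
        return hashlib.sha256(self.content).hexdigest()


def manifest(resources: Iterable[WebResource]) -> List[Tuple[str, str, str]]:
    """Enumerate (uri, media type, checksum) for every resource, sorted by URI."""
    return sorted((r.uri, r.media_type, r.checksum()) for r in resources)


def document_fingerprint(resources: Iterable[WebResource]) -> str:
    """Digest of the whole manifest; independent of the order resources are given in."""
    h = hashlib.sha256()
    for uri, media_type, digest in manifest(resources):
        h.update(f"{uri}\n{media_type}\n{digest}\n".encode("utf-8"))
    return h.hexdigest()
```

A script that computes its output at run time (the Fibonacci example) would still appear here as one fixed resource, the script file itself; whether that is enough to count as "finite and enumerable" is exactly the open question.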

Maybe somebody will come up with good terminology while I am sleeping:-)

But we are getting there. Once this is nailed down and we can consider these definitions fine, we still have other things to nail down properly, such as what it means to be in a portable or cached state, or how to identify a Portable Web Document with a URI using the FRBR model (or something else:-)

Cheers

Ivan



> On 09 Sep 2015, at 18:10 , Bill McCoy <bmccoy@idpf.org> wrote:
> 
> Ivan,
> 
> I support your revised definition.
> 
> I think there is still something missing re: what Leonard and I were, it seems, agreeing on (!) wrt checksums ... that there is a very high degree of specificity of a particular Portable [Web] Document (which is implicitly provided for PDF and EPUB by the packaging enclosure).
> 
> Starting from your latest definition, I would only add to the "Portable Web Document" definition that the "format(s) of all of its constituent Web Resources are finite and enumerable", or something to that effect. If the formats of resources are dynamic, then the result (to me) cannot be considered "portable", because "delivery of essential content and functionality" is squishy, and too low a bar for archival and multi-channel distribution use cases. Portability to me is essentially mechanical: it means you can move stuff from place to place, server to server, server to client, client to server, archive it offline, etc.
> 
> And this arguably supports Deborah's suggestion to separately define "Portable Web Resource", but in my model that would be any Web Resource whose format(s) are finite and enumerable, i.e. almost the same as "static" content (the representations could be produced by a server-side program, for example retrieved from a CMS DB, but since they are finite and enumerable that server-side program is by definition not essential). Then a "Portable Web Document" is just a "Web Document" composed of "Portable Web Resources"; nothing else need be said (the graceful degradation part is just a logical consequence).
> 
> I just don't see why a "Portable Web Document" should be less precisely specified than a "Portable Document" that is not Web-based, and clearly these (in both PDF and EPUB forms, really in any packaged format) offer that stronger guarantee.
> 
> But maybe I'm in the minority on this; if so, again, I support your revised definition as an improvement. If the group does want to bless a squishier definition of "Portable Web Document", then perhaps we could also choose a term to define what I am talking about, something more precisely nailed down. I'm just not sure why we need both.
> 
> --Bill
> 
> 
> 
> On Wed, Sep 9, 2015 at 5:43 AM, Ivan Herman <ivan@w3.org> wrote:
> Hi everybody,
> 
> I will again try to play the role of providing summaries:-) The fact that I am on the other side of the pond from most of you means that I get a whole lot of emails in the morning, so I can do it...
> 
> As before, I will try to come up with my synthesis for the next round of discussions. I started with Deborah's proposal [1], which seems to summarize many of the points raised so far. Let me give my slightly different version, followed by my comments. I am a little bit bothered that this definition has become much longer than what I summarized last time [2], but maybe this is just the nature of the beast...
> 
> Here it is:
> 
> [[[
> * A **Web Resource** is a digital resource that can be uniquely addressed by a Uniform Resource Identifier (URI), and whose content can be accessed through standard protocols like HTTP, FTP, etc.
> 
> * **Essential content** of a Web Resource: if removed, would fundamentally change the information or functionality of the content.
> 
> * **Functionality** related to a Web Resource: processes and outcomes achievable through user action.
> 
> * A **Web Document** is a Web Resource which itself is a collated set of interrelated Web Resources and which is intended to be seen as a single Web Resource.
> 	* A Web Document *should* be constructed of resources whose formats enable (individually or in conjunction with other resources in the same Web Document) delivery of essential content and functionality when delivered via a variety of technologies or delivery platforms.
> 	* A Web Document *should* provide a gracefully degrading experience when delivered via a variety of technologies or delivery platforms.
> 	* A Web Document *should* provide accessible access to content.
> 	* A Web Document is *not* an object with a precise technical meaning, e.g., it is not equivalent to an HTML Document.
> 
> * A **Portable Web Document** is a Web Document which contains, within its constituent set, the information necessary to provide delivery of essential content and functionality, or a graceful degradation thereof, without the presence of any other Web Resources.
> ]]]
> 
> And here are my comments on a number of points, in no particular order:
> 
> * I agree with Leonard's comment on [1] that an explicit reference to WCAG is not appropriate in a definition. There may be resources that the WCAG does not address, it may be a moving target with different versions, and we try to keep away from specific technologies anyway.
> 
> * I also agree with Leonard that the 'graceful degradation' aspect of delivering a portable resource is essential and we should not remove it from the definitions. In fact, it may be considered to be implied in [1] (looking at the terms essential content and functionality), but it does no harm to make it explicit.
> 
> * The reason why RDF 1.1 (which Deborah referred to) greatly reduced the complexity of its definition of a resource was, if I remember well, pure pragmatism. From an RDF point of view, the fact that a resource has a unique identifier in the form of a URI is all that counts. Any attempt to give a more precise meaning may (ahem, does...) lead to endless discussions. I think this pragmatism is a good idea here, too. Actually, I tried to restrict the terminology even further by referring to Web Resources; in the RDF model, *anything* can be a resource (including natural persons like Ivan Herman), and we should not go there imho. On the other hand, having a definition for what we mean by a resource sounded like a good idea, so I added that.
> 
> * There is a major discussion coming up later: a URI is not necessarily a URL. It may of course be a URL, but it can also be a URN, which then includes DOIs, ISBNs, etc. This will require a much finer set of definitions (or finding them in the literature) because, obviously, DOIs or ISBNs should be usable to identify a Web Document. The list of terms in the initial glossary [3] includes the document identifier as a term to be defined. I would propose *not* to get into this particular discussion for now; it is on the list of terms to define, but let us take one step at a time... (B.t.w., the checksum idea, raised by Bill & Leonard, may come back at that point.)
> 
> * I was not sure about the choice among 'curate', 'collate', etc. I am sensitive to Bill's arguments, so I have taken 'collate'.
> 
> * Maybe the biggest departure from Deborah's definition: I must admit I was not convinced of the necessity of having a separate definition of a 'Portable Resource'. I did not see what it brings us...
> 
> * I added "delivery platforms" to the should-s, to make it clear that we also include, e.g., different types of displays. I had the impression in the thread that we were too focused on accessibility issues and did not really consider other types of access problems. It may be unnecessary, though; I am sure someone will tell me:-)
> 
> * The original text had: "identified as a single document by the curator." I was not sure about this formulation, and we also dropped 'curation'. I went back to one of Olaf's mails, which instead emphasized the 'intention' of combining the resources into one Web Document. I think the meaning is fairly similar, it just sounded better:-)
> 
> * I hesitated between Deborah's proposal of "presence of any other digital content" and Olaf's "independent of any specific infrastructure". I stayed with the former because it seemed to be more generic...
> 
> * I tried to use only the term "Web Resource" everywhere and not use the term "digital content". Just to be consistent...
> 
> Gonggg... Third round! :-)
> 
> Ivan
> 
> 
> [1] http://www.w3.org/mid/alpine.WNT.2.00.1509081456590.5472@DKaplan.safarijv.com
> [2] http://www.w3.org/mid/E51A8C8A-FD5B-4BB3-B7EA-38B94AC4736F@w3.org
> [3] https://www.w3.org/dpub/IG/wiki/Glossary
> 
> 
> 
> ----
> Ivan Herman, W3C
> Digital Publishing Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> ORCID ID: http://orcid.org/0000-0003-0782-2704
> 
> 
> 
> 
> 
> 
> 
> --
> 
> Bill McCoy
> Executive Director
> International Digital Publishing Forum (IDPF)
> email: bmccoy@idpf.org
> mobile: +1 206 353 0233
> 


----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704





Received on Wednesday, 9 September 2015 16:24:50 UTC
