Re: [DPUB] packaging requirements document

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Wed, 19 Aug 2015 15:08:48 +0000
To: Ivan Herman <ivan@w3.org>
CC: Tzviya Siegman <tsiegman@wiley.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <4E1CDE83-1158-49CB-BDEC-FE8B8231F528@adobe.com>
On 8/19/15, 10:42 AM, "Ivan Herman" <ivan@w3.org> wrote:

>I was fairly busy with other things today, so I could not spend too much time on this. I have some responses (and possible actions on the documents) below, but I cannot promise to take care of all of them now. To be continued tomorrow, if needed…

No problem - just wanted to make sure we delivered our document in a timely manner…

>>On 18 Aug 2015, at 18:10 , Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>– Regardless of the fact that someone at the IETF thinks “archive” is the right term, in the document/publication space it is NOT.  I would strongly recommend that we NOT refer to that document or that terminology.
>During the discussion on the mailing list we were asked to put a concise definition for a package into the document. (I believe what IETF considered as archive in their exploration for providing a top level media type for packages is actually of a similar goal.) Do you have a better replacement?

I think “package” is the correct term, not archive.  I have reached out to the IETF to get them to change as well.

>>- I have problems with this phrase: “This is, however, different from the cached state of a networked publication, which does not have a separate existence (though can also be used offline).” There are many ways to cache, some of which are related to browser-based technology and some of which are not, but all of which constitute the concept of a “cached and offline” document. How about just removing this? I don’t think it adds anything, certainly not at this point in the document.
>The text (tries to) refer to browser based caches here. 

And my point is that it should not do so, because there is no requirement that ONLY browser-based caches be used as part of the process of caching and/or taking a publication offline.  There is also no requirement that the cached state and the portable state be different.  I believe that it is important that this document be agnostic to the specific technology choices and focus on the goals and requirements.

>Do you have a better way of formulating this?

I would just remove the sentence entirely as it adds nothing.

>>Right, this is a bit more complicated. What I think was meant is that the rendering and possibly interactive part of the reading system is independent of the state, i.e., the change on that is indeed transparent.

Yes, I agree that the content should look/act the same independent of state.  Just say something like that :).

>>- The phrase “It should maintain its integrity over time” isn’t actually something that we, as the file format specification, have any control over. It is more about the media, systems, etc. in which the content is stored. As such, it should be removed.
>Hm. If I reboot my machine, the cache will disappear, but a portable document on my disc will remain. I am not sure what the problem is with this.

What you talk about is persistence, not integrity.  Integrity has to do with reliability and robustness, which are more tied to things such as media stability, data validation/checksumming, etc. 

And actually, there is nothing in the requirements that states that the cache goes away on a reboot.  That would be a specific implementation decision.
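
To make the persistence/integrity distinction above concrete, here is a minimal Python sketch of an integrity check. It assumes a hypothetical package whose manifest records a SHA-256 digest for each component; the manifest layout and the names `sha256_hex` and `verify_components` are illustrative assumptions, not part of any requirements document discussed here.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_components(components: Dict[str, bytes],
                      manifest: Dict[str, str]) -> List[str]:
    """Return the names of components that are missing or whose
    digest does not match the one recorded in the manifest."""
    bad = []
    for name, expected in manifest.items():
        data = components.get(name)
        if data is None or sha256_hex(data) != expected:
            bad.append(name)
    return bad


# Hypothetical package contents and a manifest built from them.
components = {
    "chapter1.html": b"<p>Hello</p>",
    "style.css": b"p { margin: 0 }",
}
manifest = {name: sha256_hex(data) for name, data in components.items()}

# Untouched content passes the check.
assert verify_components(components, manifest) == []

# The file still exists (it persisted), but its content changed,
# so the integrity check flags it.
components["style.css"] = b"p { margin: 1em }"
print(verify_components(components, manifest))
```

In this sketch a component can persist across reboots and still fail the check, which is why the two properties are worth keeping apart.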

>>- Are there no other requirements for the portable state?  I believe we had some in our existing use case/requirements specs.   If not, I can think of a few that I would add here.
>I would very much welcome that.

Here are a few…

- Ability to distribute the publication via non-real-time methods ranging from email to sneaker-net
- Ability to read the publication behind a firewall or secured network
- Ability to perform preflight & validation on a stable set of content

>>This is something I will have to think more about. The issue is that the streamability may make something different depending on the state. 

Again, I think we have a terminology problem here. What you describe is the ability to get live (or updated) data/content, which is a completely different requirement than streamability.  The ability to stream is well defined in the first sentence - “It must be possible for a client to fetch components of a package in any order, or to fetch multiple components at the same time, without having to read the entire document”.  That has NOTHING to do with being connected or offline - it has to do with how the RS is able to access the content.
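
As a rough illustration of streamability in this sense, the following Python sketch reads individual components out of a package in arbitrary order. It assumes, purely for the example, that the package is a ZIP container (whose central directory lets a reader locate one member without decompressing the others); the component names and the helpers `build_package` and `read_component` are hypothetical.

```python
import io
import zipfile


def build_package() -> bytes:
    """Build a small in-memory ZIP standing in for a publication package."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("chapter1.html", "<p>One</p>")
        zf.writestr("chapter2.html", "<p>Two</p>")
        zf.writestr("images/cover.svg", "<svg/>")
    return buf.getvalue()


def read_component(package: bytes, name: str) -> bytes:
    """Fetch a single named component; the other members are not decompressed."""
    with zipfile.ZipFile(io.BytesIO(package)) as zf:
        return zf.read(name)


package = build_package()

# Components can be fetched in any order, independent of one another.
second = read_component(package, "chapter2.html")
cover = read_component(package, "images/cover.svg")
print(second, cover)
```

Everything here happens in memory, with no network involved, which matches the point that streamability is about how content is accessed rather than about being online or offline.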

If you want to also add a separate requirement around the ability for the package to be able to specify that specific pieces of content (assets) within the package are not necessarily embedded but instead are retrieved live (with optional caching) - I think that would be a welcome requirement.  Or you could just merge this with “Updates new components only”?

>>- In the Package in a Package section, you have “This is trivially available in online and cached states, but puts an extra requirement on portable states.”  This appears to be a copy/paste from elsewhere, as it doesn’t belong here because it’s simply not true in this case.  Please remove.
>It is a copy paste indeed, but is it incorrect? (It may be superfluous, though).

Given that we don’t actually know what an “online state”, a “cached state” or even a “portable state” look like from a technical perspective - it is impossible to make any comment on the ability to implement such.

>>- The Access to package section also has a similar note about “trivially available” which is also not true, and I would recommend removal as well.
>I have re-written that sentence in a way that, I believe, is correct…

I don’t see the change yet, but as with the previous statement - I don’t see how we can make any comments on implementation complexity until we know what is being implemented.


Thank you for taking the time to review my comments.

Received on Wednesday, 19 August 2015 15:09:20 UTC