
Re: [DPUB] packaging requirements document

From: Ivan Herman <ivan@w3.org>
Date: Thu, 20 Aug 2015 11:10:13 +0200
Cc: Tzviya Siegman <tsiegman@wiley.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-Id: <4846D063-7F58-4BAA-85CF-760FD17E1138@w3.org>
To: Leonard Rosenthol <lrosenth@adobe.com>

> On 19 Aug 2015, at 17:08 , Leonard Rosenthol <lrosenth@adobe.com> wrote:
> 
> On 8/19/15, 10:42 AM, "Ivan Herman" <ivan@w3.org> wrote:
> 
> 
> 
>> I was fairly busy with other things today, so I could not spend too much time on this. I have some responses (and possible actions on the documents) below, but I cannot promise to take care of all of them now. To be continued tomorrow, if needed…
> 
> No problem - just wanted to make sure we delivered our document in a timely manner…
> 
> 
>>> On 18 Aug 2015, at 18:10 , Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>> 
>>> – Regardless of the fact that someone at the IETF thinks “archive” is the right term, in the document/publication space it is NOT.  I would strongly recommend that we NOT refer to that document or that terminology.
>> 
>> During the discussion on the mailing list we were asked to put a concise definition for a package into the document. (I believe what IETF considered as archive in their exploration for providing a top level media type for packages actually has a similar goal.) Do you have a better replacement?
> 
> I think “package” is the correct term, not archive.  I have reached out to the IETF to get them to change as well.

I have added a note that this is not the terminology we use, but the definition itself may still be helpful.

> 
> 
>>> - I have problems with this phrase “ This is, however, different from the cached state of a networked publication, which does not have a separate existence (though can also be used offline).”.  There are many ways to cache, some of which are related to browser-based technology and some of which are not.  But all of which constitute the concept of a “cached and offline” document.   How about just removing this.  I don’t think it adds anything, certainly not at this point in the document.
>>> 
>> 
>> The text (tries to) refer to browser based caches here.
> 
> And my point is that it should not do so, because there is no requirement that ONLY browser-based caches be used as part of the process of caching and/or taking a publication offline.  There is also no requirement that the cached state and the portable state be different.  I believe that it is important that this document be agnostic to the specific technology choices and focus on the goals and requirements.
> 
> 
>> Do you have a better way of formulating this?
> 
> I would just remove the sentence entirely as it adds nothing.
> 
> 

Having re-read the whole paragraph I think I agree that the sentence may be superfluous there, so let us remove it. But (in your original remark) you also questioned the three bullet items; I claim that the reference to the package is important there, and I think the definitions should stay as they are.

> 
> 
>>> Right, this is a bit more complicated. What I think was meant is that the rendering and possibly interactive part of the reading system is independent of the state, i.e., the change of state is indeed transparent.
> 
> Yes, I agree that the content should look/act the same independent of state.  Just say something like that :).
> 
> 
>>> - The phrase “ It should maintain its integrity over time” isn’t actually something that we, as the file format specification, have any control over. It is more about the media, systems, etc. in which the content is stored.  As such, it should be removed.
>> 
>> Hm. If I reboot my machine, the cache will disappear, but a portable document on my disc will remain. I am not sure what the problem is with this.
> 
> What you talk about is persistence, not integrity.  Integrity has to do with reliability and robustness, which are more tied to things such as media stability, data validation/checksumming, etc.
> 

I have changed this to persistence.

> And actually, there is nothing in the requirements that states that the cache goes away on a reboot.  That would be a specific implementation decision.
> 

That is true, I just used that as an example in my response...

> 
>>> - Are there no other requirements for the portable state?  I believe we had some in our existing use case/requirements specs.   If not, I can think of a few that I would add here.
>>> 
>> 
>> I would very much welcome that.
> 
> Here are a few…
> 

These are of course all valid use cases. As for the document:

> - Ability to distribute the publication via non-real-time methods ranging from email to sneaker-net

There is a reference to this in the opening paragraph ("Nevertheless, packages that exist (possibly) apart from the network still have a role to play as units that can be stored or transferred. This concept is essential with the current business models that dominate the publishing industry for, e.g., digital books."). I wonder whether this warrants a separate section.

> - Ability to read the publication behind a firewall or secured network
> - Ability to perform preflight & validation on a stable set of content

Again, these are valid use cases, but are they peculiar to the portable state? After all, online content should be reachable behind the firewall if one has the right access credentials; we could be misread as saying that a publication can only be consumed in the portable state when within the enterprise...


> 
> 
>>> This is something I will have to think more about. The issue is that the streamability may make something different depending on the state.
> 
> Again, I think we have a terminology problem here. What you describe is the ability to get live (or updated) data/content, which is a completely different requirement than streamability.  The ability to stream is well defined in the first sentence - “ It must be possible for a client to fetch components of a package in any order, or to fetch multiple components at the same time, without having to read the entire document”.  That has NOTHING to do with being connected or offline - it has to do with how the RS is able to access the content.
> 
> If you want to also add a separate requirement around the ability for the package to be able to specify that specific pieces of content (assets) within the package are not necessarily embedded but instead are retrieved live (with optional caching) - I think that would be a welcome requirement.  Or you could just merge this with "Updates new components only”?
> 

Reading it through again, I believe there is indeed a conflation of concepts. I have actually divided it up into three different sections:

- Streaming (following the definition of streaming on wikipedia[1])
- Random access to content
- External (non embedded) references, which is your last example

I moved some of the use cases around accordingly.


[1] https://en.wikipedia.org/wiki/Streaming_media


> 
>>> - In the Package in a Package section, you have “This is trivially available in online and cached states, but puts an extra requirement on portable states.”  This appears to be a copy/paste from elsewhere, as it doesn’t belong here because it’s simply not true in this case.  Please remove.
>> 
>> It is a copy paste indeed, but is it incorrect? (It may be superfluous, though).
> 
> Given that we don’t actually know what an “online state”, a “cached state” or even a “portable state” look like from a technical perspective - it is impossible to make any comment on the ability to implement such.
> 

You are right. I removed these types of comments. At some point, a more careful technical analysis will be needed for each of those requirements, but this is indeed not the place…

Thanks!

Ivan

> 
>>> - The Access to package section also has a similar note about “trivially available” which is also not true, and I would recommend removal as well.
>>> 
>> 
>> I have re-written that sentence in a way that, I believe, is correct…
> 
> I don’t see the change yet, but as with the previous statement - I don’t see how we can make any comments on implementation complexity until we know what is being implemented.
> 
> 
>> Thanks
> 
> Thank you for taking the time to review my comments.
> 
> 
> Leonard


----
Ivan Herman, W3C
Digital Publishing Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704





Received on Thursday, 20 August 2015 09:10:25 UTC
