Re: For the discussion on the PWP from Ivan Herman on 2016-11-22 (public-digipub-ig@w3.org from November 2016)

From: Ivan Herman <ivan@w3.org>
Date: Tue, 22 Nov 2016 17:10:58 +0100
To: Hadrien Gardeur <hadrien.gardeur@feedbooks.com>
Cc: W3C Digital Publishing IG <public-digipub-ig@w3.org>, Ric Wright <rkwright@geofx.com>, Laurent Le Meur <laurent.lemeur@edrlab.org>
Message-Id: <0B671DC4-96BE-42B8-8F63-BA1D6CF5E9D2@w3.org>

Hadrien,

sorry for the late reply; actually, some of the issues may be moot by virtue of yesterday's discussion.

The kind of use case that was discussed many months ago, and that influenced the discussion, was something like this (I will try to reproduce the issue; I am sure that Ben, Leonard, or others who were part of the discussion will correct me if I get it wrong).

Imagine publisher P published a WP. Then library L makes a copy of the WP on their system, which user U has access to. Whether the WP is packaged or not, and whether the WP is offline or not, should not be an issue. Another user B then wants to make a reference to a resource within that WP; the question is: what reference, i.e., what URL, does B have to use so that U would be able to handle it automatically within his/her reading system? (B does not know the URI of the copy U has access to on L.) The idea was to have an environment whereby a reading system (e.g., a browser) of L, when getting the right URI (the one we want to define), would automatically find out that this reference is actually related to the WP that is used by U.

The approach we were considering is that the WP has a canonical URI for it (which is probably the one provided by P), and this canonical URI is part of the WP's manifest. The reading system would be responsible, in some sense, for the transformation of the URI sent by B into the local address on L.
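
To make that transformation a bit more tangible, here is a minimal sketch in Python. It is only an illustration under assumptions: the manifest key "canonical", the function name to_local, and the library URL are all hypothetical, and nothing in the draft prescribes them. It shows a plain prefix rewrite from the canonical URI to the local copy, which is one possible reading of what the reading system would have to do.

```python
from typing import Optional


def to_local(reference: str, canonical: str, local_base: str) -> Optional[str]:
    """Map a reference under the canonical URI onto the local copy.

    Returns None when the reference does not belong to this publication.
    """
    # Normalise both bases so that they end with exactly one slash.
    canonical = canonical.rstrip("/") + "/"
    local_base = local_base.rstrip("/") + "/"
    # A reference to the publication itself maps to the root of the copy.
    if reference.rstrip("/") + "/" == canonical:
        return local_base
    if not reference.startswith(canonical):
        return None
    # Keep the rest of the reference (path, query, fragment) untouched.
    return local_base + reference[len(canonical):]


# Hypothetical manifest of the copy held by library L (key name is assumed).
manifest = {"canonical": "https://example.org/books/1/"}
# Hypothetical address of that copy on L's system.
local_base = "https://library.example/wp/1/"

# The reference sent by B, expressed against the canonical URI.
print(to_local("https://example.org/books/1/img/mona_lisa.jpg",
               manifest["canonical"], local_base))
```

The reverse direction (from a local address back to the canonical one) would be the same function with the two bases swapped.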

The original design was more complicated because there was still a rigid separation between, say, offline and online. If we simply forget about this (essentially because a transformation from online to online is done almost automatically by a service worker, for example) then part of the complication disappears.

But the complication you describe for the packaged state is still around if we let things be modified, and we may want to simplify and, therefore, forget about some of the issues. After all, we may be stepping into the area of FRBR, on what the URI of the 'work' is, etc. So we may decide to give up on that level of complication; I must admit that I do not have a clear opinion at this moment.

Maybe this helps, though I am not sure…

Ivan



> On 16 Nov 2016, at 11:36, Hadrien Gardeur <hadrien.gardeur@feedbooks.com> wrote:
> 
> Hello Ivan,
> 
> I'm adding Ric & Laurent since this also concerns reading systems directly.
> 
> It's not entirely clear to me what canonical locators are used for; a few things come to mind:
> - on the Web, a link@rel="canonical" is used when multiple URIs return the same resource; the URI referenced in that link is then considered to be the canonical location for that resource
> - it seems that there's also a use case for packaged publications, where you might want to update individual resources by keeping the original URI in the manifest
> - and finally you seem to describe a use case based around redirections, if I'm not misunderstanding your previous email
> 
> When the term "canonical locator" is used, I tend to think strictly about the first use case, for which I'm not sure that we need to do much. Shouldn't we simply let HTTP do its job and eventually provide a Link header with rel="canonical"?
> 
> For the second use case, this involves packaged publications and reading systems and becomes potentially complex:
> - first of all, using "hrefsrc" or a similar key should work fairly well for that purpose
> - in order to update individual resources in the package, we could then rely on the URI in "hrefsrc", but I don't know when/how this content should be updated and what happens if the original version is deleted/modified without proper HTTP status codes being returned
> - overall, I think it's much easier to update the package as a whole, by keeping a link to the original manifest in the packaged version
> - but intercepting "https://example.org/books/1/img/mona_lisa.jpg" and serving "/img/mona_lisa.jpg" from the package instead isn't necessarily easy for a reading system. Since most of them are based on a webview, you would either have to:
>   - dynamically generate a Service Worker for each publication based on the info that you extract from the manifest, and then inject that SW in the publication's resources. This means that you need to serve all your local resources using HTTPS and use a webview that supports SW, which might be tricky on some platforms (iOS for instance). I also need to double-check whether SW work on localhost and on any port.
>   - the other option would be to rewrite all URIs referenced in the manifest and used in the publication's resources, which is something that IMO we'd like to avoid with Readium for instance
> - same problem the other way around if we'd like to say something like "prioritize the resources available on the Web vs those in the package"
> 
> While I can understand the potential benefits if we can figure this out, this might be a very challenging problem to solve for people building reading systems.
> 
> Am I missing anything or misunderstanding the use cases for canonical locators?
> 
> Thanks,
> Hadrien
> 
> 2016-11-15 18:00 GMT+01:00 Ivan Herman <ivan@w3.org>:
> Hi Hadrien,
> 
>> On 15 Nov 2016, at 17:34, Hadrien Gardeur <hadrien.gardeur@feedbooks.com> wrote:
>> 
>> Hello Ivan,
>> 
>> Just a quick note: this document uses "pwp_manifest" as the rel value to discover a manifest, but I believe that we should actually use the same rel value ("manifest") as the Web App Manifest, just with a different media type.
>> We don't really need a dedicated relationship for PWP since the relationship isn't affected by the format of the manifest.
> 
> Probably. To be honest, the document did not really go into these details, nor am I sure it should (this may just be an input to a possible WG, and the details will have to be clarified at that point).
> 
> But I am fine changing it right now. Can you make a pull request?
> 
>> 
>> For the canonical locator, I'm still not sure that I understand fully what this will be used for (there are potentially a lot of use cases), but could this behave slightly like "hreflang", by providing a hint on a link?
>> 
>> For example:
>> {"href": "img/mona_lisa.jpg", "hrefsrc": "https://example.org/books/1/img/mona_lisa.jpg", "type": "image/jpeg"}
>> 
> 
> Yes, except that it is probably two-directional. (But all this is still/again a bit in the air.) Two-directional in the sense that if a renderer receives https://example.org/books/1/img/mona_lisa.jpg then it should get to "img/mona_lisa.jpg". A functionality that may be covered by a SW in the background, actually; that text was written before we _really_ dived into the SW world. Let alone the fact that SW may not be the _only_ implementation vehicle.
> 
> Cheers
> 
> Ivan
> 
> 
> 
>> Hadrien
>> 
>> 2016-11-15 17:16 GMT+01:00 Ivan Herman <ivan@w3.org>:
>> I made a first re-shuffling of the PWP draft
>> 
>> http://w3c.github.io/dpub-pwp/
>> 
>> mostly along the lines of
>> 
>> https://github.com/w3c/dpub-pwp/blob/gh-pages/TODO.md
>> 
>> 'Mostly', because I made copy-pastes from the previous version and some of the items in the TODO are in a single section now. But, I believe, the content is there.
>> 
>> I will not touch this document until next week and, afaik, a more detailed discussion will happen on the call on Monday. Until then, have a look at it and, of course, feel free to contribute to the text!
>> 
>> Ivan
>> 
>> ----
>> Ivan Herman, W3C
>> Digital Publishing Technical Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>> 
>> --
>> Hadrien Gardeur
>> Co-founder, Feedbooks
>> http://www.feedbooks.com
>> T: +33.6.63.28.59.69
>> E: hadrien.gardeur@feedbooks.com
>> 54, rue de Paradis
>> 75010 Paris, France


----
Ivan Herman, W3C
Digital Publishing Technical Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Tuesday, 22 November 2016 16:11:18 UTC