Re: [DPUB][Locators]Cancellation and Next Steps from Leonard Rosenthol on 2016-01-25 (public-digipub-ig@w3.org from January 2016)

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Mon, 25 Jan 2016 13:01:04 +0000
To: Ivan Herman <ivan@w3.org>, Romain <rdeltour@gmail.com>
CC: Bill Kasdorf <bkasdorf@apexcovantage.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-ID: <132AA25F-99A7-4881-8B33-6155A442ECF1@adobe.com>
Ivan, I’ve read over this a few times and while I can see the benefits of your model, I am still not 100% convinced that it will actually work in the real world (on either clients or servers). It seems like something we’ll need to actually try out…

But let’s use it as a basis for conversation.

If the URL for the publication (in whatever state) is “http://ex.org/ThePublication”, then if I want that picture of the Mona Lisa, which I happen to know (maybe from the manifest?) is in the images directory of that publication, can I then reference it directly as http://ex.org/ThePublication/images/MonaLisa.jpg?  I would think so, yes?   So I wouldn’t need any special !, #, etc, correct?


>I could imagine that I make a local copy of a publication in a package, which includes the author-provided manifest, but I would like to add some additional information valid for my site only (eg, other redirections) that I 'attach' to the publication through the GET without modifying the package. Ie, both can be useful.
>
While I can see the usefulness of being able to do that – it also scares me as both a content author and a provider of security solutions.  If some process starts returning a different set of bits, then they won’t match for signatures, etc.


Leonard

From: Ivan Herman <ivan@w3.org>
Date: Monday, January 25, 2016 at 5:32 AM
To: Romain <rdeltour@gmail.com>
Cc: Leonard Rosenthol <lrosenth@adobe.com>, Bill Kasdorf <bkasdorf@apexcovantage.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Subject: Re: [DPUB][Locators]Cancellation and Next Steps


On 21 Jan 2016, at 18:10, Romain <rdeltour@gmail.com> wrote:


On 21 Jan 2016, at 07:28, Ivan Herman <ivan@w3.org> wrote:

I think that is where HTTP content negotiation should come into the picture.
(snip)
The only thing this mechanism requires is to have a distinct media type assigned to a PWP (akin to the media type for EPUB) and, probably, to the manifest. If that is there, a client may express the media types it accepts, and even the relative priority of what format it prefers (if the client has several).

This allows for a setup where there is *no* packaged form around. And content negotiation is a mechanism implemented by all servers and clients these days, so we should just use it…
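
As a rough illustration of the negotiation described above, here is a minimal sketch of how a server might pick a state from the client's Accept header. The two media types are hypothetical placeholders (no media type has been assigned to a PWP or its manifest), and the parsing is deliberately simplified: it honours q-values and "*/*" but ignores partial wildcards such as "application/*" and other media type parameters.

```python
# Hypothetical media types; nothing has been registered for PWP.
PACKAGED = "application/pwp+zip"
MANIFEST = "application/pwp-manifest+json"


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def choose_state(accept_header, available):
    """Return the first available media type the client accepts, or None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
    return None
```

A client that prefers the manifest but can also handle the package could send "application/pwp-manifest+json, application/pwp+zip;q=0.5"; a server holding only the packaged form would then answer with the package.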

I'm not convinced by the content negotiation approach. The format to get (expanded vs. packaged) may depend on the user's intent (e.g. whether she wants to start reading the book or download it to share on a USB key); the decision is not necessarily under the responsibility of client software.
I believe it would be easier if the two formats were represented by two different URLs, which has the benefit of working in run-of-the-mill browsers.

I think for a number of potential applications (like, for example, assigning annotations) it is important to have the same URL for the various formats ('states', as we referred to them).

The 'usual' approach taken by content negotiation is something like:

- http://ex.org/ThePublication - is the URL of the resource
- http://ex.org/ThePublication.pack - is the URL of the packaged version (if any)
- http://ex.org/ThePublication.unpack - is the URL of the unpacked version (if any)

The client uses the first URL with some preferences to get either to the packaged file or directly to the document on the Web. Explicit addressing of, say, the package is there if one wants to copy the file to a USB stick.

We could adopt something like that. The important point is that http://ex.org/ThePublication is the 'canonical' URL for the publication; it is not the same as the identifier, because it binds to a specific place on a specific server (and may change if I make a copy to myself), but it is, sort of, canonical nevertheless. More importantly, this is the URL that is considered to be the 'base' when considering locators within the document.
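
To make the 'base' idea concrete, here is a small sketch of how the state-specific URLs and a locator inside the publication could be derived from the canonical URL. It assumes the ".pack"/".unpack" suffixes from the illustration above and assumes the canonical URL is treated like a directory when resolving relative paths; neither has been agreed, and the function names are made up for this example.

```python
from urllib.parse import urljoin

# Suffixes follow the naming pattern used as an illustration in this thread.
STATE_SUFFIXES = {"packed": ".pack", "unpacked": ".unpack"}


def state_url(canonical_url, state):
    """Return the explicit URL for one state of the publication."""
    return canonical_url.rstrip("/") + STATE_SUFFIXES[state]


def resolve_locator(canonical_url, relative_path):
    """Resolve a path inside the publication against the canonical URL.

    Assumption: the canonical URL acts as a directory-like base. Without
    a trailing slash, urljoin would replace the last path segment
    ("ThePublication") instead of appending to it.
    """
    base = canonical_url if canonical_url.endswith("/") else canonical_url + "/"
    return urljoin(base, relative_path)
```

With these assumptions, resolving "images/MonaLisa.jpg" against http://ex.org/ThePublication gives http://ex.org/ThePublication/images/MonaLisa.jpg, whereas a plain urljoin on the slash-less canonical URL would give http://ex.org/images/MonaLisa.jpg. Whether the canonical URL should carry a trailing slash is one of the details a strawman would have to pin down.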


I think we agree that whatever is returned, it should give *an access* (in the conceptual sense) to a manifest (and we would not have to go into the syntax of the manifest here).

Yes, possibly with some level of indirection.

Correct


The return may be

- The full actual data if there is a packaged form;

OK.

the HTTP return header MAY (SHOULD?) also return a link to a manifest

Are you thinking of using a standard HTTP header or defining a custom one?


Standard HTTP. Formally, we can have something like

Link: <http://url.to.the.manifest>; rel="http://identifies.the.manifest.format"

The details must be clarified, but I believe that is a correct approach standard-wise.
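
For illustration only, a client could pick the manifest URL out of such a header roughly as follows. The relation URI and the manifest location used in the usage note are made-up placeholders, and the parser is a simplification: it expects rel to be the first parameter after the target and does not cope with relation values that contain commas or semicolons.

```python
import re

# Matches '<target>; rel="value"' (quotes around the value are optional here).
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel\s*=\s*"?([^",;]+)"?')


def format_link(target, rel):
    """Build a Link header value pointing at `target` with relation `rel`."""
    return f'<{target}>; rel="{rel}"'


def find_manifest_link(header_value, manifest_rel):
    """Return the first link target whose rel includes `manifest_rel`."""
    for target, rel in _LINK_RE.findall(header_value):
        # A rel value may hold several space-separated relation types.
        if manifest_rel in rel.split():
            return target
    return None
```

For example, format_link("http://ex.org/ThePublication/manifest.json", "http://example.org/rel/pwp-manifest") produces a header value from which find_manifest_link recovers the manifest URL, including when the header lists several links separated by commas.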


but the package MUST contain a manifest in any case (we have to decide which of the manifests has priority).

If the package format allows a client to easily retrieve the manifest, this is probably enough. Why use an HTTP header in addition?
I'd suggest that making the manifest easily and deterministically retrievable from the package becomes a requirement.

I could imagine that I make a local copy of a publication in a package, which includes the author-provided manifest, but I would like to add some additional information valid for my site only (eg, other redirections) that I 'attach' to the publication through the GET without modifying the package. Ie, both can be useful.


- The manifest itself, e.g., in JSON format if that is indeed the syntax we adopt, or an HTML file that includes the manifest, or an HTML file that links to the manifest

yes, for the unpackaged state.

Correct.

Thanks

Ivan


Romain.



Ivan




On the manifest question, I think that the discussion taking place for EPUB about a JSON-based manifest may be useful here, as there is definitely overlap in the organization and structure of that material that we would also want here. And if we could align these two efforts on a single manifest format, it would make it trivial for implementations to author and provide it (no transcoding required). But yes, there would need to be more stuff from PWP’s perspective (such as the optional mapping for external resources).


Leonard

From: Bill Kasdorf <bkasdorf@apexcovantage.com>
Date: Wednesday, January 20, 2016 at 9:19 AM
To: "public-digipub-ig@w3.org" <public-digipub-ig@w3.org>
Subject: [DPUB][Locators]Cancellation and Next Steps
Resent-From: <public-digipub-ig@w3.org>
Resent-Date: Wednesday, January 20, 2016 at 9:20 AM

Hi, folks—

Today's Locators Task Force meeting is cancelled, but our Task is not. ;-)

It has been suggested by several people that focusing on the actual structure of the locator, and getting a strawman proposal written down, is what we need to do now.

There has been some interesting discussion on the list:

https://lists.w3.org/Archives/Public/public-digipub-ig/2015Dec/0163.html (from Daniel Weck)
https://lists.w3.org/Archives/Public/public-digipub-ig/2016Jan/0095.html  (from Ángel González)

Ivan suggests that we need to write down:

- what should a GET return for a locator (something which either is or refers to a manifest in the abstract sense)
- what should a manifest, conceptually, include. At this moment, I see
                - an *identifier*
                - a mapping from absolute URL-s to relative URL-s (where relative means relative to the PWP instance URL)
                - a mapping from relative URL-s to absolute URL-s
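
As a starting point for a strawman, the conceptual content listed above could be sketched as a small data structure. This is only an assumption about shape: the field names, the example identifier, and the JSON serialization are invented for illustration and are not a proposed syntax.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict


@dataclass
class Manifest:
    """Conceptual manifest: an identifier plus the two URL mappings."""

    identifier: str
    # Keys are absolute URLs, values are URLs relative to the PWP instance URL.
    absolute_to_relative: Dict[str, str] = field(default_factory=dict)
    # Keys are relative URLs, values are absolute URLs.
    relative_to_absolute: Dict[str, str] = field(default_factory=dict)

    def add_mapping(self, relative, absolute):
        """Record one resource in both directions so lookups stay consistent."""
        self.relative_to_absolute[relative] = absolute
        self.absolute_to_relative[absolute] = relative

    def to_json(self):
        """Serialize to JSON, assuming JSON ends up being the chosen syntax."""
        return json.dumps(asdict(self), sort_keys=True)
```

For instance, a manifest with the hypothetical identifier "urn:example:ThePublication" could map "images/MonaLisa.jpg" to an absolute URL on another server, and a reader could look the resource up in either direction.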

Could somebody volunteer to draft a strawman proposal that we can use for the basis of discussion going forward?

--Bill

Bill Kasdorf
Vice President, Apex Content Solutions
Apex CoVantage
W: +1 734-904-6252
M: +1 734-904-6252
@BillKasdorf (http://twitter.com/#!/BillKasdorf)
bkasdorf@apexcovantage.com
http://isni.org/isni/0000000116490786
https://orcid.org/0000-0001-7002-4786
www.apexcovantage.com



----
Ivan Herman, W3C
Digital Publishing Lead
Home: http://www.w3.org/People/Ivan/

mobile: +31-641044153
ORCID ID: http://orcid.org/0000-0003-0782-2704








Received on Monday, 25 January 2016 13:01:47 UTC