Re: For the discussion on the PWP from Leonard Rosenthol on 2016-11-23 (public-digipub-ig@w3.org from November 2016)

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Wed, 23 Nov 2016 06:52:02 +0000
To: Hadrien Gardeur <hadrien.gardeur@feedbooks.com>, Ivan Herman <ivan@w3.org>
CC: W3C Digital Publishing IG <public-digipub-ig@w3.org>, Ric Wright <rkwright@geofx.com>, Laurent Le Meur <laurent.lemeur@edrlab.org>
Message-ID: <116D64F6-8F89-44E9-9466-89210EB4E972@adobe.com>
I agree with David.  It is a very important goal, IMO, that resources shall not be changed when putting them into a package.  The contents need to be identical in packaged and unpackaged form.  The “manifest” is responsible for providing to the UA/RS any remapping that is required.   And yes, I agree that this has implementation implications – but that is part of what we will need to weigh as we move towards them.

Following your examples for User A – I agree with the concepts you’ve presented there.  Ivan and I have had some disagreement about such things as whether details of the manifest are public, what a GET on the identifier returns, etc.  But those details aside – yes, all the things you've listed below are true for A.
And for B – your use of hrefsrc is exactly the type of thing that I would envision as well.  HOWEVER, you are only looking at the easy case – the base HTML files.  Now try to do the same thing for an image referenced by c001.html – in that case, you need TWO hrefsrc – one for the canonical URL and one for the relative mapping inside the package. This also raises the really hard (from a security perspective) case of when c001.html (in this instance) uses an absolute URL to refer to something (e.g., our Mona Lisa example at http://www.louvre.com/monalisa.jpg) - but now you need to make this fully self-contained AND not modify the HTML.
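
To make that concrete, here is a rough sketch of the lookup a reading system would need if the HTML is left untouched. It is only an illustration, not a proposal: the PACKAGE_MAP table, the package_path_for name and the package paths (assets/fig1.png, external/monalisa.jpg) are invented for the example. Every reference, relative or absolute, is first resolved to its canonical URL and then mapped to a location inside the package.

```python
from urllib.parse import urljoin

# Hypothetical table: canonical URL -> path inside the package.
# The package-side paths and the fig1.png image are made up for illustration.
PACKAGE_MAP = {
    "https://www.publisher-P/new/Orlando/c001.html": "chapter1.html",
    "https://www.publisher-P/new/Orlando/images/fig1.png": "assets/fig1.png",
    "http://www.louvre.com/monalisa.jpg": "external/monalisa.jpg",
}


def package_path_for(document_url, reference):
    """Map a reference, exactly as written in the HTML, to a package path.

    document_url is the canonical URL of the referring document.
    Returns None when the resource is not part of the package.
    """
    # Relative references are resolved against the canonical document URL;
    # absolute references pass through urljoin unchanged.
    canonical = urljoin(document_url, reference)
    return PACKAGE_MAP.get(canonical)


c001 = "https://www.publisher-P/new/Orlando/c001.html"
print(package_path_for(c001, "images/fig1.png"))
print(package_path_for(c001, "http://www.louvre.com/monalisa.jpg"))
```

The sketch says nothing about the security question; it only shows that the absolute-URL case needs its own entry in whatever mapping the manifest carries.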

I agree that we should look at the Selectors in detail – but I am not (yet) convinced we need to limit them.  But that will come with study.

Leonard

From: Hadrien Gardeur <hadrien.gardeur@feedbooks.com>
Date: Wednesday, November 23, 2016 at 12:30 PM
To: Ivan Herman <ivan@w3.org>
Cc: W3C Digital Publishing IG <public-digipub-ig@w3.org>, Ric Wright <rkwright@geofx.com>, Laurent Le Meur <laurent.lemeur@edrlab.org>
Subject: Re: For the discussion on the PWP
Resent-From: <public-digipub-ig@w3.org>
Resent-Date: Wednesday, November 23, 2016 at 12:31 PM

You mean links in the resources within a WP? My answer would be no. The resources, when put into a package, should be unchanged (with a possible exception for the manifest, maybe).

But if they're unchanged, then we hit the issue that I've described before, where it becomes tricky for a reading system to properly display such a packaged publication.
You either have to dynamically create your own Service Worker, act as a proxy for a webview (same idea as a Service Worker) or rewrite those links dynamically in the RS.

Now, to go back to the example that Dave has provided, let's go into a little more detail:

  *   Publisher P publishes Orlando and uses https://www.publisher-P/new/Orlando/ as the unique identifier for it
  *   The manifest for Orlando is available at https://www.publisher-P/new/Orlando/manifest.json

User A gets access directly to the Web Publication Manifest (I'm using the syntax that I've used until now to illustrate this entirely):

{
  "metadata": {
    "identifier": "https://www.publisher-P/new/Orlando/",
    "title": "Orlando"
  },

  "links": [
    {"rel": "self", "href": "https://www.publisher-P/new/Orlando/manifest.json", "type": "application/webpub+json"}
  ],

  "spine": [
    {"href": "c001.html", "type": "text/html"},
    {"href": "c002.html", "type": "text/html"},
    {"href": "c003.html", "type": "text/html"},
    {"href": "c004.html", "type": "text/html"}
  ]
}

To reference "c001.html" in this specific publication, a locator could either use:

  *   the identifier (https://www.publisher-P/new/Orlando/) + https://www.publisher-P/new/Orlando/c001.html

  *   the canonical link to the manifest (https://www.publisher-P/new/Orlando/manifest.json) + https://www.publisher-P/new/Orlando/c001.html

All three references should remain stable, no matter how you access the publication.
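
As a minimal sketch of the two locator forms above (the locators_for helper and its tuple shape are assumptions made for illustration, not part of any agreed syntax), the relative spine href can be resolved against the manifest URL and paired with either the identifier or the manifest link:

```python
from urllib.parse import urljoin

# Values copied from the example manifest above.
MANIFEST_URL = "https://www.publisher-P/new/Orlando/manifest.json"
MANIFEST = {
    "metadata": {
        "identifier": "https://www.publisher-P/new/Orlando/",
        "title": "Orlando",
    },
    "spine": [
        {"href": "c001.html", "type": "text/html"},
        {"href": "c002.html", "type": "text/html"},
    ],
}


def locators_for(manifest, manifest_url, href):
    """Return the two (publication reference, resource URL) pairs for a spine href."""
    # The relative href is resolved against the canonical manifest location.
    resource_url = urljoin(manifest_url, href)
    identifier = manifest["metadata"]["identifier"]
    return [(identifier, resource_url), (manifest_url, resource_url)]


for publication_ref, resource_url in locators_for(MANIFEST, MANIFEST_URL, "c001.html"):
    print(publication_ref, "+", resource_url)
```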

Now User B gets access to a packaged version of that book. There are many different ways this publication could have been packaged, for instance:

  *   by the publisher, at the same time that the Web Publication itself was published on the Web
  *   by a third party client, that simply accessed the manifest and its resources to create a new package

While I've often read in this group that packaging the publication should not impact the path to the resources, I think that this is completely unrealistic. But frankly, it doesn't really matter as long as the canonical location of each resource is preserved.

For instance, let's say that User B's packaged version of the same publication looks like this:

{
  "metadata": {
    "identifier": "https://www.publisher-P/new/Orlando/",
    "title": "Orlando"
  },

  "links": [
    {"rel": "self", "href": "https://www.publisher-P/new/Orlando/manifest.json", "type": "application/webpub+json"}
  ],

  "spine": [
    {"href": "chapter1.html", "hrefsrc": "https://www.publisher-P/new/Orlando/c001.html", "type": "text/html"},
    {"href": "chapter2.html", "hrefsrc": "https://www.publisher-P/new/Orlando/c002.html", "type": "text/html"},
    {"href": "chapter3.html", "hrefsrc": "https://www.publisher-P/new/Orlando/c003.html", "type": "text/html"},
    {"href": "chapter4.html", "hrefsrc": "https://www.publisher-P/new/Orlando/c004.html", "type": "text/html"}
  ]
}

It doesn't really matter if one packaged version renamed "c001.html" to "chapter1.html" or moved it to a different folder in the package: I can still recover the canonical reference by using "hrefsrc".
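
A rough sketch of that lookup, assuming the packaged manifest shown above ("hrefsrc" is the property from my example; the canonical_url helper is a name made up for illustration):

```python
# Spine entries copied from the packaged manifest above (first two only).
PACKAGED_MANIFEST = {
    "metadata": {
        "identifier": "https://www.publisher-P/new/Orlando/",
        "title": "Orlando",
    },
    "spine": [
        {"href": "chapter1.html",
         "hrefsrc": "https://www.publisher-P/new/Orlando/c001.html",
         "type": "text/html"},
        {"href": "chapter2.html",
         "hrefsrc": "https://www.publisher-P/new/Orlando/c002.html",
         "type": "text/html"},
    ],
}


def canonical_url(manifest, packaged_href):
    """Return the canonical URL recorded in "hrefsrc" for a packaged href.

    Returns None if the href is not in the spine or has no "hrefsrc".
    """
    for item in manifest["spine"]:
        if item["href"] == packaged_href:
            return item.get("hrefsrc")
    return None


print(canonical_url(PACKAGED_MANIFEST, "chapter1.html"))
```

A locator created against the packaged copy can then be stored with the canonical URL, so it stays valid for someone reading the Web version.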


Hadrien – I have been looking towards the Selector model of the Web Annotation specification as a way that we might be able to specify text locations in a reliable fashion (and without re-inventing the wheel).  It provides a number of well-defined (and implemented/tested) models for how to refer to either specific semantic text pieces, arbitrary text or ranges of both/either.  This should hopefully remove the need for fragments – at least in the context of a larger environment.

Leonard, I don't really think it does, but the selector model is a good starting point.
The main issue that I have with it is that there are many different options for such selectors. Some are quite stable but lack precision, others are quite the opposite.
We might not need (or want) to define a brand new fragment identifier, but we'll need to take a close look at the selectors that are available and make a clear decision about the ones that we'll actually use.

Hadrien
Received on Wednesday, 23 November 2016 06:52:37 UTC