Re: [dpub-loc] 20160217 minutes

From: Romain <rdeltour@gmail.com>
Date: Thu, 18 Feb 2016 18:23:43 +0100
Cc: Daniel Weck <daniel.weck@gmail.com>, Leonard Rosenthol <lrosenth@adobe.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
Message-Id: <6B0D15D4-96B4-4B49-8975-96AF37F988AD@gmail.com>
To: Ivan Herman <ivan@w3.org>

> On 18 Feb 2016, at 17:59, Ivan Herman <ivan@w3.org> wrote:
>> On 18 Feb 2016, at 16:40, Romain <rdeltour@gmail.com> wrote:
>> Also, I don't think that Lp and Lu are part of M (correct?), so do we agree about extending the statement to:
>> "The answer to HTTP GET http://book.org/published-books/1 must make M, Lp, and Lu available to the PWP Processor".
> Essentially yes, although my formulation would be slightly different. This was a detail that Leonard and I discussed; the formulation I would prefer is in [1], essentially saying that M is a conceptual entity that does include the L-s, and the PWP processor combines the various sources of information to glean everything it contains (including the Lp and Lu values). I.e., in practice, the processor may receive part of the information from the manifest file in the packaged version, and some through the LINK header.
> I have not yet changed the text accordingly.

OK good! I knew I had seen this discussed... but then I couldn't find it in the text which I thought was up-to-date. Thanks for the clarification :)
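
To make the "processor combines the various sources" idea concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions: the rel names "packed" and "unpacked" (standing in for Lp and Lu), the manifest fields, and both function names are hypothetical, and the Link header parsing is deliberately simplified (it expects rel to be the first parameter).

```python
import re

def parse_link_header(value):
    """Parse an HTTP Link header value into a {rel: url} dict (simplified)."""
    links = {}
    # Each entry is expected to look like: <url>; rel="name"
    for match in re.finditer(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?', value):
        url, rel = match.group(1), match.group(2).strip()
        links[rel] = url
    return links

def combine_sources(manifest, link_header):
    """Build the conceptual M: manifest fields plus Lp and Lu.

    Values found in the Link header fill in whatever the manifest lacks.
    """
    combined = dict(manifest)
    links = parse_link_header(link_header or "")
    for key in ("packed", "unpacked"):  # hypothetical names for Lp and Lu
        if key not in combined and key in links:
            combined[key] = links[key]
    return combined

# Hypothetical inputs: Lp comes from the manifest, Lu from the Link header.
manifest = {"title": "Book 1", "packed": "https://domain.com/path/to/book1.pwp"}
header = '<https://domain.com/another/path/to/book1/manifest.json>; rel="unpacked"'
m = combine_sources(manifest, header)
print(m["packed"], m["unpacked"])
```

Whether the manifest or the header should win when both carry a value is exactly the kind of detail that would need to be specified later; the sketch arbitrarily prefers the manifest.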

>  I certainly believe that we should not (even if we are normative) require one and only one possible server setup. I would _not_ require to use content negotiation as the only mechanism, but I would equally _not_ require a mechanism that makes content negotiation impossible or unused. There should be several scenarios the server maintainers could choose from. Whether such a list should be standard, whether such a list should be exhaustive: I do not know. My gut feeling is neither… Because we do not produce anything normative, that is actually for later anyway.

Fair enough. I still believe that eventually we (or a proper WG) should probably specify the precise mechanism, but I agree it can be for later.


> Ivan
>> Romain.
>>> Ivan
>>> P.S. I am also not fully sure what you want to show with the github example, I must admit. But it seems to reflect a particular github (server:-) setup. Let me give another example: you can run the following curl-s:
>>> curl --head http://www.w3.org/ns/oa
>>> curl --head --header "Accept: application/ld+json" http://www.w3.org/ns/oa
>>> curl --head --header "Accept: text/turtle" http://www.w3.org/ns/oa
>>> these will return the same conceptual content (a vocabulary) in HTML (with the vocabulary in RDFa), in JSON-LD, or in turtle, using the same canonical URL for the vocabulary itself. This requires a different server setup.
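
For readers less familiar with the server side of those curl calls, the selection step could look roughly like the Python sketch below. This is an assumption-laden illustration, not how the w3.org server is actually configured: the function name and the OFFERED list are made up, and the matching ignores the finer precedence rules of HTTP (for example, more specific ranges beating wildcards at equal q).

```python
def choose_media_type(accept_header, available):
    """Return the best entry of `available` for an Accept header, or None."""
    candidates = []
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        candidates.append((q, media_type))
    # Highest q first; sorted() is stable, so ties keep header order.
    for q, media_type in sorted(candidates, key=lambda c: -c[0]):
        if q <= 0:
            continue
        for offered in available:
            if media_type in ("*/*", offered) or (
                media_type.endswith("/*")
                and offered.startswith(media_type[:-1])
            ):
                return offered
    return None

# Hypothetical list of representations for one canonical URL.
OFFERED = ["text/html", "application/ld+json", "text/turtle"]
print(choose_media_type("text/turtle", OFFERED))
print(choose_media_type("application/ld+json;q=0.5, text/turtle", OFFERED))
print(choose_media_type("*/*", OFFERED))
```

A static file host cannot run this kind of logic per request, which is the practical concern raised elsewhere in this thread.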
>>>> On 18 Feb 2016, at 14:04, Daniel Weck <daniel.weck@gmail.com> wrote:
>>>> Hello,
>>>> here's a concrete example (unrelated to PWP) which I think illustrates
>>>> the comments made during the concall, regarding content negotiation
>>>> vs. dereferencing URL endpoints to "meta" data about the publication
>>>> locators for unpacked / packed states.
>>>> Let's consider the GitHub HTTP API, the w3c/dpub-pwp-loc GitHub
>>>> repository, and the README.md file located at the root of the
>>>> gh-pages branch. There's a "canonical" URL for that (you can safely click on
>>>> the links below):
>>>> curl --head https://api.github.com/repos/w3c/dpub-pwp-loc/readme
>>>> ==> Content-Type: application/json; charset=utf-8
>>>> curl https://api.github.com/repos/w3c/dpub-pwp-loc/readme
>>>> ==> "url": "https://api.github.com/repos/w3c/dpub-pwp-loc/contents/README.md?ref=gh-pages"
>>>> As a consumer of that JSON-based API, I can query the actual payload
>>>> that I'm interested in:
>>>> curl https://api.github.com/repos/w3c/dpub-pwp-loc/contents/README.md?ref=gh-pages
>>>> ==> "content": "BASE64"
>>>> Now, back to PWP:
>>>> State-agnostic "canonical" URL:
>>>> https://domain.com/path/to/book1
>>>> (note that this could also be a totally different syntax, e.g.
>>>> https://domain.com/info/?get=book1 or
>>>> https://domain.com/book1?get=info etc. for as long as a request
>>>> returns a content-type that a PWP processor / reading-system can
>>>> consume, e.g. application/json or application/pwp-info+json ... or XML
>>>> / whatever)
>>>> A simple request to this URL could return (minimal JSON example, just
>>>> for illustration purposes):
>>>> {
>>>> "packed": "https://domain.com/path/to/book1.pwp",
>>>> "unpacked":
>>>> "https://domain.com/another/path/to/book1/manifest.json"  /// (or
>>>> container.xml, or package.opf ... :)
>>>> }
>>>> Once again, there is no naming convention / constraint on the "packed"
>>>> URL https://domain.com/path/to/book1.pwp which could be
>>>> https://domain.com/download/book1 or
>>>> https://download.domain.com/?get=book1 , as long as a request returns
>>>> a payload with content-type application/pwp+zip (for example). Note
>>>> that the book1.pwp archive in my example would contain the "main entry
>>>> point" manifest.json (thus why I made a parallel above with EPUB
>>>> container.xml or package.opf)
>>>> The "unpacked" URL path
>>>> https://domain.com/another/path/to/book1/manifest.json does not have
>>>> to represent the actual file structure on the server, but it's a
>>>> useful syntactical convention because other resource files in the PWP
>>>> would probably have similarly-rooted relative locator paths (against a
>>>> given base href), e.g.:
>>>> https://domain.com/another/path/to/book1/index.html
>>>> https://domain.com/another/path/to/book1/images/logo.png
>>>> In other words, if the packed book1.pwp contains index.html with <img
>>>> src="./images/logo.png" />, it does make sense for the online unpacked
>>>> state to use the same path references (as per the example URLs above).
>>>> Publishers may have the option to route URLs any way they like, e.g.
>>>> <img src="?get_image=logo.png" />, but we know there is the issue of
>>>> mapping document URLs in packed/unpacked states with some canonical
>>>> locator, so that annotation targets can be referenced and resolved
>>>> consistently. So it would greatly help if the file structure inside
>>>> the packed book1.pwp was replicated exactly in the URL patterns used
>>>> for deploying the unpacked state.
>>>> To conclude, I am probably missing something (Ivan and Leonard, you
>>>> guys are ahead of the curve compared to me), but I hope I managed to
>>>> convey useful arguments. Personally, as a developer involved in
>>>> reading-system implementations, and as someone who would like to
>>>> continue deploying content with minimal server-side requirements, I am
>>>> not yet convinced that content negotiation is needed here. As an
>>>> optional feature, sure, but not as the lowest common denominator.
>>>> Thanks for listening :)
>>>> Regards, Dan
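
The point about replicating the archive's file structure in the unpacked URLs can be sketched in Python as follows. This is a hedged illustration only: the "packed" and "unpacked" keys come from the JSON example above, while the function name and the idea of using the manifest's directory as the base href are assumptions of the sketch.

```python
import json
import posixpath
from urllib.parse import urljoin

def unpacked_url(info, archive_path):
    """Map a path inside the packed archive to its unpacked-state URL."""
    # Treat the directory holding the unpacked manifest as the base href.
    base = urljoin(info["unpacked"], ".")
    # Normalise "./images/logo.png" to "images/logo.png" before joining.
    relative = posixpath.normpath(archive_path).lstrip("/")
    return urljoin(base, relative)

# The minimal JSON answer from the example above.
info = json.loads("""
{
  "packed": "https://domain.com/path/to/book1.pwp",
  "unpacked": "https://domain.com/another/path/to/book1/manifest.json"
}
""")

print(unpacked_url(info, "index.html"))
print(unpacked_url(info, "./images/logo.png"))
```

With identical structure in both states, an annotation target expressed as a path inside the archive resolves to a predictable URL. The sketch does not guard against paths such as "../x" that would escape the book's root; a real processor would need to.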
>>>> On Thu, Feb 18, 2016 at 12:04 PM, Ivan Herman <ivan@w3.org> wrote:
>>>>> With the caveat that the minutes are always difficult to read (Romain, that
>>>>> is not your fault, it is the case for most of the minutes; I know only a few
>>>>> people who write perfect minutes, and I am certainly not among them) maybe
>>>>> some comments on my side. More about this next time we can all talk
>>>>> (although it seems that this will only be in two weeks, due to the Baltimore
>>>>> EDUPUB meeting).
>>>>> First of all, this comment:
>>>>> [[[
>>>>> rom: my issue is that the spec doesn't say "if Lu exists then L must be Lu",
>>>>> I think we should consider it
>>>>> ]]]
>>>>> I do not see why we should say anything like that. It is of course correct
>>>>> that, in many cases, it makes a lot of sense to have Lu=L. But I do not see
>>>>> why we should restrict it this way. In general, the approach I tried to
>>>>> follow in my writeup is to be as permissive as possible and put the minimum
>>>>> possible hard requirements on the locator setup. It is probably worth adding
>>>>> a note in the text (or the more final text) that Lu may be equal to L (in
>>>>> fact, this may very well be a widely used approach) but I would not want to
>>>>> go beyond that.
>>>>> Then there is the whole issue about content negotiations… It seems that we
>>>>> have a disagreement on the value and usage of content negotiations. I do not
>>>>> agree with Daniel's statement that "in a RESTful API the URL would
>>>>> consistently return the same content type". It is certainly not the
>>>>> practice, nor should it be. Content negotiation is widely used when tools
>>>>> want to retrieve, for example the best syntax that encodes a particular
>>>>> information (typical example is in RDF land, where tools may or may not have
>>>>> parsers for a particular RDF serialization), this is how dbpedia is set up
>>>>> etc. (I did tell you about the way RDF namespace documents are set up on our
>>>>> site, for example. It is pretty much general practice to do that.) I must
>>>>> admit I also do not agree with Daniel's remark on "content negotiation based
>>>>> on (sophisticated) HTTP headers sounds counter intuitive". Content
>>>>> negotiations is certainly very intuitive to me...
>>>>> All that being said, and that is where maybe there is actually a minor
>>>>> disagreement between Leonard and me: I do not say that content negotiation is
>>>>> the only approach to set up a server storage. The text I wrote is
>>>>> deliberately open ended insofar as it described what the client expectation
>>>>> is when that GET request is issued in general terms, and the choice among
>>>>> the various alternatives is all the server's. The list of possible server
>>>>> behaviours in the text are possible alternatives, instead of hard
>>>>> requirements. The client is responsible for following the various possible
>>>>> paths and, maybe, we will have to describe those possibilities later in more
>>>>> details (precise usage of the LINK header, the <link> element, media types,
>>>>> etc), but that gives the liberty to set up the server the way the publisher
>>>>> wants. If we accept this approach, ie, that the client has some complexity
>>>>> to resolve in favour of a variety of possible server setups, then I do not
>>>>> think there is a major disagreement among us.
>>>>> Talk to you guys later…
>>>>> Ivan
>>>>> B.t.w., a more general and slightly philosophical comment: we should not be
>>>>> afraid of really using HTTP:-) The various header information in both the
>>>>> request and response headers of an HTTP request/response are very rich and
>>>>> sophisticated. There are many situations, on expiration dates, on security,
>>>>> etc, and of course content negotiations that can be expressed via these HTTP
>>>>> headers, and we should not shy away from using those whenever we can and it makes
>>>>> sense. As I showed in one of my mails it is not that complex to set up
>>>>> (actually, and to be fair, setting up content negotiations is probably the
>>>>> more complex thing, I accept that).
>>>>> If you are interested in the various possibilities, this site may be of
>>>>> interest:
>>>>> https://github.com/dret/sedola/blob/master/MD/headers.md
>>>>> On 18 Feb 2016, at 09:24, Romain <rdeltour@gmail.com> wrote:
>>>>> On 18 Feb 2016, at 02:49, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>>>>> Actually, the big issue that I took away from the minutes is that Ivan and I
>>>>> are in agreement that content negotiation (via standard web technique incl.
>>>>> the Accept header) is the proper way for the client & server to decide what
>>>>> to return on the GET from the canonical locator.   Daniel, however, appears
>>>>> (from the minutes) to be promoting a completely different approach.
>>>>> As stated before [1], I am absolutely not convinced that content negotiation
>>>>> is a good approach.
>>>>> I want to upload a PWP tomorrow to a static file hosting service; if conneg
>>>>> is required I can't do that.
>>>>> More to the point: how do you GET the (manifest + Lu + Lp) info with the
>>>>> conneg solution? Maybe I just miss something.
>>>>> Finally, may I turn the question the other way around: what are the benefits
>>>>> of content negotiation for the canonical locator (compared to an
>>>>> alternative approach with explicit links in the GET answer, in headers or
>>>>> payload)?
>>>>> Thanks,
>>>>> Romain.
>>>>> [1] https://lists.w3.org/Archives/Public/public-digipub-ig/2016Jan/0136.html
>>>>> Daniel, if you can explain why you want to do something different from the
>>>>> standard web/REST model, I’d like to understand.
>>>>> Leonard
>>>>> From: Romain <rdeltour@gmail.com>
>>>>> Date: Wednesday, February 17, 2016 at 6:26 PM
>>>>> To: Daniel Weck <daniel.weck@gmail.com>, Leonard Rosenthol
>>>>> <lrosenth@adobe.com>
>>>>> Cc: "DPUB mailing list (public-digipub-ig@w3.org)"
>>>>> <public-digipub-ig@w3.org>, Tzviya Siegman <tsiegman@wiley.com>
>>>>> Subject: Re: [dpub-loc] 20160217 minutes
>>>>> On 17 Feb 2016, at 23:12, Daniel Weck <daniel.weck@gmail.com> wrote:
>>>>> Hi Leonard, that's quite a bold statement, but I suspect the minutes could
>>>>> do with a few corrections.
>>>>> My bad if the minutes are inaccurate, please feel free to amend. It was a
>>>>> bit frustrating too: several times I wanted to talk or clarify a point but
>>>>> was busy typing.
>>>>> At any rate, I look forward to the recap from you and Ivan at the next
>>>>> opportunity. PS: it was a small quorum on this concall, but I was under the
>>>>> impression that the participants agreed on the broad lines of your proposal,
>>>>> with only details to clarify.
>>>>> My impression is that participants generally agreed with the presentation of
>>>>> the issues and some principles. I believe that the main point that is still
>>>>> controversial is really what should be the answer to a GET on the canonical
>>>>> locator.
>>>>>> I think we need to go do this over again next week – which is extremely
>>>>>> unfortunate.
>>>>> If I'm not mistaken Matt, Markus, Tzviya and I won't be able to attend
>>>>> (EDUPUB summit).
>>>>> Romain.
>>>>> Regards, Daniel
>>>>> On 17 Feb 2016 9:17 p.m., "Leonard Rosenthol" <lrosenth@adobe.com> wrote:
>>>>>> Sorry that I was unable to attend today, especially since the discussion
>>>>>> (based on the minutes) seems to completely undo all the work that Ivan,
>>>>>> myself and others did on the mailing list during the past week.   The
>>>>>> position presented by Daniel is the exact opposite of what Ivan’s musings
>>>>>> (adjusted based on mail conversations) presented.
>>>>>> I think we need to go do this over again next week – which is extremely
>>>>>> unfortunate.
>>>>>> Leonard
>>>>>> From: "Siegman, Tzviya - Hoboken" <tsiegman@wiley.com>
>>>>>> Date: Wednesday, February 17, 2016 at 11:46 AM
>>>>>> To: "DPUB mailing list (public-digipub-ig@w3.org)"
>>>>>> <public-digipub-ig@w3.org>
>>>>>> Subject: [dpub-loc] 20160217 minutes
>>>>>> Resent-From: <public-digipub-ig@w3.org>
>>>>>> Resent-Date: Wednesday, February 17, 2016 at 11:48 AM
>>>>>> Minutes from today’s meeting:
>>>>>> https://www.w3.org/2016/02/17-dpub-loc-minutes.html
>>>>>> Tzviya Siegman
>>>>>> Digital Book Standards & Capabilities Lead
>>>>>> Wiley
>>>>>> 201-748-6884
>>>>>> tsiegman@wiley.com
>>>>> ----
>>>>> Ivan Herman, W3C
>>>>> Digital Publishing Lead
>>>>> Home: http://www.w3.org/People/Ivan/
>>>>> mobile: +31-641044153
>>>>> ORCID ID: http://orcid.org/0000-0003-0782-2704

Received on Thursday, 18 February 2016 17:24:34 UTC
