Re: [dpub-loc] 20160217 minutes

2016-02-22 14:15 GMT+01:00 Ivan Herman <ivan@w3.org>:

>
> On 21 Feb 2016, at 16:00, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>
> Ivan – what I thought we had agreed on is that there are two types of
> POSSIBLE PWP Processors – Server and Client.  Any specific implementation
> of a PWP can consist of various combinations of the two.
>
>
> I am not sure we really agreed on that, insofar as I believe we
> should avoid Server side processors; that is what I was arguing with you on
> the phone… I agree there may be situations where something like that may
> become necessary, but I regard this as a niche case. The diagram's goal is
> to provide a more typical case where, in my view, a server configuration is
> the maximum we should go for…
>

In that case, isn't a 'well-configured Apache server' an example of a
Server PWP processor?

>
> Ivan
>
>
> It is just as acceptable to have a smart server/dumb client configuration
> as it is to have a dumb server/smart client.
>
> As such, I support Ben’s approach of creating the separation of concepts.
> And having some POSSIBLE implementation options of how the two might work
> together seems like a good idea.  But we need to keep in mind that we are
> not mandating/prescribing a specific implementation – just one set of
> possible ones.
>
> Leonard
>
> From: Ivan Herman <ivan@w3.org>
> Date: Sunday, February 21, 2016 at 4:30 AM
> To: Ben De Meester <ben.demeester@ugent.be>
> Cc: Romain <rdeltour@gmail.com>, Daniel Weck <daniel.weck@gmail.com>,
> Leonard Rosenthol <lrosenth@adobe.com>, W3C Digital Publishing IG <
> public-digipub-ig@w3.org>
> Subject: Re: [dpub-loc] 20160217 minutes
>
> Hi Ben,
>
> thanks for this but… I am not 100% sure this helps.
>
> First of all, I would prefer not to refer to a Server PWP Processor. This
> suggests that there is a need for a very specific Server to be used with
> PWP, which is something we should avoid. There *may* be, for convenience,
> ways to set up a server with standard configuration facilities in, say,
> Apache, but they do not constitute a 'processor' and they are by no means
> required.
>
> What we will have to specify in more detail is, actually, the Client side
> PWP processor. For that purpose, if we want to give some visual
> representation (which I believe is a good idea), a standard flow chart
> seems to be a much better approach. If I find some time, I will try to come
> up with one (but you can of course try to beat me to it :-)
>
> Cheers
>
> Ivan
>
>
>
> On 18 Feb 2016, at 18:08, Ben De Meester <ben.demeester@ugent.be> wrote:
>
> Hi all,
>
> I think we are actually all in massive agreement, but it's just a matter
> of having a minimal conforming system vs enhancements.
> In http://w3c.github.io/dpub-pwp-loc/drafts/minimal-server.seq.violet.html,
> I tried to draw a flow chart of what would happen if we had the most
> minimally configured server (i.e., a file server).
> In http://w3c.github.io/dpub-pwp-loc/drafts/conneg.seq.violet.html, I
> tried to show what would happen if the server allowed conneg: there would
> be one request less to the server, so it would be more efficient, but the
> first example does not exclude the other or vice versa.
> Other improvements are possible as well; an entire spectrum of complex
> client vs complex server can be researched.
>
> Also, I added on the figures the definition that M is a combination of
> Mmanifest and Mlinkset.
>
> It would be great if we could agree on *something like*
> http://w3c.github.io/dpub-pwp-loc/drafts/minimal-server.seq.violet.html
> as a baseline (and of course specify the details better), and allow for
> (and describe) improvements where possible.
>
> Does this look like a good way to move forward?
>
> Greetings,
> Ben
>
> Ben De Meester
> Researcher Semantic Web
> Ghent University - iMinds - Data Science Lab | Faculty of Engineering and
> Architecture | Department of Electronics and Information Systems
> Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium
> t: +32 9 331 49 59 | e: ben.demeester@ugent.be | URL:
> http://users.ugent.be/~bjdmeest/
>
> 2016-02-18 17:59 GMT+01:00 Ivan Herman <ivan@w3.org>:
>
>>
>> > On 18 Feb 2016, at 16:40, Romain <rdeltour@gmail.com> wrote:
>> >
>> >
>> >> On 18 Feb 2016, at 15:34, Ivan Herman <ivan@w3.org> wrote:
>> >>
>> >> Daniel,
>> >>
>> >> to be honest, I am not sure what you are arguing for or against…
>> >>
>> >> - The fact that the unpacked and packed versions would/should reflect,
>> conceptually, the same file hierarchy: I do not have any problem with that.
>> Although we could imagine having some sort of a 'mapping table' in the PWP
>> manifest to convert among URLs from one state or the other, I do not think
>> that is really all that useful. However, I do not think anything in the
>> current writeups contradicts this; in fact, I believe this issue is pretty
>> much orthogonal to the choice of the Lu, L, Lp, and the relationships among
>> them.
>> >
>> > Right.
>> >
>> >>
>> >> - I did not say that 'content negotiation is the lowest common
>> denominator'. It is one of the possible approaches. I happen to think it is
>> useful and good to have it, others have a different view; that is fine. The
>> only thing in the text is: "The answer to HTTP Get
>> http://book.org/published-books/1 must make M available to the PWP
>> Processor".
>> >
>> > I think we have a consensus on this statement, which is a good start :)
>> >
>> > Also, I don't think that Lp and Lu are part of M (correct?), so do we
>> agree about extending the statement to :
>> >
>> >  "The answer to HTTP Get http://book.org/published-books/1 must make
>> M, Lp, and Lu available to the PWP Processor".
>>
>> Essentially yes, although my formulation would be slightly different.
>> This was a detail that Leonard and I discussed; the way I would prefer to
>> formulate it is in [1], essentially saying that M is a conceptual entity that
>> does include the L-s and the PWP processor combines the various sources of
>> information to glean everything it contains (including the Lp and Lu
>> values). I.e., in practice, the processor may receive part of the information
>> from the manifest file in the packaged version, and some through the LINK
>> header.
>>
>> I have not yet changed the text accordingly.
>>
>> [1]
>> https://lists.w3.org/Archives/Public/public-digipub-ig/2016Feb/0093.html
>>
>>
>> >
>> >
>> >> The way to honour that commitment may include several approaches
>> which, if we were writing a standard, would be the only normative
>> statements and are listed (for the time being, there may be more) in the
>> four bullet items as alternatives:
>> >>
>> >>      • M itself (e.g., a JSON file, an RDFa+HTML file, etc., whatever
>> is specified for the exact format and media type of M at some point); or
>> >>      • a package in some predefined PWP format that must include M; or
>> >>      • an HTML, SVG, or other resource, representing, e.g., the cover
>> page of the publication, with M referred to in the Link header of the HTTP
>> Response; or
>> >>      • an (X)HTML file containing the <link> element referring to M
>> >
>> > OK.
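
A rough sketch of how a client-side PWP processor might tell the four alternatives above apart after a GET on the canonical locator. This is only an illustration, not anything the thread has agreed on: the media types in `MANIFEST_TYPES` and `PACKAGE_TYPES` and the link relation `pwp-manifest` are hypothetical placeholders, since the format of M and the exact use of the Link header are still unspecified. The Link header parsing is deliberately simplistic.

```python
import re
from html.parser import HTMLParser

# Hypothetical placeholders: none of these values is defined in the thread.
MANIFEST_TYPES = {"application/json"}
PACKAGE_TYPES = {"application/pwp+zip"}
MANIFEST_REL = "pwp-manifest"


class _LinkFinder(HTMLParser):
    """Collects href values of <link> elements that carry a given rel."""

    def __init__(self, rel):
        super().__init__()
        self.rel = rel
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").split()
        if self.rel in rels and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def manifest_from_link_header(value, rel=MANIFEST_REL):
    """Return the first Link header target carrying `rel`, or None.

    Naive parsing: it does not handle commas inside quoted parameters.
    """
    for target, params in re.findall(r'<([^>]*)>([^,]*)', value):
        match = re.search(r'rel\s*=\s*"?([^";]*)"?', params)
        if match and rel in match.group(1).split():
            return target
    return None


def locate_manifest(content_type, headers, body):
    """Return (strategy, detail) saying where M can be found."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in MANIFEST_TYPES:
        return ("manifest", None)        # the response body is M itself
    if media_type in PACKAGE_TYPES:
        return ("package", None)         # M has to be read from the package
    linked = manifest_from_link_header(headers.get("Link", ""))
    if linked:
        return ("link-header", linked)   # M is referred to in the Link header
    if media_type in {"text/html", "application/xhtml+xml"}:
        finder = _LinkFinder(MANIFEST_REL)
        finder.feed(body)
        if finder.hrefs:
            return ("link-element", finder.hrefs[0])
    return ("unknown", None)
```

The point of the sketch is that all the branching sits in the client, so the server is free to pick any one of the alternatives.
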
>> >
>> >>
>> >> Nothing here prescribes a specific server setup. Again, in standard
>> specification parlance, all the various server setup possibilities are
>> informative and not normative.
>> >
>> > I'm not sure I agree. IMO the mere consensual statement above (whilst
>> important) is not enough; at some point we'll need to be more precise than
>> that.
>> > Well, this depends on the scope/objectives of the TF…
>>
>> But I certainly believe that we should not (even if we are normative)
>> require one and only one possible server setup. I would _not_ require the
>> use of content negotiation as the only mechanism, but I would equally _not_
>> require a mechanism that makes content negotiation impossible or unused.
>> There should be several scenarios the server maintainers could choose from.
>> Whether such a list should be standard, whether such a list should be
>> exhaustive; I do not know. My gut feeling is neither… Because we do not
>> produce anything normative, that is actually for later anyway.
>>
>> Ivan
>>
>> >
>> > Romain.
>> >
>> >>
>> >> Ivan
>> >>
>> >> P.S. I am also not fully sure what you want to show with the github
>> example, I must admit. But it seems to reflect a particular github
>> (server:-) setup. Let me give another example: you can run the following
>> curl-s:
>> >>
>> >> curl --head http://www.w3.org/ns/oa
>> >> curl --head --header "Accept: application/ld+json" http://www.w3.org/ns/oa
>> >> curl --head --header "Accept: text/turtle" http://www.w3.org/ns/oa
>> >>
>> >> these will return the same conceptual content (a vocabulary) in HTML
>> (with the vocabulary in RDFa), in JSON-LD, or in turtle, using the same
>> canonical URL for the vocabulary itself. This requires a different server
>> setup.
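
To make the three curl calls above concrete, here is a minimal sketch of the selection step a negotiating server performs: one canonical URL, several representations, and the Accept header deciding which one is returned. It is an assumption-laden illustration, not a description of how the www.w3.org server is actually configured. The function names are made up, and the matching is simplified (it orders by q-value only and ignores the specificity rules of real HTTP content negotiation).

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so equal q-values keep the order the client sent.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(accept_header, available, default):
    """Pick one media type from `available` for the given Accept header."""
    if not accept_header:
        return default
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]     # e.g. "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
        elif media_type in available:
            return media_type
    return None   # nothing acceptable; a server might answer 406 here


# Representations assumed for the vocabulary example above.
AVAILABLE = ["text/html", "application/ld+json", "text/turtle"]

html_choice = negotiate("*/*", AVAILABLE, "text/html")
jsonld_choice = negotiate("application/ld+json", AVAILABLE, "text/html")
turtle_choice = negotiate("text/turtle", AVAILABLE, "text/html")
```

A server without this step (a plain file server) simply always returns the default, which is why conneg can stay optional.
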
>> >>
>> >>
>> >>
>> >>
>> >>> On 18 Feb 2016, at 14:04, Daniel Weck <daniel.weck@gmail.com> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> here's a concrete example (unrelated to PWP) which I think illustrates
>> >>> the comments made during the concall, regarding content negotiation
>> >>> vs. dereferencing URL endpoints to "meta" data about the publication
>> >>> locators for unpacked / packed states.
>> >>>
>> >>> Let's consider the GitHub HTTP API, the w3c/dpub-pwp-loc GitHub
>> >>> repository, and the README.md file located at the root of the
>> >>> gh-pages branch. There's a "canonical" URL for that (you can safely click on
>> >>> the links below):
>> >>>
>> >>> curl --head https://api.github.com/repos/w3c/dpub-pwp-loc/readme
>> >>> ==> Content-Type: application/json; charset=utf-8
>> >>>
>> >>> curl https://api.github.com/repos/w3c/dpub-pwp-loc/readme
>> >>> ==> "url": "https://api.github.com/repos/w3c/dpub-pwp-loc/contents/README.md?ref=gh-pages"
>> >>>
>> >>> As a consumer of that JSON-based API, I can query the actual payload
>> >>> that I'm interested in:
>> >>> curl https://api.github.com/repos/w3c/dpub-pwp-loc/contents/README.md?ref=gh-pages
>> >>> ==> "content": "BASE64"
>> >>>
>> >>>
>> >>> Now, back to PWP:
>> >>>
>> >>> State-agnostic "canonical" URL:
>> >>> https://domain.com/path/to/book1
>> >>> (note that this could also be a totally different syntax, e.g.
>> >>> https://domain.com/info/?get=book1 or
>> >>> https://domain.com/book1?get=info etc. for as long as a request
>> >>> returns a content-type that a PWP processor / reading-system can
>> >>> consume, e.g. application/json or application/pwp-info+json ... or XML
>> >>> / whatever)
>> >>> A simple request to this URL could return (minimal JSON example, just
>> >>> for illustration purposes):
>> >>> {
>> >>>  "packed": "https://domain.com/path/to/book1.pwp",
>> >>>  "unpacked":
>> >>> "https://domain.com/another/path/to/book1/manifest.json"  /// (or
>> >>> container.xml, or package.opf ... :)
>> >>> }
>> >>>
>> >>> Once again, there is no naming convention / constraint on the "packed"
>> >>> URL https://domain.com/path/to/book1.pwp which could be
>> >>> https://domain.com/download/book1 or
>> >>> https://download.domain.com/?get=book1 , as long as a request returns
>> >>> a payload with content-type application/pwp+zip (for example). Note
>> >>> that the book1.pwp archive in my example would contain the "main entry
>> >>> point" manifest.json (thus why I made a parallel above with EPUB
>> >>> container.xml or package.opf)
>> >>>
>> >>> The "unpacked" URL path
>> >>> https://domain.com/another/path/to/book1/manifest.json does not have
>> >>> to represent the actual file structure on the server, but it's a
>> >>> useful syntactical convention because other resource files in the PWP
>> >>> would probably have similarly-rooted relative locator paths (against a
>> >>> given base href), e.g.:
>> >>> https://domain.com/another/path/to/book1/index.html
>> >>> https://domain.com/another/path/to/book1/images/logo.png
>> >>> In other words, if the packed book1.pwp contains index.html with <img
>> >>> src="./images/logo.png" />, it does make sense for the online unpacked
>> >>> state to use the same path references (as per the example URLs above).
>> >>> Publishers may have the option to route URLs any way they like, e.g.
>> >>> <img src="?get_image=logo.png" />, but we know there is the issue of
>> >>> mapping document URLs in packed/unpacked states with some canonical
>> >>> locator, so that annotation targets can be referenced and resolved
>> >>> consistently. So it would greatly help if the file structure inside
>> >>> the packed book1.pwp was replicated exactly in the URL patterns used
>> >>> for deploying the unpacked state.
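
The mapping argument above can be sketched in a few lines, assuming the illustrative JSON info document from this message (with its inline comment removed so that it parses) and assuming the unpacked state mirrors the file structure inside the package. The function names are hypothetical; only `json` and `urllib.parse.urljoin` from the Python standard library are used.

```python
import json
from urllib.parse import urljoin

# The illustrative info document from the message, as valid JSON.
INFO_DOCUMENT = """
{
  "packed": "https://domain.com/path/to/book1.pwp",
  "unpacked": "https://domain.com/another/path/to/book1/manifest.json"
}
"""


def unpacked_url(info, path_in_package):
    """Map a path inside the package to its URL in the unpacked state."""
    return urljoin(info["unpacked"], path_in_package)


def package_path(info, url):
    """Inverse mapping: an unpacked URL back to a package-relative path."""
    base = urljoin(info["unpacked"], ".")   # directory holding the manifest
    if not url.startswith(base):
        return None                         # not part of this publication
    return url[len(base):]


info = json.loads(INFO_DOCUMENT)
logo_url = unpacked_url(info, "./images/logo.png")
logo_path = package_path(info, logo_url)
```

If the unpacked URLs did not replicate the package structure, both functions would need an explicit mapping table instead of plain relative URL resolution.
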
>> >>>
>> >>> To conclude, I am probably missing something (Ivan and Leonard, you
>> >>> guys are ahead of the curve compared to me), but I hope I managed to
>> >>> convey useful arguments. Personally, as a developer involved in
>> >>> reading-system implementations, and as someone who would like to
>> >>> continue deploying content with minimal server-side requirements, I am
>> >>> not yet convinced that content negotiation is needed here. As an
>> >>> optional feature, sure, but not as the lowest common denominator.
>> >>>
>> >>> Thanks for listening :)
>> >>> Regards, Dan
>> >>>
>> >>>
>> >>>
>> >>> On Thu, Feb 18, 2016 at 12:04 PM, Ivan Herman <ivan@w3.org> wrote:
>> >>>> With the caveat that the minutes are always difficult to read
>> (Romain, that
>> >>>> is not your fault, it is the case for most of the minutes; I know
>> only a few
>> >>>> people who write perfect minutes, and I am certainly not among them)
>> maybe
>> >>>> some comments on my side. More about this next time we can all talk
>> >>>> (although it seems that this will only be in two weeks, due to the
>> Baltimore
>> >>>> EDUPUB meeting).
>> >>>>
>> >>>> First of all, this comment:
>> >>>>
>> >>>> [[[
>> >>>> rom: my issue is that the spec doesn't say "if Lu exists then L must
>> be Lu",
>> >>>> I think we should consider it
>> >>>> ]]]
>> >>>>
>> >>>> I do not see why we should say anything like that. It is of course
>> correct
>> >>>> that, in many cases, it makes a lot of sense to have Lu=L. But I do
>> not see
>> >>>> why we should restrict it this way. In general, the approach I tried
>> to
>> >>>> follow in my writeup is to be as permissive as possible and put the
>> minimum
>> >>>> possible hard requirements on the locator setup. It is probably
>> worth adding
>> >>>> a note in the text (or the more final text) that Lu may be equal to
>> L (in
>> >>>> fact, this may very well be a widely used approach) but I would not
>> want to
>> >>>> go beyond that.
>> >>>>
>> >>>> Then there is the whole issue about content negotiations… It seems
>> that we
>> >>>> have a disagreement on the value and usage of content negotiations.
>> I do not
>> >>>> agree with Daniel's statement that "in a RESTful API the URL would
>> >>>> consistently return the same content type". It is certainly not the
>> >>>> practice, nor should it be. Content negotiation is widely used when
>> tools
>> >>>> want to retrieve, for example the best syntax that encodes a
>> particular
>> >>>> information (typical example is in RDF land, where tools may or may
>> not have
>> >>>> parsers for a particular RDF serialization), this is how dbpedia is
>> set up
>> >>>> etc. (I did tell you about the way RDF namespace documents are set
>> up on our
>> >>>> site, for example. It is pretty much general practice to do that.) I
>> must
>> >>>> admit I also do not agree with Daniel's remark on "content
>> negotiation based
>> >>>> on (sophisticated) HTTP headers sounds counter intuitive". Content
>> >>>> negotiation is certainly very intuitive to me...
>> >>>>
>> >>>> All that being said, and that is where maybe there is actually a
>> minor
>> >>>> disagreement between Leonard and me: I do not say that content
>> negotiation is
>> >>>> the only approach to set up a server storage. The text I wrote is
>> >>>> deliberately open ended insofar as it described what the client
>> expectation
>> >>>> is when that GET request is issued in general terms, and the choice
>> among
>> >>>> the various alternatives are all the server's. The list of possible
>> server
>> >>>> behaviours in the text are possible alternatives, instead of hard
>> >>>> requirements. The client is responsible for following the various
>> possible
>> >>>> paths and, maybe, we will have to describe those possibilities later
>> in more
>> >>>> detail (precise usage of the LINK header, the <link> element, media
>> types,
>> >>>> etc), but that gives the liberty to set up the server the way the
>> publisher
>> >>>> wants. If we accept this approach, ie, that the client has some
>> complexity
>> >>>> to resolve in favour of a variety of possible server setups, then I
>> do not
>> >>>> think there is a major disagreement among us.
>> >>>>
>> >>>> Talk to you guys later…
>> >>>>
>> >>>> Ivan
>> >>>>
>> >>>> B.t.w., a more general and slightly philosophical comment: we should
>> not be
>> >>>> afraid of really using HTTP:-) The various header information in
>> both the
>> >>>> request and response headers of an HTTP request/response are very
>> rich and
>> >>>> sophisticated. There are many situations, on expiration dates, on
>> security,
>> >>>> etc, and of course content negotiations that can be expressed via
>> these HTTP
>> >>>> headers, and we should not shy away from using those whenever we can and
>> it makes
>> >>>> sense. As I showed in one of my mails it is not that complex to set
>> up
>> >>>> (actually, and to be fair, setting up content negotiations is
>> probably the
>> >>>> more complex thing, I accept that).
>> >>>>
>> >>>> If you are interested in the various possibilities, this site may be
>> of
>> >>>> interest:
>> >>>>
>> >>>> https://github.com/dret/sedola/blob/master/MD/headers.md
>> >>>>
>> >>>>
>> >>>>
>> >>>> On 18 Feb 2016, at 09:24, Romain <rdeltour@gmail.com> wrote:
>> >>>>
>> >>>>
>> >>>> On 18 Feb 2016, at 02:49, Leonard Rosenthol <lrosenth@adobe.com>
>> wrote:
>> >>>>
>> >>>> Actually, the big issue that I took away from the minutes is that
>> Ivan and I
>> >>>> are in agreement that content negotiation (via standard web
>> technique incl.
>> >>>> the Accept header) is the proper way for the client & server to
>> decide what
>> >>>> to return on the GET from the canonical locator.   Daniel, however,
>> appears
>> >>>> (from the minutes) to be promoting a completely different approach.
>> >>>>
>> >>>>
>> >>>> As stated before [1], I am absolutely not convinced that content
>> negotiation
>> >>>> is a good approach.
>> >>>> I want to upload a PWP tomorrow to a static file hosting service; if
>> conneg
>> >>>> is required I can't do that.
>> >>>>
>> >>>> More to the point: how do you GET the (manifest + Lu + Lp) info with
>> the
>> >>>> conneg solution? Maybe I just miss something.
>> >>>>
>> >>>> Finally, may I turn the question the other way around: what are the
>> benefits
>> >>>> of content negotiation for the canonical locator? (compared to an
>> >>>> alternative approach with explicit links in the GET answer (headers
>> or
>> >>>> payload)).
>> >>>>
>> >>>> Thanks,
>> >>>> Romain.
>> >>>>
>> >>>> [1]
>> https://lists.w3.org/Archives/Public/public-digipub-ig/2016Jan/0136.html
>> >>>>
>> >>>>
>> >>>> Daniel, if you can explain why you want to do something different
>> from the
>> >>>> standard web/REST model, I’d like to understand.
>> >>>>
>> >>>> Leonard
>> >>>>
>> >>>> From: Romain <rdeltour@gmail.com>
>> >>>> Date: Wednesday, February 17, 2016 at 6:26 PM
>> >>>> To: Daniel Weck <daniel.weck@gmail.com>, Leonard Rosenthol
>> >>>> <lrosenth@adobe.com>
>> >>>> Cc: "DPUB mailing list (public-digipub-ig@w3.org)"
>> >>>> <public-digipub-ig@w3.org>, Tzviya Siegman <tsiegman@wiley.com>
>> >>>> Subject: Re: [dpub-loc] 20160217 minutes
>> >>>>
>> >>>> On 17 Feb 2016, at 23:12, Daniel Weck <daniel.weck@gmail.com> wrote:
>> >>>>
>> >>>> Hi Leonard, that's quite a bold statement, but I suspect the minutes
>> could
>> >>>> do with a few corrections.
>> >>>>
>> >>>> My bad if the minutes are inaccurate, please feel free to amend. It
>> was a
>> >>>> bit frustrating too: several times I wanted to talk or clarify a
>> point but
>> >>>> was busy typing.
>> >>>>
>> >>>> At any rate, I look forward to the recap from you and Ivan at the
>> next
>> >>>> opportunity. PS: it was a small quorum on this concall, but I was
>> under the
>> >>>> impression that the participants agreed on the broad lines of your
>> proposal,
>> >>>> with only details to clarify.
>> >>>>
>> >>>> My impression is that participants generally agreed with the
>> presentation of
>> >>>> the issues and some principles. I believe that the main point that
>> is still
>> >>>> controversial is really what should be the answer to a GET on the
>> canonical
>> >>>> locator.
>> >>>>
>> >>>>> I think we need to go do this over again next week – which is
>> extremely
>> >>>>> unfortunate.
>> >>>>
>> >>>>
>> >>>> If I'm not mistaken Matt, Markus, Tzviya and I won't be able to
>> attend
>> >>>> (EDUPUB summit).
>> >>>>
>> >>>> Romain.
>> >>>>
>> >>>> Regards, Daniel
>> >>>>
>> >>>> On 17 Feb 2016 9:17 p.m., "Leonard Rosenthol" <lrosenth@adobe.com>
>> wrote:
>> >>>>>
>> >>>>> Sorry that I was unable to attend today, especially since the
>> discussion
>> >>>>> (based on the minutes) seems to completely undo all the work that
>> Ivan,
>> >>>>> myself and others did on the mailing list during the past week.
>>  The
>> >>>>> position presented by Daniel is the exact opposite of what Ivan’s
>> musings
>> >>>>> (adjusted based on mail conversations) presented.
>> >>>>>
>> >>>>> I think we need to go do this over again next week – which is
>> extremely
>> >>>>> unfortunate.
>> >>>>>
>> >>>>> Leonard
>> >>>>>
>> >>>>> From: "Siegman, Tzviya - Hoboken" <tsiegman@wiley.com>
>> >>>>> Date: Wednesday, February 17, 2016 at 11:46 AM
>> >>>>> To: "DPUB mailing list (public-digipub-ig@w3.org)"
>> >>>>> <public-digipub-ig@w3.org>
>> >>>>> Subject: [dpub-loc] 20160217 minutes
>> >>>>> Resent-From: <public-digipub-ig@w3.org>
>> >>>>> Resent-Date: Wednesday, February 17, 2016 at 11:48 AM
>> >>>>>
>> >>>>> Minutes from today’s meeting:
>> >>>>> https://www.w3.org/2016/02/17-dpub-loc-minutes.html
>> >>>>>
>> >>>>> Tzviya Siegman
>> >>>>> Digital Book Standards & Capabilities Lead
>> >>>>> Wiley
>> >>>>> 201-748-6884
>> >>>>> tsiegman@wiley.com
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> ----
>> >>>> Ivan Herman, W3C
>> >>>> Digital Publishing Lead
>> >>>> Home: http://www.w3.org/People/Ivan/
>> >>>> mobile: +31-641044153
>> >>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
>> >>>>
>> >>>>
>> >>>>
>> >>>>

Received on Monday, 22 February 2016 13:50:58 UTC