- From: Ben De Meester <ben.demeester@ugent.be>
- Date: Mon, 22 Feb 2016 14:50:04 +0100
- To: Ivan Herman <ivan@w3.org>
- Cc: Leonard Rosenthol <lrosenth@adobe.com>, Romain <rdeltour@gmail.com>, Daniel Weck <daniel.weck@gmail.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
- Message-ID: <CAJ-O9TuGZs5gH6fC6oAJs=4ocq5PdrGobQiRqYeRCebL0LmMeA@mail.gmail.com>
2016-02-22 14:15 GMT+01:00 Ivan Herman <ivan@w3.org>:

> On 21 Feb 2016, at 16:00, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>
> Ivan – what I thought we had agreed on is that there are two types of POSSIBLE PWP Processors – Server and Client. Any specific implementation of a PWP can consist of various combinations of the two.
>
> I am not sure we really agreed on that, insofar as I believe we should avoid Server side processors; that is what I was arguing with you on the phone… I agree there may be situations where something like that may become necessary, but I regard this as a niche case. The diagram's goal is to provide a more typical case where, in my view, a server configuration is the maximum we should go for…

In that case, isn't a 'well-configured Apache server' an example of a Server PWP processor?

> Ivan
>
> It is just as acceptable to have a smart server/dumb client configuration as it is to have a dumb server/smart client.
>
> As such, I support Ben’s approach of creating the separation of concepts. And having some POSSIBLE implementation options of how the two might work together seems like a good idea. But we need to keep in mind that we are not mandating/prescribing specific implementation – just one set of possible ones.
>
> Leonard
>
> From: Ivan Herman <ivan@w3.org>
> Date: Sunday, February 21, 2016 at 4:30 AM
> To: Ben De Meester <ben.demeester@ugent.be>
> Cc: Romain <rdeltour@gmail.com>, Daniel Weck <daniel.weck@gmail.com>, Leonard Rosenthol <lrosenth@adobe.com>, W3C Digital Publishing IG <public-digipub-ig@w3.org>
> Subject: Re: [dpub-loc] 20160217 minutes
>
> Hi Ben,
>
> thanks for this but… I am not 100% sure this helps.
>
> First of all, I would prefer not to refer to a Server PWP Processor. This suggests that there is a need for a very specific Server to be used with PWP, which is something we should avoid.
> There *may* be, for convenience, ways to set up a server with standard configuration facilities in, say, Apache, but they do not constitute a 'processor' and they are by no means required.
>
> What we will have to specify in more detail is, actually, the Client side PWP processor. For that purpose, if we want to give some visual representation (which I believe is a good idea), a standard flow chart seems to be a much better approach. If I find some time, I will try to come up with one (but you can of course try to beat me to it :-)
>
> Cheers
>
> Ivan
>
> On 18 Feb 2016, at 18:08, Ben De Meester <ben.demeester@ugent.be> wrote:
>
> Hi all,
>
> I think we are actually all in massive agreement, but it's just a matter of having a minimal conforming system vs enhancements.
>
> In http://w3c.github.io/dpub-pwp-loc/drafts/minimal-server.seq.violet.html, I tried to draw a flow chart of what would happen if we had the most minimally configured server (i.e., a file server).
>
> In http://w3c.github.io/dpub-pwp-loc/drafts/conneg.seq.violet.html, I tried to show what would happen if the server allowed conneg: there would be one request less to the server, so it would be more efficient, but the first example does not exclude the other or vice versa.
>
> Other improvements are possible as well, an entire spectrum of complex client vs complex server can be researched.
>
> Also, I added on the figures the definition that M is a combination of Mmanifest and Mlinkset.
>
> It would be great if we could agree on *something like* http://w3c.github.io/dpub-pwp-loc/drafts/minimal-server.seq.violet.html as a baseline (and of course specify the details better), and allow for (and describe) improvements where possible.
>
> Does this look like a good way to move forward?
>
> Greetings,
> Ben
>
> Ben De Meester
> Researcher Semantic Web
> Ghent University - iMinds - Data Science Lab | Faculty of Engineering and Architecture | Department of Electronics and Information Systems
> Sint-Pietersnieuwstraat 41, 9000 Ghent, Belgium
> t: +32 9 331 49 59 | e: ben.demeester@ugent.be | URL: http://users.ugent.be/~bjdmeest/
>
> 2016-02-18 17:59 GMT+01:00 Ivan Herman <ivan@w3.org>:
>
>> > On 18 Feb 2016, at 16:40, Romain <rdeltour@gmail.com> wrote:
>> >
>> >> On 18 Feb 2016, at 15:34, Ivan Herman <ivan@w3.org> wrote:
>> >>
>> >> Daniel,
>> >>
>> >> to be honest, I am not sure what you are arguing for or against…
>> >>
>> >> - The fact that the unpacked and packed versions would/should reflect, conceptually, the same file hierarchy: I do not have any problem with that. Although we could imagine having some sort of a 'mapping table' in the PWP manifest to convert among URLs from one state or the other, I do not think that is really all that useful. However, I do not think anything in the current writeups contradicts this; in fact, I believe this issue is pretty much orthogonal to the choice of the Lu, L, Lp, and the relationships among them.
>> >
>> > Right.
>> >
>> >> - I did not say that 'content negotiation is the lowest common denominator'. It is one of the possible approaches. I happen to think it is useful and good to have it, others have a different view; that is fine. The only thing in the text is: "The answer to HTTP Get http://book.org/published-books/1 must make M available to the PWP Processor".
>> >
>> > I think we have a consensus on this statement, which is a good start :)
>> >
>> > Also, I don't think that Lp and Lu are part of M (correct?), so do we agree about extending the statement to:
>> >
>> > "The answer to HTTP Get http://book.org/published-books/1 must make M, Lp, and Lu available to the PWP Processor".
>>
>> Essentially yes, although my formulation would be slightly different. This was a detail that Leonard and I discussed; the way I would prefer to formulate it is in [1], essentially saying that M is a conceptual entity that does include the L-s and the PWP processor combines the various sources of information to glean everything it contains (including the Lp and Lu values). I.e., in practice, the processor may receive part of the information from the manifest file in the packaged version, and some through the LINK header.
>>
>> I have not yet changed the text accordingly.
>>
>> [1] https://lists.w3.org/Archives/Public/public-digipub-ig/2016Feb/0093.html
>>
>> >> The way to honour that commitment may include several approaches which, if we were writing a standard, would be the only normative statements and are listed (for the time being, there may be more) in the four bullet items as alternatives:
>> >>
>> >> • M itself (e.g., a JSON file, an RDFa+HTML file, etc., whatever is specified for the exact format and media type of M at some point); or
>> >> • a package in some predefined PWP format that must include M; or
>> >> • an HTML, SVG, or other resource, representing, e.g., the cover page of the publication, with M referred to in the Link header of the HTTP Response; or
>> >> • an (X)HTML file containing the <link> element referring to M
>> >
>> > OK.
>> >
>> >> Nothing here prescribes a specific server setup. Again, in standard specification parlance, all the various server setup possibilities are informative and not normative.
>> >
>> > I'm not sure I agree. IMO the mere consensual statement above (whilst important) is not enough; at some point we'll need to be more precise than that.
>> > Well, this depends on the scope/objectives of the TF…
>>
>> But I certainly believe that we should not (even if we are normative) require one and only one possible server setup.
>> I would _not_ require the use of content negotiation as the only mechanism, but I would equally _not_ require a mechanism that makes content negotiation impossible or unused. There should be several scenarios the server maintainers could choose from. Whether such a list should be standard, whether such a list should be exhaustive; I do not know. My gut feeling is neither… Because we do not produce anything normative, that is actually for later anyway.
>>
>> Ivan
>>
>> > Romain.
>> >
>> >> Ivan
>> >>
>> >> P.S. I am also not fully sure what you want to show with the github example, I must admit. But it seems to reflect a particular github (server:-) setup. Let me give another example: you can run the following curl-s:
>> >>
>> >> curl --head http://www.w3.org/ns/oa
>> >> curl --head --header "Accept: application/ld+json" http://www.w3.org/ns/oa
>> >> curl --head --header "Accept: text/turtle" http://www.w3.org/ns/oa
>> >>
>> >> these will return the same conceptual content (a vocabulary) in HTML (with the vocabulary in RDFa), in JSON-LD, or in turtle, using the same canonical URL for the vocabulary itself. This requires a different server setup.
>> >>
>> >>> On 18 Feb 2016, at 14:04, Daniel Weck <daniel.weck@gmail.com> wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> here's a concrete example (unrelated to PWP) which I think illustrates the comments made during the concall, regarding content negotiation vs. dereferencing URL endpoints to "meta" data about the publication locators for unpacked / packed states.
>> >>>
>> >>> Let's consider the GitHub HTTP API, the w3c/dpub-pwp-loc GitHub repository, and the README.md file located at the root of the gh-pages branch.
>> >>> There's a "canonical" URL for that (you can safely click on the links below):
>> >>>
>> >>> curl --head https://api.github.com/repos/w3c/dpub-pwp-loc/readme
>> >>> ==> Content-Type: application/json; charset=utf-8
>> >>>
>> >>> curl https://api.github.com/repos/w3c/dpub-pwp-loc/readme
>> >>> ==> "url": "https://api.github.com/repos/w3c/dpub-pwp-loc/contents/README.md?ref=gh-pages"
>> >>>
>> >>> As a consumer of that JSON-based API, I can query the actual payload that I'm interested in:
>> >>> curl https://api.github.com/repos/w3c/dpub-pwp-loc/contents/README.md?ref=gh-pages
>> >>> ==> "content": "BASE64"
>> >>>
>> >>> Now, back to PWP:
>> >>>
>> >>> State-agnostic "canonical" URL:
>> >>> https://domain.com/path/to/book1
>> >>> (note that this could also be a totally different syntax, e.g. https://domain.com/info/?get=book1 or https://domain.com/book1?get=info etc. for as long as a request returns a content-type that a PWP processor / reading-system can consume, e.g. application/json or application/pwp-info+json ... or XML / whatever)
>> >>> A simple request to this URL could return (minimal JSON example, just for illustration purposes):
>> >>> {
>> >>>   "packed": "https://domain.com/path/to/book1.pwp",
>> >>>   "unpacked": "https://domain.com/another/path/to/book1/manifest.json" /// (or container.xml, or package.opf ... :)
>> >>> }
>> >>>
>> >>> Once again, there is no naming convention / constraint on the "packed" URL https://domain.com/path/to/book1.pwp which could be https://domain.com/download/book1 or https://download.domain.com/?get=book1 , as long as a request returns a payload with content-type application/pwp+zip (for example).
>> >>> Note that the book1.pwp archive in my example would contain the "main entry point" manifest.json (hence the parallel I drew above with EPUB container.xml or package.opf).
>> >>>
>> >>> The "unpacked" URL path https://domain.com/another/path/to/book1/manifest.json does not have to represent the actual file structure on the server, but it's a useful syntactical convention because other resource files in the PWP would probably have similarly-rooted relative locator paths (against a given base href), e.g.:
>> >>> https://domain.com/another/path/to/book1/index.html
>> >>> https://domain.com/another/path/to/book1/images/logo.png
>> >>> In other words, if the packed book1.pwp contains index.html with <img src="./images/logo.png" />, it does make sense for the online unpacked state to use the same path references (as per the example URLs above). Publishers may have the option to route URLs any way they like, e.g. <img src="?get_image=logo.png" />, but we know there is the issue of mapping document URLs in packed/unpacked states with some canonical locator, so that annotation targets can be referenced and resolved consistently. So it would greatly help if the file structure inside the packed book1.pwp was replicated exactly in the URL patterns used for deploying the unpacked state.
>> >>>
>> >>> To conclude, I am probably missing something (Ivan and Leonard, you guys are ahead of the curve compared to me), but I hope I managed to convey useful arguments. Personally, as a developer involved in reading-system implementations, and as someone who would like to continue deploying content with minimal server-side requirements, I am not yet convinced that content negotiation is needed here. As an optional feature, sure, but not as the lowest common denominator.
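
The lookup Daniel describes in the message above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the "packed" and "unpacked" field names and the domain.com URLs come from the example in the mail, the function names `parse_locator_info` and `resolve_unpacked` are hypothetical, and the payload is a hard-coded string rather than a real HTTP response.

```python
import json
from urllib.parse import urljoin

# Hypothetical info document, modelled on the JSON example quoted in the thread.
INFO_DOCUMENT = """
{
  "packed": "https://domain.com/path/to/book1.pwp",
  "unpacked": "https://domain.com/another/path/to/book1/manifest.json"
}
"""


def parse_locator_info(payload):
    """Parse the info document and check that both locators are present."""
    info = json.loads(payload)
    missing = {"packed", "unpacked"} - set(info)
    if missing:
        raise ValueError(f"info document lacks: {sorted(missing)}")
    return info


def resolve_unpacked(info, relative_path):
    """Resolve a path used inside the package against the unpacked manifest URL."""
    return urljoin(info["unpacked"], relative_path)


info = parse_locator_info(INFO_DOCUMENT)
# The same relative reference that index.html uses inside book1.pwp
logo_url = resolve_unpacked(info, "./images/logo.png")
print(logo_url)
```

Because the relative reference is resolved against the manifest URL, the same `./images/logo.png` works in both states only if the unpacked URL layout mirrors the file structure inside the package, which is the point argued in the mail.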
>> >>>
>> >>> Thanks for listening :)
>> >>> Regards, Dan
>> >>>
>> >>> On Thu, Feb 18, 2016 at 12:04 PM, Ivan Herman <ivan@w3.org> wrote:
>> >>>> With the caveat that the minutes are always difficult to read (Romain, that is not your fault, it is the case for most of the minutes; I know only a few people who write perfect minutes, and I am certainly not among them) maybe some comments on my side. More about this next time we can all talk (although it seems that this will only be in two weeks, due to the Baltimore EDUPUB meeting).
>> >>>>
>> >>>> First of all, this comment:
>> >>>>
>> >>>> [[[
>> >>>> rom: my issue is that the spec doesn't say "if Lu exists then L must be Lu", I think we should consider it
>> >>>> ]]]
>> >>>>
>> >>>> I do not see why we should say anything like that. It is of course correct that, in many cases, it makes a lot of sense to have Lu=L. But I do not see why we should restrict it this way. In general, the approach I tried to follow in my writeup is to be as permissive as possible and put the minimum possible hard requirements on the locator setup. It is probably worth adding a note in the text (or the more final text) that Lu may be equal to L (in fact, this may very well be a widely used approach) but I would not want to go beyond that.
>> >>>>
>> >>>> Then there is the whole issue about content negotiations… It seems that we have a disagreement on the value and usage of content negotiations. I do not agree with Daniel's statement that "in a RESTful API the URL would consistently return the same content type". It is certainly not the practice, nor should it be.
>> >>>> Content negotiation is widely used when tools want to retrieve, for example, the best syntax that encodes a particular piece of information (a typical example is in RDF land, where tools may or may not have parsers for a particular RDF serialization); this is how dbpedia is set up, etc. (I did tell you about the way RDF namespace documents are set up on our site, for example. It is pretty much general practice to do that.) I must admit I also do not agree with Daniel's remark that "content negotiation based on (sophisticated) HTTP headers sounds counter intuitive". Content negotiation is certainly very intuitive to me...
>> >>>>
>> >>>> All that being said, and that is where maybe there is actually a minor disagreement between Leonard and me: I do not say that content negotiation is the only approach to set up a server storage. The text I wrote is deliberately open ended insofar as it describes, in general terms, what the client expectation is when that GET request is issued, and the choice among the various alternatives is all the server's. The list of possible server behaviours in the text are possible alternatives, instead of hard requirements. The client is responsible for following the various possible paths and, maybe, we will have to describe those possibilities later in more detail (precise usage of the LINK header, the <link> element, media types, etc), but that gives the liberty to set up the server the way the publisher wants. If we accept this approach, i.e., that the client has some complexity to resolve in favour of a variety of possible server setups, then I do not think there is a major disagreement among us.
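
The client-side complexity Ivan describes above, where the client follows whichever path the server chose, might look something like the Python sketch below. It is not taken from any draft: the media type for M (`application/json`), the link relation name (`manifest`), and the function name `locate_manifest` are all assumptions made for illustration, and `application/pwp+zip` is only the example type Daniel mentions elsewhere in the thread.

```python
import re
from html.parser import HTMLParser

# Assumptions for illustration only; the thread fixes none of these values.
MANIFEST_TYPES = {"application/json"}    # hypothetical media type for M
PACKAGE_TYPES = {"application/pwp+zip"}  # example package type from the thread
MANIFEST_REL = "manifest"                # hypothetical link relation name

# Matches one entry of an HTTP Link header: <target>; rel="name"
LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


class _LinkFinder(HTMLParser):
    """Collects href values of <link> elements carrying the manifest relation."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == MANIFEST_REL and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def locate_manifest(content_type, headers, body):
    """Classify a GET response into one of the four alternatives for finding M."""
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in MANIFEST_TYPES:
        return ("manifest", None)        # the response body is M itself
    if media_type in PACKAGE_TYPES:
        return ("package", None)         # M must be read from inside the package
    for target, rel in LINK_RE.findall(headers.get("Link", "")):
        if rel.strip() == MANIFEST_REL:
            return ("link-header", target)
    if media_type in {"text/html", "application/xhtml+xml"}:
        finder = _LinkFinder()
        finder.feed(body)
        if finder.hrefs:
            return ("link-element", finder.hrefs[0])
    return ("unknown", None)
```

A static file server would typically end up in the `link-element` or `manifest` branch, while a server that sets headers or negotiates content could use the others; the client code is the same in every case.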
>> >>>>
>> >>>> Talk to you guys later…
>> >>>>
>> >>>> Ivan
>> >>>>
>> >>>> B.t.w., a more general and slightly philosophical comment: we should not be afraid of really using HTTP :-) The various header information in both the request and response headers of an HTTP request/response are very rich and sophisticated. There are many situations, on expiration dates, on security, etc, and of course content negotiations that can be expressed via these HTTP headers, and we should not shy away from using those whenever we can and it makes sense. As I showed in one of my mails it is not that complex to set up (actually, and to be fair, setting up content negotiations is probably the more complex thing, I accept that).
>> >>>>
>> >>>> If you are interested in the various possibilities, this site may be of interest:
>> >>>>
>> >>>> https://github.com/dret/sedola/blob/master/MD/headers.md
>> >>>>
>> >>>> On 18 Feb 2016, at 09:24, Romain <rdeltour@gmail.com> wrote:
>> >>>>
>> >>>> On 18 Feb 2016, at 02:49, Leonard Rosenthol <lrosenth@adobe.com> wrote:
>> >>>>
>> >>>> Actually, the big issue that I took away from the minutes is that Ivan and I are in agreement that content negotiation (via standard web technique incl. the Accept header) is the proper way for the client & server to decide what to return on the GET from the canonical locator. Daniel, however, appears (from the minutes) to be promoting a completely different approach.
>> >>>>
>> >>>> As stated before [1], I am absolutely not convinced that content negotiation is a good approach.
>> >>>> I want to upload a PWP tomorrow to a static file hosting service; if conneg is required I can't do that.
>> >>>>
>> >>>> More to the point: how do you GET the (manifest + Lu + Lp) info with the conneg solution?
>> >>>> Maybe I am just missing something.
>> >>>>
>> >>>> Finally, may I turn the question the other way around: what are the benefits of content negotiation for the canonical locator, compared to an alternative approach with explicit links in the GET answer (headers or payload)?
>> >>>>
>> >>>> Thanks,
>> >>>> Romain.
>> >>>>
>> >>>> [1] https://lists.w3.org/Archives/Public/public-digipub-ig/2016Jan/0136.html
>> >>>>
>> >>>> Daniel, if you can explain why you want to do something different from the standard web/REST model, I’d like to understand.
>> >>>>
>> >>>> Leonard
>> >>>>
>> >>>> From: Romain <rdeltour@gmail.com>
>> >>>> Date: Wednesday, February 17, 2016 at 6:26 PM
>> >>>> To: Daniel Weck <daniel.weck@gmail.com>, Leonard Rosenthol <lrosenth@adobe.com>
>> >>>> Cc: "DPUB mailing list (public-digipub-ig@w3.org)" <public-digipub-ig@w3.org>, Tzviya Siegman <tsiegman@wiley.com>
>> >>>> Subject: Re: [dpub-loc] 20160217 minutes
>> >>>>
>> >>>> On 17 Feb 2016, at 23:12, Daniel Weck <daniel.weck@gmail.com> wrote:
>> >>>>
>> >>>> Hi Leonard, that's quite a bold statement, but I suspect the minutes could do with a few corrections.
>> >>>>
>> >>>> My bad if the minutes are inaccurate, please feel free to amend. It was a bit frustrating too: several times I wanted to talk or clarify a point but was busy typing.
>> >>>>
>> >>>> At any rate, I look forward to the recap from you and Ivan at the next opportunity. PS: it was a small quorum on this concall, but I was under the impression that the participants agreed on the broad lines of your proposal, with only details to clarify.
>> >>>>
>> >>>> My impression is that participants generally agreed with the presentation of the issues and some principles.
>> >>>> I believe that the main point that is still controversial is really what should be the answer to a GET on the canonical locator.
>> >>>>
>> >>>>> I think we need to go do this over again next week – which is extremely unfortunate.
>> >>>>
>> >>>> If I'm not mistaken Matt, Markus, Tzviya and I won't be able to attend (EDUPUB summit).
>> >>>>
>> >>>> Romain.
>> >>>>
>> >>>> Regards, Daniel
>> >>>>
>> >>>> On 17 Feb 2016 9:17 p.m., "Leonard Rosenthol" <lrosenth@adobe.com> wrote:
>> >>>>>
>> >>>>> Sorry that I was unable to attend today, especially since the discussion (based on the minutes) seems to completely undo all the work that Ivan, myself and others did on the mailing list during the past week. The position presented by Daniel is the exact opposite of what Ivan’s musings (adjusted based on mail conversations) presented.
>> >>>>>
>> >>>>> I think we need to go do this over again next week – which is extremely unfortunate.
>> >>>>>
>> >>>>> Leonard
>> >>>>>
>> >>>>> From: "Siegman, Tzviya - Hoboken" <tsiegman@wiley.com>
>> >>>>> Date: Wednesday, February 17, 2016 at 11:46 AM
>> >>>>> To: "DPUB mailing list (public-digipub-ig@w3.org)" <public-digipub-ig@w3.org>
>> >>>>> Subject: [dpub-loc] 20160217 minutes
>> >>>>> Resent-From: <public-digipub-ig@w3.org>
>> >>>>> Resent-Date: Wednesday, February 17, 2016 at 11:48 AM
>> >>>>>
>> >>>>> Minutes from today’s meeting:
>> >>>>> https://www.w3.org/2016/02/17-dpub-loc-minutes.html
>> >>>>>
>> >>>>> Tzviya Siegman
>> >>>>> Digital Book Standards & Capabilities Lead
>> >>>>> Wiley
>> >>>>> 201-748-6884
>> >>>>> tsiegman@wiley.com
>> >>>>
>> >>>> ----
>> >>>> Ivan Herman, W3C
>> >>>> Digital Publishing Lead
>> >>>> Home: http://www.w3.org/People/Ivan/
>> >>>> mobile: +31-641044153
>> >>>> ORCID ID: http://orcid.org/0000-0003-0782-2704
Received on Monday, 22 February 2016 13:50:58 UTC