- From: Benjamin Armintor <armintor@gmail.com>
- Date: Wed, 17 Sep 2014 13:33:55 -0400
- To: John Arwe <johnarwe@us.ibm.com>
- Cc: Austin William Wright <aaa@bzfx.net>, kjetil@kjernsmo.net, public-ldp-comments@w3.org
- Message-ID: <CADQQ8TNO1AsvSDtkX3RK5sG1B6wmtFBgyJ76k+716Tf01D-B7Q@mail.gmail.com>
Thanks for the detailed reply, John; it's much appreciated.

I think a couple of the points of the proposal vis-a-vis alternatives might overstate its advantages:

- ETag and conditional GET are, as far as I know, available to range requests, so the same mechanism for change detection would be available to a hypothetical range-based approach.
- The 2NN status might wreak less havoc with caches than the alternatives, but I'm not confident it will be a lot less.
- I understand that RDF stores needn't provide a predictable order for triples, but I don't personally follow how such a resource's triples could then be paged in any useful way, which seems to bring us back to content negotiation.

But I am also engaging in a relatively shallow effort to think about this stuff. The ability of a server to unilaterally impose paging is not something an approach built around Range requests could offer without being pretty hostile to pre-HTTP 1.1 clients.

- Ben

On Wed, Sep 17, 2014 at 12:25 PM, John Arwe <johnarwe@us.ibm.com> wrote:
> Austin, the working group asked me to reply to your comment. I'm the
> default RFC monger in this working group ;-)
>
> Based on the volume of discussion we've had in the past, which includes
> within the working group in addition to liaising with other groups such as
> the W3C TAG and the IETF HTTP working group, such a comparison is highly
> unlikely to be small enough to non-disruptively fit in the introduction/etc
> of the document.
>
> If you have some alternatives to the reasoning below not covered here,
> please share as it's conceivably new information that would alter consensus
> opinions.
>
> wrt next/prev, LDP Paging does make use of them, and in general draws
> substantial inspiration from [5005], specifically from section 3 Paged
> Feeds as the definitions (that non-normatively refer to 5005) should make
> clear.
> Note that the 5005 link relation definitions are no longer the
> latest; the current normative definitions in the link relation registry [1]
> are compatible with LDP Paging's usage, although re-reading 5005 I see no
> conflicts.
>
> wrt *-archive link relations, they constrain the archive documents that
> their target URIs identify such that their state SHOULD NOT change over
> time, which is not a constraint that the working group believes is
> appropriate for LDP Paging in-sequence pages (6.2.9) ... keep in mind too
> the 2119 definition of Should Not, versus the developer attitude of "should
> == may". More problematically, RFC 5005 section 4's constraints include
> (again, Should) specific content (fh:archive) "elements" in the resources'
> "head sections" ... in effect, binding archive documents to Atom
> Syndication Format and hence to XML; this in turn means that for resources
> that are RDF graphs (a central concern for LDP), there is no standard
> representation format (no standardized ASF serialization exists for RDF).
>
> In both cases, reaching more deeply into 5005 and drawing a 1:1
> correspondence from (for example) feed entries to RDF triples would cause
> additional impedance mismatches. If LDP Patch comes to fruition, then that
> might provide a good match (I'm speaking speculatively here and purely for
> myself - there have been zero working group discussions along these lines
> that I am aware of). The idea of reconstructing a logical feed using a
> time-sequenced set of incremental patch entries seems like a natural
> application of 5005. Agreement on an LDP Patch format has proven to be a
> stubbornly elusive goal over the lifetime of the working group, although it
> has recently made progress.
>
> wrt adding new Range units, various working group members have looked at
> it several times over the life of the working group; personally, I did so
> as far back as Submission-drafting time.
> The primary reasons that worked against re-use of Range were:
>
> 1: Servers are not free to initiate paging unilaterally using Range
> requests. The ability for the server to initiate paging as a way to manage
> server load (and as a side effect, potential attacks) is a major concern of
> the working group members.
>
> 2: RDF based resources (a focus of LDP generally) are not seen to be
> amenable to range requests that require index-based access to triples,
> absent implementation or domain-specific assumptions about underlying
> ordering. SQL-based back ends might be amenable to "counting triples", but
> other database technologies not so. Then there is the issue of common RDF
> implementation components like Apache Jena, that faithfully implement the
> RDF graph definition of an unordered set ... therefore providing no
> interface-level guarantees of repeatable order in model traversal
> operations or serialization operations, even if the underlying graph were
> unchanged between requests. Requiring all implementations to impose an
> index-based ordering on triples is seen as a significant implementation
> burden.
>
> 3: The inability for clients to have any guarantees about their view of a
> paged resource's state after a traversal in which the paged resource
> changes. LDP Paging provides a stronger guarantee in 6.2.7 for paged
> resources in the latter case than Range or 5005 would guarantee for an
> archived feed once the equivalence to RDF is established (preceding point).
> The (my) initial proposal started off with the "no guarantees, start over"
> position of 5005, and working group members advocated for the stronger
> guarantee.
>
> 4: Non-cacheability of responses. Existing caches would be forced to
> treat extension units as uncacheable, if and until their implementations
> were updated to support the new LDP-defined units.
>
> FWIW, if a future spec were to standardize how clients request particular
> orderings from the server, e.g.
> sorting of a result set, then in those
> cases index-based triple access and new units (on Range and/or on LDP
> Paging's preference) might well be specified there as well.
>
> wrt Content-Location and status code, this was an option that members of
> the working group did discuss with the W3C TAG [2], [3] and the IETF HTTP
> working group (their chair is cc'd on [3], as one example); short answer,
> there was no broad consensus on whether or not doing what you suggest is
> within HTTP, nor (if it is) that HTTP supplies an unambiguous and
> semantically correct interpretation.
>
> 1: [4] says that in the case you describe the C-L URI identifies a
> particular representation of the effective request URI. The LDP
> established consensus that a single in-sequence page, in the general case,
> is not *the same resource* (in the sense of "state") as the paged resource.
> We did not have consensus that the definition cited allows the server to
> respond to GET paged-resource-URI with 200 and C-L that identifies an
> in-sequence page (which, definitionally, has only a subset of the paged
> resource's state); my sense is that the working group mostly found that
> interpretation unnatural. A client receiving a 200 response was believed
> to have every right to stop there (at that first GET), believing it has the
> *entire* state of the paged resource; this would not be true however when a
> paged resource is identified by the effective request URI and an
> in-sequence page resource is identified by the Content-Location response
> header (in the general case of the paged resource having > 1 page).
>
> 2: There is a competing mindset that says the server says what is, so 200
> + C-L of a "subset" resource is perfectly fine: clients have to know
> something about the resource they're asking for.
>
> LDP chose to specify an approach that leaves no risk of an existing client
> incorrectly believing that it has a complete representation of the state of
> the resource identified by the effective request URI when it does not,
> given existing implementations. If consensus evolves in the wider
> community over time, then LDP Paging might be able to incorporate whatever
> optimizations become enabled, but the currently specified base should
> continue to work unchanged, even if it has to start with 303 to be safe wrt
> existing clients. The at-risk text between 6.2.5 and 6.2.6 contains
> additional links as well.
>
> wrt RFC 5989, LDP's scope was chartered to include HTTP and RDF. I don't
> know that anyone in the working group was deeply aware of 5989 before your
> comment. There was no appetite for adding a requirement on RLS or SIP for
> implementations.
>
> wrt If clients have to be "paging aware", would that ... There are
> several cases to consider, given the optional features involved.
>
> 1: If any GET request results in a 2NN response with response headers Link
> type=ldp:Page and canonical=effective request URI, then it can choose to
> retrieve the page sequence or not. According to the 2NN draft [4], this
> would never happen with a compliant server unless the client sends an
> indication in the request that it supports 2NN responses, in keeping with
> "leaves no risk of an existing client incorrectly believing ..."
>
> 2: If any GET request results in a 303 response, the semantics of 303
> already say that a second resource than the one identified by the effective
> request URI is involved (thus: 303, not 306 or 307). If the client chooses
> to retrieve the 303 Location response header's resource, and that response
> has response headers Link type=ldp:Page and canonical=first request's
> effective request URI, then it can choose to retrieve the page sequence or
> not.
>
> Any client can do that, on any resource.
> Within the working group, a
> common supposition has been that an http client library would do this
> transparently. If you see any "external/pre-programmed notion of what the
> resource it gets back is going to be", please point it out. It's
> conceivable that those involved are too close to it to see some subtlety,
> but having looked again we see no such requirement. Indeed, we see *less*
> need for outside knowledge in this approach than in some alternatives
> suggested, for example 200 + Content-Location, which is why we obtained
> consensus on it.
>
> wrt scope of applicability
>
> Indeed, we separated Paging out in part to allow its application
> independently of LDP proper. Along the way, the language was changed so
> that it applies to more than just RDF based resources.
> Are there any particular aspects of LDP that you believe your server would
> not comply with, or is the definitional normative requirement on being an
> LDP server coupled with the size of the LDP spec simply leading you to
> assume that you're not compliant? The bare-minimum difference between a
> compliant LDP server and a conforming HTTP server is pretty small, IIRC -
> skimming it's 4.2.1.3 etags, 4.2.1.5 default base URI, done (assuming you
> don't intend to expose LDPRs or LDPCs, but we're talking about bare-min).
> LDP, for example, does not require you to host RDF at all or to deal with
> containers at all. If your question stems in part from a "follow your nose
> - oh, a different big scary spec I have to grep through in order to use
> Paging at all, how 'nice'" reaction, that is something we could clarify in
> principle.
>
> As to other groups, as mentioned above we've engaged directly with the TAG
> and IETF HTTP on certain aspects, as well as co-membership with the RDF
> working group, and we've received comments on past LDP LC drafts (which did
> include Paging at first) announced the usual way over the span of a year
> from a variety of sources including Tim Berners-Lee. If there are specific
> communities you have in mind to solicit that we might have omitted, this is
> a perfect time to get them reading and we'd appreciate your help in
> motivating them to comment within the review period.
>
> conneg
>
> I think that got covered above in the context of other comments; the TAG
> (and IETF's HTTP working group) have already seen and given comments on
> 2NN. It was one thread off the TAG discussion that led to additional uses
> (outside of LDP) for 2NN, as documented in the IETF draft.
>
> [1] http://www.iana.org/assignments/link-relations/link-relations.xml
> [2] http://lists.w3.org/Archives/Public/www-tag/2013Dec/0041.html
> [3] http://lists.w3.org/Archives/Public/www-tag/2014Jan/0013.html
> [4] http://tools.ietf.org/html/rfc7231#section-3.1.4.2
> [5005] http://tools.ietf.org/html/rfc5005
>
> Best Regards, John
>
> Voice US 845-435-9470 BluePages
> <http://w3.ibm.com/jct03019wt/bluepages/simpleSearch.wss?searchBy=Internet+address&location=All+locations&searchFor=johnarwe>
> Cloud and Smarter Infrastructure OSLC Lead
>
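
For illustration only: the two client cases in the quoted message (a 2NN response, or a 303 followed by a second GET) both come down to the same check on the response's Link headers. Below is a rough Python sketch of that check. It assumes the headers are written as `<http://www.w3.org/ns/ldp#Page>; rel="type"` and `<paged-resource-URI>; rel="canonical"`, which is my reading of "Link type=ldp:Page and canonical=..." rather than anything stated in the thread. The function names, the example.org URIs, and the deliberately simplified Link parsing are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Assumed expansion of the "ldp:Page" type mentioned in the quoted message.
LDP_PAGE = "http://www.w3.org/ns/ldp#Page"


def parse_links(value: str) -> List[Dict[str, str]]:
    """Parse a Link header value into dicts holding 'target' plus parameters.

    Simplified on purpose: parameter values that themselves contain
    ';' or ',' are not handled.
    """
    links = []
    for match in re.finditer(r"<([^>]*)>((?:\s*;\s*[^;,]+)*)", value):
        link = {"target": match.group(1)}
        for param in match.group(2).split(";"):
            key, sep, val = param.strip().partition("=")
            if sep:
                link[key.strip().lower()] = val.strip().strip('"')
        links.append(link)
    return links


def is_page_of(original_uri: str, link_header: str) -> bool:
    """True if the response is typed as a page and its canonical link
    points back at the URI the client originally asked for."""
    links = parse_links(link_header)
    typed = any(
        l.get("rel") == "type" and l["target"] == LDP_PAGE for l in links
    )
    canonical = any(
        l.get("rel") == "canonical" and l["target"] == original_uri
        for l in links
    )
    return typed and canonical


def next_page(link_header: str) -> Optional[str]:
    """Return the rel="next" target, or None when there is no further page."""
    for link in parse_links(link_header):
        if link.get("rel") == "next":
            return link["target"]
    return None


if __name__ == "__main__":
    # Hypothetical headers from the response to a GET on the 303 Location.
    header = (
        '<http://www.w3.org/ns/ldp#Page>; rel="type", '
        '<http://example.org/customers>; rel="canonical", '
        '<http://example.org/customers?p=2>; rel="next"'
    )
    print(is_page_of("http://example.org/customers", header))
    print(next_page(header))
```

A client that gets False from `is_page_of` can treat the response as a complete representation; one that gets True can decide whether to walk the `next` links or stop, which is the choice the quoted cases 1 and 2 describe.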
Received on Wednesday, 17 September 2014 17:34:24 UTC