- From: Lisa Dusseault <lisa@xythos.com>
- Date: Wed, 2 Oct 2002 09:29:27 -0700
- To: "'Stefan Eissing'" <stefan.eissing@greenbytes.de>, "'Jason Crawford'" <nn683849@smallcue.com>
- Cc: "'Webdav WG'" <w3c-dist-auth@w3c.org>
> I read three proposals:
> 1. more clearly define the If header and its scope.
> 2. Introduce an additional request header for relaxed lock
> checking. The client says: " this is the bag of lock tokens
> I have, server, take your pick."
> 3. Introduce an additional response header where the server
> can indicated which token from the bag are no longer valid.
>
> I am all for (1).
>
> Regarding (2) and (3) I argued that, as we still need the If header, is
> it really worth it to add a new header with related semantics?

I am now convinced (I wasn't before the interop events) that it is worthwhile to add a new and simple header.

Although we could spend a lot of time defining the If header more carefully (and may need to anyway), and we could also document how to use the If header to supply lock tokens without causing the request to fail because of condition matching, we have continued interoperability problems with every new client implementation when they encounter odd situations and different server implementations. Because the If header is so verbose, some of the solutions are not feasible in practice. A simple (comma-delimited) header for providing lock tokens would vastly improve interoperability, and that's after all the goal for RFC2518 bis.

The original lock/if design had a lot of good ideas and definitely good intentions, but with any complex design, the Law of Unintended Consequences means there will be Unintended Consequences. After implementation experience, we know what those are. Maybe we can fix them.

That was the summary; here are the details. Note that I do not mention authentication below, in order to keep the discussion simple. Any time I discuss using a lock token to see if the write operation is allowed, assume the server may also apply authentication checks to make sure the user can use the lock.

A. What are the continued interoperability problems?

Basically, clients find their requests failing because of two major cases:
(1) they do not supply all the correct lock tokens (not knowing exactly which lock tokens need to be supplied)
(2) when they do supply all the correct lock tokens, the server applies them to resources that are not locked with those lock tokens, and the request fails because conditions must be met.

Clients find it difficult to know what resources the server considers to be "affected by" a request. Much of this is due to the ambiguous sentence in section 9.4.1, "If a method, due to the presence of a Depth or Destination header, is applied to multiple resources then the No-tag-list production MUST be applied to each resource the method is applied to."

Examples of (1): (I'm not asking for answers, I'm asking the questions clients must ask themselves)
 - Basically, what defines "resources the method is applied to"? What resource is a DELETE request applied to?
 - When you MOVE a resource into a locked collection, do you have to supply the target collection's lock token?
 - When you MOVE a resource from a locked collection, do you have to supply the source collection's lock token?
 - When you COPY a resource from one locked collection to another, which lock token do you have to supply: the source's, the destination's, or both?
 - What happens when a collection is locked with depth 0? What operations are affected? How is this handled differently than a depth infinity lock on the same collection? (Hint: depends on the server implementation!)

Examples of (2):
 - When you DELETE a resource that's in a locked collection, can you simply send the If header with an untagged lock token? No, on *some* servers the token must be tagged with the parent collection (the root of the lock).
 - Any MOVE request where locks are involved cannot use untagged lock tokens.
 - Some servers allow a lock token to be tagged with any resource-URL that the lock scope includes.
   Other servers require that the lock token be tagged with the URL for the resource at the root of the lock.

Basically, a careful client eventually learns to always tag every lock token. This already results in a very large If header value.

B. Why does the current situation guarantee additional roundtrips, but not additional security?

A careful client (one that has learned the lessons from section A) that is asked to do a write operation will attempt to include a tagged (with the root-of-lock URL) lock token for every lock that could possibly be considered to "affect" the operation. So far so good -- as long as those locks still exist, the request should succeed. However, if any of those locks has timed out or been removed, the request will fail because that lock token is no longer valid.

Now the client must decide whether or not to try again to make the request succeed. However, it's not necessarily clear from the response which condition failed. It's possible for the server to respond simply with "precondition failed" and give no information about which one failed.

An error to the user is typically undesirable. Therefore, a reasonably sophisticated client will typically try again. Even if the user is consulted, the user will typically try again. The client will send the write operation again without the lock token.

A good (well-behaved) client doing a write operation that will overwrite a regular resource should provide the ETag to make sure that the resource hasn't changed since the GET operation. However, nothing requires a client to do that; a client can overwrite the resource without checking the ETag, simply by retrying the original write request without the lock token that failed.

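The retry described above can be sketched with a toy model. This is only an illustration under stated assumptions: the ToyServer class, its put() method, the careless_client_put() helper and the status codes are all hypothetical, not any real server's behaviour or API.

```python
# Toy model of the retry in section B. ToyServer and careless_client_put
# are hypothetical names made up for this sketch.

class ToyServer:
    def __init__(self):
        self.body = "v1"
        self.active_locks = set()   # lock tokens the server still honours

    def put(self, new_body, submitted_tokens):
        # Strict check in the style of the If header: every submitted
        # token must still be valid, otherwise the whole request fails.
        if any(t not in self.active_locks for t in submitted_tokens):
            return 412              # Precondition Failed, no hint which token
        # A locked resource needs at least one matching token.
        if self.active_locks and not (set(submitted_tokens) & self.active_locks):
            return 423              # Locked
        self.body = new_body
        return 204


def careless_client_put(server, new_body, tokens):
    """Send the write; on 412, retry once without lock tokens or ETag."""
    round_trips = 1
    status = server.put(new_body, tokens)
    if status == 412:
        round_trips += 1
        status = server.put(new_body, [])
    return status, round_trips


server = ToyServer()                # the client's lock has already timed out
status, trips = careless_client_put(server, "v2", ["opaquelocktoken:A"])
print(status, trips, server.body)   # prints "204 2 v2"
```

In this model the stale token costs one extra round trip, and the second request overwrites the resource anyway.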
Thus, the current design requires an extra round-trip whenever the client misunderstood the lock situation, but it does not guarantee any additional protection because a poorly-behaved client can simply reissue the request without the offending token and overwrite the resource.

Note that in the case of large PUT requests the roundtrip may be particularly time-consuming and resource-consuming.

C. Why does the current situation simply annoy a well-behaved client, without particularly helping it?

A well-behaved client should always provide the ETag when overwriting a regular resource. This is a good idea whether the resource is locked or not.

When a resource is locked, and the client expects it has remained locked, it may still be appropriate to overwrite the resource -- provided the ETag has not changed. It doesn't particularly matter if the lock has gone away unexpectedly, as long as the resource is still the same resource.

Thus, a PUT (or other write) request with an ETag condition and a lock token should be able to succeed if the ETag is correct and the lock token matches the current lock, BUT ALSO the request should succeed if the ETag is correct and the lock no longer exists.

Since the well-behaved client should be using ETags anyway, the failure of the request when the lock has gone away provides no help and instead is simply annoying.

D. How could this situation be solved with the existing header, and why is that solution poor?

Somebody suggested that the client can provide an If header clause that is guaranteed to succeed, but will also contain the lock token. Here's how it would work:

    IF URL-A matches (lock-token-A OR NOT lock-token-A)

This header contains the lock token, so the request would succeed if the resource is locked. However, it has an OR clause, so the request would also succeed if the resource is not locked.

PROBLEM #1: Servers may not support the OR, and the NOT, correctly, because most clients don't currently use this.
This solution HAS NOT been proven; it is only theoretical. It might not work.

PROBLEM #2: What if multiple locks are required (e.g. moving a collection that has multiple locked resources)? What if the URLs are long? The If header becomes very long and may be truncated by some proxies. E.g.

    If: <http://www.ics.uci.edu/users/f/fielding/index.html>
          (<opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>)
          (Not <opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>)
        <http://www.ics.uci.edu/users/f/fielding/anotherfile.html>
          (<opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf7>)
          (Not <opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf7>)
        <http://www.ics.uci.edu/users/f/fielding/thirdfile.html>
          (<opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf8>)
          (Not <opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf8>)

PROBLEM #3: Uck, this is really complicated. While it's useful to know that a client can do this with some existing servers, provided they handle it correctly, surely we can do this with a simpler mechanism that is less prone to interoperability problems.

A simple header that allows the client to supply lock tokens to use is semantically equivalent to the above example, but shorter, simpler, and easier to implement. Also it can be split across multiple lines.

    Use-Lock-Tokens: <opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf6>, <opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf7>
    Use-Lock-Tokens: <opaquelocktoken:f81d4fae-7dec-11d0-a765-00a0c91e6bf8>

E. How could this work better?

Servers could do a much better job of helping clients use write operations and locks, rather than getting in the way. Some ideas:

1. Allow write operations to succeed if the correct lock token is provided in a "use these lock tokens" (comma-separated) header. The client would be able to make the write operation
2. Tell clients exactly what lock tokens are no longer valid.
3. Tell clients exactly what resources are locked, when the client asks for a write operation that affects locked resources without providing the lock token.

Much of this can be (and needs to be) better specified within the existing framework. However, it's my opinion that the existing framework can be simplified, and that simplification would provide greater interoperability.

F. Why is simplicity so important?

When a request/response protocol is needlessly complex, client implementers have several problems:

1. First of all, complexity is just hard for the client to implement, let alone implement correctly.
2. Not only must the client deal with complexity, it must explain things to users. WebDAV clients have to make complex things simple in order to handle the typical user of a productivity or browser application. How is the client supposed to explain that a lock that may have been there before may not be there now, and that it may or may not be OK to overwrite the resource anyway? It's not acceptable to explain that to most users.
3. Servers may implement the complex stuff differently. It becomes increasingly difficult to handle multiple server implementations and their vagaries as the complexity goes up.
4. Clients have so much else to worry about (usability, backward compatibility, installation, multiple OS, GUIs) that it's no surprise to find that the protocol implementation team in a product like Office is a tiny part of the team (or maybe even a library provided by a separate tiny team). The protocol is just another minor tool, not the main value of an application like Word or Photoshop. The client must be able to take for granted that the protocol works correctly.

WebDAV does a good job in most cases of allowing clients to deal with simple stuff first. E.g. it's quite possible for a client to implement only some methods and not others. That's a great benefit: it allows a client to spread an extended WebDAV-support effort over several releases.
However, since other clients can lock resources, all clients that do write operations must deal with much of the complexity of handling locks.

Now that WebDAV has many server implementations which are fairly reliable, fully featured and interoperable, the continued and growing success of WebDAV depends on good, usable and ubiquitous client implementations. If we wish large vendors with existing productivity applications to support WebDAV, including locks of regular resources and collections, then it would be a great benefit to simplify the things that have been found to be overly complex. To me, that's worthwhile.

Lisa
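
For illustration only, the relaxed check argued for in sections C and E.1 could be sketched along these lines. This is a rough sketch under assumptions, not a specified algorithm: the function names parse_use_lock_tokens() and write_allowed() are hypothetical, the Use-Lock-Tokens syntax is taken from the example in section D, and authentication is left out as in the rest of this message.

```python
def parse_use_lock_tokens(header_values):
    """Collect tokens from one or more comma-separated header lines.

    header_values is a list of raw Use-Lock-Tokens values, one per
    header line; each token is expected in <angle brackets>.
    """
    tokens = set()
    for value in header_values:
        for item in value.split(","):
            item = item.strip()
            if item.startswith("<") and item.endswith(">"):
                tokens.add(item[1:-1])
    return tokens


def write_allowed(current_lock, current_etag, offered_tokens, if_match_etag=None):
    """Relaxed check: stale tokens in the bag are ignored, not fatal.

    current_lock is the token of the lock now on the resource, or None
    if the resource is unlocked (assumes at most one lock, for brevity).
    """
    if if_match_etag is not None and if_match_etag != current_etag:
        return False        # resource changed since the client's GET
    if current_lock is None:
        return True         # lock has gone away; ETag (if any) still matched
    return current_lock in offered_tokens


bag = parse_use_lock_tokens([
    "<opaquelocktoken:aaa>, <opaquelocktoken:bbb>",
    "<opaquelocktoken:ccc>",
])
print(write_allowed("opaquelocktoken:bbb", '"e1"', bag, '"e1"'))  # True
print(write_allowed(None, '"e1"', bag, '"e1"'))                   # True
print(write_allowed(None, '"e2"', bag, '"e1"'))                   # False
```

The point of the sketch is the second case: the lock has disappeared, but the write still goes through because the ETag shows the resource is unchanged.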
Received on Wednesday, 2 October 2002 12:33:56 UTC