Destination header syntax (Bug 211) and how WebDAV works with reverse proxies

Hi,

we've got an open issue with the format of the Destination header (see 
<http://ietf.cse.ucsc.edu:8080/bugzilla/show_bug.cgi?id=211>). 
Investigating this one showed that there's a bigger issue behind it, and 
I was asked to summarize...


1. Status Quo

In RFC2518, the Destination header is defined to take a full URL, so no 
relative references or absolute paths are allowed 
(<http://greenbytes.de/tech/webdav/rfc2518.html#HEADER_Destination>). At 
first glance, this seems unambiguous, leaving no reason whatsoever for 
further discussion.

However, this means that client and server need to agree on what the 
HTTP authority (server + port) is. Why would that be a problem? The 
client did send the correct Host: header, after all, right?

The issue arises in setups where a reverse proxy sits between client and 
server (such as with 
<http://httpd.apache.org/docs/2.0/mod/mod_proxy.html>). In a reverse 
proxy setup, the client addresses the reverse proxy, not the origin 
server, and it's up to the reverse proxy to rewrite and forward the 
request to the end point. One use case for this would be a web site that 
uses Apache on port 80 as a base server, but also uses this to delegate 
calls to certain URLs to a servlet engine running on a different port 
(or machine).

In cases like these, client and server do not agree upon what the 
actual end point is. For plain HTTP, this usually is not a problem, 
unless the server constructs URLs on its own and sends them back inside 
content (in which case there are additional modules trying to rewrite 
content).

Regarding the Host request header, the reverse proxy generally has two 
choices: leaving it as is (lying to the origin server), or rewriting it 
(putting in the actual address of the origin server). The latter seems 
to be the default in both Apache and IIS scenarios.

This isn't a problem, unless there is *other* information in the request 
that also identifies the origin server, such as the Destination header. 
Consider

   MOVE /a HTTP/1.1
   Host: www.example.com
   Destination: http://www.example.com/b

sent to a reverse proxy running on port 80 of www.example.com, which in 
turn forwards the request to port 8080, rewriting the Host header, but 
not touching the Destination header:

   MOVE /a HTTP/1.1
   Host: www.example.com:8080
   Destination: http://www.example.com/b

A compliant WebDAV server will detect that the target of MOVE is on a 
different server, and fail the request.
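
To make the failure concrete, here is a minimal sketch of the kind of 
check a compliant server might perform. The function names and the 
default-port table are my own assumptions for illustration, not taken 
from RFC2518 or from any particular server, and the sketch assumes the 
Destination value is a full URL:

```python
from urllib.parse import urlsplit

# Assumption: only http and https are relevant here.
DEFAULT_PORTS = {"http": 80, "https": 443}


def authority(host, port, scheme="http"):
    """Normalize (host, port), filling in the scheme's default port."""
    return (host.lower(), port if port is not None else DEFAULT_PORTS[scheme])


def destination_is_local(host_header, destination, scheme="http"):
    """Return True if the Destination URL names the same authority as Host.

    Hypothetical helper: assumes `destination` is a full URL.
    """
    # Prefix "//" so that urlsplit parses the Host value as a netloc.
    req = urlsplit("//" + host_header)
    dest = urlsplit(destination)
    return (authority(req.hostname, req.port, scheme)
            == authority(dest.hostname, dest.port, dest.scheme))


# Request as sent by the client: authorities match.
print(destination_is_local("www.example.com", "http://www.example.com/b"))
# Request after the proxy rewrote Host: authorities differ.
print(destination_is_local("www.example.com:8080", "http://www.example.com/b"))
```

With the rewritten Host header from the example above, the comparison 
fails although both URLs were on the same server from the client's 
point of view.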

So if this is a real-world problem, why haven't people been complaining 
more about it?

- The most widely deployed WebDAV servers (Apache/mod_dav and IIS 5.5) in 
fact are not compliant. They just ignore the authority part in the 
Destination header, and happily execute the MOVE operation (see 
<http://ietf.cse.ucsc.edu:8080/bugzilla/show_bug.cgi?id=211#c3>).

- The most widely used WebDAV client (MS Webfolders) in fact has 
fallback behaviour: when the server returns a 502 status, it silently 
GETs the content from the source URL and PUTs it to the Destination (but 
watch out for lost WebDAV properties; it doesn't do PROPFIND/PROPPATCH 
as far as I can tell).



2. RFC2518bis history

So at some point, this working group concluded that it would be good if 
clients were able to put just absolute paths into the Destination 
header, and the problem would go away; see 
<http://lists.w3.org/Archives/Public/w3c-dist-auth/2004JulSep/0025.html>. 
But this wouldn't really have been sufficient, as there are other 
places where we require full URLs in a request, such as in tagged lists 
in the If header. After a short discussion, the change was backed out, 
for fear that existing servers would become non-compliant. This history 
is still reflected in the awkward language in 
<http://greenbytes.de/tech/webdav/draft-ietf-webdav-rfc2518bis-10.html#destination-header>.


3. What now?

I think the fact that both Apache and IIS ignore the spec is a clear 
indicator that we need to fix something. Servers just ignoring part of 
the Destination URL because clients may get it wrong, or because it 
may not pass properly through reverse proxies, certainly is a bad thing.

As RFC2518bis already requires changes in servers, I think asking for a 
few more changes should be ok. In particular, requiring servers to 
accept absolute paths in requests seems to be an easy-to-implement and 
backwards compatible change (before, servers would reject these requests 
with 4xx).

This would include:

a) the Destination header and
b) coded-URLs in tagged lists in the If header

At a minimum, we should *allow* servers to accept these. Smart clients 
could then fall back to sending just the paths if the request using 
full URLs fails.
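
As a rough sketch of what accepting absolute paths could look like on 
the server side (the function name is hypothetical, and defaulting to 
port 80 assumes plain http), the Destination value is either taken as a 
local path directly or, if it is a full URL, checked against the Host 
header as before:

```python
from urllib.parse import urlsplit


def destination_path(host_header, destination):
    """Return the local path named by Destination, or None if it is remote.

    Hypothetical helper: accepts a full URL or an absolute path.
    Assumes plain http, so a missing port is treated as 80.
    """
    dest = urlsplit(destination)
    if not dest.scheme and not dest.netloc:
        # No authority given: only an absolute path is acceptable.
        if not dest.path.startswith("/"):
            raise ValueError("relative references are not accepted")
        return dest.path
    # Full URL: compare its authority with the one from the Host header.
    req = urlsplit("//" + host_header)
    same_host = (dest.hostname or "").lower() == (req.hostname or "").lower()
    same_port = (dest.port or 80) == (req.port or 80)
    return dest.path if (same_host and same_port) else None


# Absolute path: unaffected by the proxy rewriting the Host header.
print(destination_path("www.example.com:8080", "/b"))
# Full URL behind the same proxy: treated as a different server.
print(destination_path("www.example.com:8080", "http://www.example.com/b"))
```

The point of the absolute-path form is that it carries no authority, so 
there is nothing for a reverse proxy to get out of sync.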

We also should try to get conformance tests for proper evaluation of the 
Destination header into Neon (the Apache problem has already been 
reported as <http://issues.eu.apache.org/bugzilla/show_bug.cgi?id=38182>).

Alternatively, we could decide that this problem isn't big enough to 
warrant such a late spec change (in which case I'd recommend moving 
this into a future activity, making WebDAV more robust for setups like 
these).

Feedback appreciated,

Julian

Received on Saturday, 21 January 2006 20:43:12 UTC