A case for GETSRC (or other mechanism to that effect)

I've been reading the archives, and following the long-running and 
sometimes contentious conversation over retrieval of source 
documents.  Maybe bringing this up will get me kicked off the list, 
but we'll see.

First, I appreciate what the DAV working group is trying to do by 
allowing multiple sources for a given document, and making each 
source a separate resource.  I won't bother denying that it is useful 
to some people and interesting in its own right.  I totally see the 
logic in it.  But it is an abstraction that is useful only to a small 
minority of users, and is very difficult to implement.  If you doubt 
that, I direct you to the total lack of implementations.  But I'll 
make a case on other grounds, as well.

With any DAV request except GET the process is rather straightforward:

	A virtual host is selected

	Security determines what rights the user has on the URL.  In 
the Apache universe that means PUT, POST, MKCOL, GET, etc..  In our 
universe (WebSTAR) that means read, write, list, mkdir, etc..

	We figure out what file the URL maps to.

	In deciding which plug-in will handle the request, a DAV 
plug-in simply claims anything with a DAV-specific method.  If it is 
MKCOL, DELETE, etc. then the DAV plug-in claims it.

	DAV handles the request, within the limits set by the security plug-in.

All of this works great, until we get to the GET method.  Plug-ins 
have no way of knowing the difference between a normal GET that 
should be handled normally, and a GET that is intended to retrieve 
the source of a document and requires special permissions.  In effect 
we have TWO semantics for GET.  And so we start creating fake URL 
spaces.  But we quickly run into security and configuration hassles.

The bottom line is that there's no way to provide good DAV access 
without some special configuration on the server.  And it isn't the 
result of some kind of design error by the implementor, but a basic 
issue with how the protocol is designed.  Unless you are only dealing 
with static content you MUST set up a whole new URL space to do DAV 
at all.

In practice setting this up is such a bother that what most 
administrators do is create a whole new virtual host with all 
plug-ins disabled in order to get it done.  Is it just me, or is it 
entirely ridiculous that server administrators have to double the 
number of virtual hosts they support just to give decent DAV access 
to their users?

And all of this is so complicated for something that *should* be 
really simple!  From most servers' point of view DAV is a way to let 
web authors access their .php, .cgi, .ssi, .shtml, and other dynamic 
files and edit them and save them back onto the server.  All of this 
business about an arbitrary number of sources of arbitrary types as 
properties of a single resource don't do our users one bit of good, 
but the implementation is quite burdensome.

What we really need from the DAV protocol is a way to know that a 
given request is intended to retrieve the source of a document, 
without having to also posses knowledge about the DAV configuration, 
or which URL spaces have been quarantined for use exclusively for 
WebDAV, and without having to create special areas in the URL space 
that don't actually map to anything else in the universe just so we 
can turn off all the plug-ins for a few requests.  Thus the 
suggestion for a GETSRC property.

I've read the arguments that ask about FINDPROP and other methods 
that act on a resource, and having to double the number of methods to 
support FINDPROP and FINDPROPSRC, etc..  But in most server 
implementations when you get the properties of a .php or .ssi file 
right now, what are you getting?  In practice you're getting the 
author and lastmod date and other information about the SOURCE, not 
the output.  I don't know of anybody who is executing a .php file and 
then returning the properties of the output -- it (a) isn't practical 
because of all the side-effects it would cause like CPU load and 
errors and code running that you didn't really intend to run and (b) 
isn't useful because nobody cares one whit about the DAV properties 
of PHP output.

I suggest that we just stop considering GET to be a DAV method at 
all.  GET has nothing to do with DAV any more than POST does. 
Instead of GET, we should be using GETSRC (or whatever you want to 
call it), always.  If you want the output of a source which has been 
run through PHP or SSI or whatever, then fire up your web browser and 
use a GET or POST method.  Otherwise, use your DAV client and GETSRC 
(or whatever).

When you work with the properties of a resource, you are working with 
the properties of the resource you receive from a GETSRC command. 
When you GETSRC, you receive the same data that you PUT.  When you 
GET or POST, you receive what the server generates from the source 
and your input.  In some cases that happens to be the same as the 
source (static files).


If OPTIONS does not indicate that GETSRC is avialable, then a DAV 
client may use GET instead.

Does not preclude a document from having multiple sources at 
different URIs, for the 5% of implementations that need such 
functionality.

Advantages of this approach:

	Works well with existing security modules for Apache and 
other servers.  Just add GETSRC to the list of methods a user is 
allowed to use.

	Provides one semantic for GET (the existing semantic from 
HTTP/1.1 spec) and only one semantic.

	Is far more likely to be implemented, and implemented WIDELY, 
than the src property ever will be.


This is a major, major problem with the protocol.  This is in not an 
indictment of anybody's abilities.  Even as a prospective implementor 
reading the spec carefully I didn't appreciate the magnitude of the 
problem until I set out to really implement it, and then it became 
nearly intractable.  Even the reference implementation doesn't 
implement it.

The DAV group can take this issue in hand and come up with a solution 
people are willing to implement, or the ad-hoc solutions will become 
de-facto standards.  (ie: Translate).  Personally, I would rather see 
this solved by the DAV working group.


cjh

-- 

Received on Thursday, 28 February 2002 01:03:03 UTC