RE: DAV-Enabled field (was RE: A case for GETSRC) from Julian Reschke on 2002-03-04 (w3c-dist-auth@w3.org from January to March 2002)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Mon, 4 Mar 2002 23:27:34 +0100
To: "Eric Sedlar" <Eric.Sedlar@oracle.com>, <w3c-dist-auth@w3.org>
Message-ID: <JIEGINCHMLABHJBIGKBCCEGNECAA.julian.reschke@gmx.de>

Eric,

thanks for taking the time to write this long reply. I am putting some
comments inline...

> From: Eric Sedlar [mailto:Eric.Sedlar@oracle.com]
> Sent: Monday, March 04, 2002 8:48 PM
> To: Julian Reschke; CJ Holmes; w3c-dist-auth@w3.org
> Subject: RE: DAV-Enabled field (was RE: A case for GETSRC)
>
>
> It's not clear to me that the generated output from GET on executables
> actually is a different resource.  First of all, it doesn't make sense to
> think of it as having properties separate from that of the
> script.  I doubt

It does make sense. For instance, DAV:getcontenttype will almost certainly
have a different value, and so will DAV:getcontentlength.

> if anyone actually doing the URL space mapping proposed would
> keep around a
> set of dead properties for the generated output, but the case becomes
> clearer when looking at the live properties.

Well, I wouldn't expect a server to store dead properties for an "output"
resource, but this is really an implementation detail.

> If I have a directory full of .ASPs, or .JSP files, and I do a PROPFIND on
> that directory, will the server actually execute all of the scripts in the
> directory in order, dumping their output to a temp file, to ensure that
> dav:getcontentlength is correct?  I bet nobody actually does that

A conforming server will either report these properties as empty or as
non-existent (unless it's actually able to compute these values cheaply).
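
A minimal Python sketch of the behaviour described above, assuming the
server reports a property it cannot compute cheaply in a separate 404
propstat instead of guessing a value. The function name build_response,
the use of None to mean "not cheaply computable", and the sample values
are hypothetical and only for illustration.

```python
import xml.etree.ElementTree as ET

D = "{DAV:}"  # Clark-notation prefix for the WebDAV namespace


def build_response(href, props):
    """Build a <D:response> element for one resource.

    props maps a DAV property name to its value, or to None when the
    server cannot compute the value cheaply (hypothetical convention).
    """
    response = ET.Element(D + "response")
    ET.SubElement(response, D + "href").text = href

    known = {name: value for name, value in props.items() if value is not None}
    unknown = {name: None for name, value in props.items() if value is None}

    # One propstat per status: values we have, then properties we decline to report.
    for status, group in (("HTTP/1.1 200 OK", known),
                          ("HTTP/1.1 404 Not Found", unknown)):
        if not group:
            continue
        propstat = ET.SubElement(response, D + "propstat")
        prop = ET.SubElement(propstat, D + "prop")
        for name, value in group.items():
            element = ET.SubElement(prop, D + name)
            if value is not None:
                element.text = str(value)
        ET.SubElement(propstat, D + "status").text = status
    return response
```

For a script such as /mydir/foo.jsp, calling
build_response("/mydir/foo.jsp", {"displayname": "foo.jsp", "getcontentlength": None})
yields a response that carries the display name but does not claim a
content length for the generated output.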

> (it would
> be "considered harmful" like PROPFIND depth infinity), and if you did it
> would be stupid.  I haven't checked this, but I bet that most
> servers return
> the length of the underlying script as the length of the
> generated output on

IMHO this would be a bug.

> a PROPFIND depth 1.  The same difficulties apply to <get-etag> and many
> other live properties.  The reason this doesn't matter in practice is that
> returning the wrong etag or wrong content length for generated output is
> irrelevant, since in the general case it cannot be cached, and it works

Why do you say that generated output cannot be cached? A big part of the
HTTP spec deals with this very problem.

> functionally just as if the resource was changed before the
> subsequent GET,
> so it's ok if the value of <getcontentlength> is different in a
> PROPFIND on
> its parent and the actual content length returned when you do a GET.
>
> I think that the better model is if you think of the GET method more like
> the "OPEN" action on Windows or Macintosh, which means to execute
> this file
> if executable, or else display the contents of the document.  GET
> is really
> like a simple case of the POST method (which was originally how it was
> conceived).  To illustrate this point, let's say that a particular server
> said that POST on a non-executable resource (like a static file)
> would dump
> its contents (like a GET).  Now we can clearly see GET on a script as a
> simple case of POST.

GET doesn't dump "the content" of a resource. GET returns an entity which is
one of many possible representations of the resource. Which one is returned
may depend on non-URL-based information in the request headers (in which case
these headers should be listed in the "Vary" header).
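
To illustrate, here is a small Python sketch of one resource with two
representations, where the choice depends on a request header and the
response says so in "Vary". Everything in it is an assumption made for the
example: the function name get_representation, the two sample bodies, and
the simplification that Accept-Language carries a single language tag
(real header parsing is more involved).

```python
def get_representation(request_headers):
    """Return (response_headers, body) for one resource with two representations."""
    representations = {"en": "Hello", "de": "Hallo"}  # hypothetical sample bodies

    # Simplification: treat Accept-Language as a single tag, default to English.
    lang = request_headers.get("Accept-Language", "en")
    if lang not in representations:
        lang = "en"

    body = representations[lang]
    response_headers = {
        "Content-Language": lang,
        "Content-Length": str(len(body.encode("utf-8"))),
        # The selection depended on a request header, so name it in Vary.
        "Vary": "Accept-Language",
    }
    return response_headers, body
```

Both calls address the same URL and therefore the same resource; only the
entity returned differs.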

> Julian's argument below gets at the heart of the problem--is each set of
> possible output generated by a script a different resource?  First of all,
> there are many different things that can affect the output of a resource,
> and most of them are not in the URL (which is the only thing that can be
> used to differentiate two resources).  While the URL arguments in
> the query
> string are one set of arguments to the script, clearly information in the
> header may also be used (even in a GET method).  For example, the

Yes.

> authentication headers may be used to choose whose shopping cart will be
> displayed, or in POST, the form values appear in headers.  If, as Julian
> suggests: GET /foo/my.jsp?arg1=val1  is a different resource than GET
> /foo/my.jsp?arg1=val2, then POST /foo/my.jsp with a header Arg1:

GET ... is not a resource, it's one representation of a resource.

URLs (as they are URIs as defined in RFC2396) identify resources. If they
differ, this is a good indication that the resource they identify differs as
well.

> val1 should
> be different than POST /foo/my.jsp with header Arg1: val2.  This obviously
> makes no sense.

I don't understand the analogy to POST. POST is very different from GET in
its semantics (it's not just a distinction in marshalling).

> So, here are my arguments for why each possible output generated
> by a script
> should not be a separate resource:

I didn't say that. What I'm saying is that the set of outputs (that can be
generated by a GET on a URL) is a different resource than the source
resource.

> * they may not have separate URLs, since not all of the arguments
> that cause
> different output may be in the URL string--they may be in headers, or may
> depend on other server state

That's fine. So all of them are the same abstract resource.

> * it is impractical to manage dead properties for each possible output

That's no problem. If you don't want to, don't.

> * it is computationally difficult to return dead properties (such as
> dav:getcontentlength) for each possible output

Again fine. Don't return them.

> Therefore, I conclude that generated output is not a resource, and

You can't conclude that, because the generated output *has* a URL, so it
*must* be a representation of an abstract resource.

> especially not a dav-enabled resource.  I would recommend that server
> implementers NOT return script output as dav enabled.

That's fine. You don't have to.

> Now, on to the practical argument, which will demonstrate that
> dav:source is
> a lot less usable than GETSRC/Translate:
>
> let's say that I have a directory full of scripts and static
> files.  I want
> to manage them from an authoring point of view.  If I want to implement
> dav:source, I need to create some virtual URL space that
> corresponds to the
> physical URL space where the actual script source is stored.  I

Actually, I'd say that you should create mappings for the outputs, not for
the sources, but this is just a matter of taste, I guess.

> can make the
> virtual URL space return either the generated output, or the source, at
> least from an implementation point of view.  For example:
>
> Contents of /mydir:
>
>    foo.jsp
>    bar.html
>    bar.gif
>
> In case 1, let's say I make /scripts/src/<real-path> a virtual URL space
> that returns the source of any executables (for DAV clients), so where GET
> /mydir/foo.jsp will return the generated output,
> /scripts/src/mydir/foo.jsp
> will return the source.  dav:source on /mydir/foo.jsp will point to
> /scripts/src/mydir/foo.jsp.  The DAV client doesn't actually want to mess
> with /mydir/foo.jsp since it is generated, has bogus live property values,
> doesn't respond to things like PUT, and is generally not very
> helpful.  Now
> I'm forced into a separate tree, which is certainly annoying from a

That's something the client should do for you.

> usability point of view, and extremely annoying when doing things like
> copying with Web Folders, since when I drag & drop my app from the local
> filesystem to the server, my .jsp files move to another place, and when I
> copy the application back down to my local PC, the .jsps were
> moved (even if
> I was smart enough to realize where they magically went to).  So,
> this is a
> bad solution.

Why don't you just edit *all* files in the source directory and leave the
output resources alone?

> In case 2, let's say I make /script/output/<real-path> a virtual URL space
> that returns the generated output for the executables.  Now the DAV client
> is happy, since the server space looks like the space on the client.
> Unfortunately, all of the <href> tags in any HTML output are
> screwed up vs.
> the local copy (e.g if bar.html wanted a link to foo.jsp, or vice versa),
> since neither a relative nor an absolute URL will work, and the
> URLs needed
> are server implementation-dependent.
>
> The reality of the world is that people want to be able to move their
> content creation environment back and forth between local filesystems and
> DAV servers.  Any usage of dav:source would break that abstraction, since
> local filesystems aren't capable of implementing virtual URL spaces.  I

Again, that's not true. You can author the *source* URL space just like
always. All you need the DAV:source property for is to discover where those
files sit.

> think these are the practical reasons why nobody has implemented
> dav:source
> in the past 3 years.

The DAV:source property is underspecified in the spec, that's the reason
it's not implemented. That doesn't make the concept itself bad.

> So, it is clear that:
>
> * generated output (from stuff like .jsps, .asps, cgi) does not make sense
> as a DAV-enabled resource, since the only methods it is likely to
> respond to
> intelligently is GET.

I don't have any problem with this. If output resources aren't
WebDAV-enabled, you just lose the ability to use DAV:source to discover
their sources.

> * separating the URL space of generated output and source causes usability
> problems, making life awkward either for the DAV client, or for the HTML
> linking

I don't agree.

> I think the argument for a GETSRC method becomes clear if you mentally
> rename GET to LIMITED-VERSION-OF-POST.

GET and POST are fundamentally different things.

> Responses to a couple of other points in more recent emails:
>
>   Alan Babich says:
>
> "Is the source the same entity as its output?" Julian and I agree
> that it is
> not. RFC2616 section 9.3 says it is not.
>
>   Response:  the issue is not whether or not the source and the output are
> the same "entity", but the same DAV resource.  Entity and resource are

...the same HTTP resource...

> different things.  There are three general categories of things that can

Yes.

> cause the output of a generated script to vary:  URL information, e.g. the
> query string; header information; or server-state (like in a database).
> Clearly the URL isn't sufficient to differentiate between all the kinds of
> server-generated output, so the various entities that can be
> generated by a
> script cannot be mapped as separate resources, since they won't have
> separate URLs.

If they don't have separate URLs, they aren't different abstract resources.
In the case of the shopping cart mentioned earlier, the resource identified
by the URL is "the shopping cart of the currently logged-in user". It's no
problem if it's different for different users.

>   Roy Fielding says:
>
> Any server that allows clients to vary GET responses based on the
> Translate
> header field MUST include "Vary: Translate" in every response to the GET
> method in order to remain compliant with HTTP/1.1.
>
>   Response:  Actually, RFC2616 says: "An HTTP/1.1 server SHOULD include a
> Vary header field with any cacheable response that is subject to
> server-driven negotiation."  So, clearly, Microsoft is in compliance with
> the HTTP specification since this is not a MUST requirement.

SHOULD (as defined in RFC2119):

"This word, or the adjective "RECOMMENDED", mean that there
   may exist valid reasons in particular circumstances to ignore a
   particular item, but the full implications must be understood and
   carefully weighed before choosing a different course."

To still claim conformance, one would have to specify these valid reasons.
Can you think of any?

> Clearly it is
> acceptable to use header values to vary the content of generated state,

Yes, but by doing so you state that all entities that you return are
representations of the same resource. Are they?

> whether it be the Translate header or the Authentication header,
> and whether
> or not the Vary header is supplied is a completely orthogonal issue.

Yes. But you SHOULD indicate it in the "Vary" header.

> Here's another way to look at the categorization of response entities for
> executables:
>
> 1) the source
> 2) generated output that does not vary based on input arguments
> from client
> 3) generated output varying with input arguments from client encoded in
> query string
> 4) generated output varying with input arguments from client encoded in
> request headers
> 5) generated output varying with server-side state (like information in a
> database)
>
> I don't see why case #2 and case #3 (in Julian's example) should
> be awarded
> its own status as a "real dav resource" when cases #4 and #5 are not.

Because resources are *identified* by their URLs. If the URL is the same,
it's *by definition* the same resource.

> I'd like to see someone propose a scheme for implementing dav:source that
> solves the usability problem when mapping to the local filesystem, if they
> believe in dav:source.

First of all, I strongly object to this line of argument. Even *if* you can
demonstrate that DAV:source as it is doesn't work, this doesn't mean that
this makes the Translate header an acceptable solution.

That said, I don't see any problem with having two separate URL spaces and
authoring only the source space. The output space *may* be DAV-enabled, and
if it is, it should use DAV:source, or something similar that we'll come up
with, to indicate where the source is.
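
As a rough sketch of the discovery step, a client could read the source
location out of a PROPFIND response as below (Python). It assumes the
DAV:source layout as I read it in RFC 2518 (link elements containing src
and dst); the host example.com, the sample response and the function name
source_urls are made up for illustration, with paths borrowed from the
/mydir example quoted earlier.

```python
import xml.etree.ElementTree as ET

# Hypothetical PROPFIND response body for the output resource /mydir/foo.jsp.
SAMPLE = """<D:multistatus xmlns:D="DAV:">
 <D:response>
  <D:href>http://example.com/mydir/foo.jsp</D:href>
  <D:propstat>
   <D:prop>
    <D:source>
     <D:link>
      <D:src>http://example.com/mydir/foo.jsp</D:src>
      <D:dst>http://example.com/scripts/src/mydir/foo.jsp</D:dst>
     </D:link>
    </D:source>
   </D:prop>
   <D:status>HTTP/1.1 200 OK</D:status>
  </D:propstat>
 </D:response>
</D:multistatus>"""


def source_urls(multistatus_xml):
    """Collect every dst URL found inside a DAV:source property."""
    root = ET.fromstring(multistatus_xml)
    urls = []
    for source in root.iter("{DAV:}source"):
        for dst in source.findall("{DAV:}link/{DAV:}dst"):
            if dst.text:
                urls.append(dst.text.strip())
    return urls
```

The client would then run its usual authoring operations (GET, PUT, LOCK)
against the returned URLs and leave the output resource alone.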

Julian
Received on Monday, 4 March 2002 17:28:22 UTC