RE: DAV-Enabled field (was RE: A case for GETSRC) from Julian Reschke on 2002-03-06 (w3c-dist-auth@w3.org from January to March 2002)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Wed, 6 Mar 2002 13:46:45 +0100
To: "Eric Sedlar" <eric.sedlar@oracle.com>, <w3c-dist-auth@w3.org>
Message-ID: <JIEGINCHMLABHJBIGKBCCEKEECAA.julian.reschke@gmx.de>
Eric,

comments inline....

> > A conforming server will either report these properties as empty or as
> > non-existant (unless it's actually able to compute these values
> cheaply).
> >
>
> The point is that you will not be able to get <dav:getcontentlength> and
> <dav:getetag> that have the correct values, which the spec pretty clearly
> states should be returned if it is a DAV-enabled resource.

Sorry? If your output is dynamically generated, you won't be able to supply
a content-length header upon GET (or HEAD) either (unless you compute the
result first before sending the headers). So it's perfectly acceptable not
to return a DAV:getcontentlength for resources like these.

> > > (it would
> > > be "considered harmful" like PROPFIND depth infinity), and if
> you did it
> > > would be stupid.  I haven't checked this, but I bet that most
> > > servers return
> > > the length of the underlying script as the length of the
> > > generated output on
> >
> > IMHO this would be a bug.
> >
>
> Whatever.  They return something other than the correct value.

Eric, not returning a value if you don't have it (and won't have it upon
GET) is correct.

> > > a PROPFIND depth 1.  The same difficulties apply to
> <get-etag> and many
> > > other live properties.  The reason this doesn't matter in practice is
> that
> > > returning the wrong etag or wrong content length for
> generated output is
> > > irrelevant, since in the general case it cannot be cached,
> and it works
> >
> > Why do you say that generated output can not be cached? A big
> part of the
> > HTTP spec treats this very problem.
>
> I say that in the GENERAL case it cannot be cached.  For example, a script
> that returns the value of a row in a database table (say the current
> inventory total on Amazon.com of Harry Potter books) cannot be
> cached, since
> the return value is not simply a function of the input arguments in the
> request method.  Those generated output entities that are
> strictly functions
> of the input arguments can be cached, by using the techniques in HTTP/1.1.

Others can be cached as well. For instance, a GET on http://www.cnn.com
(clearly something that falls under the category you just mentioned) tells
me that I can cache the resullt for 60 seconds:

> wget -S www.cnn.com
--13:07:43--  http://www.cnn.com/
           => `index.html'
Connecting to www.cnn.com:80... connected!
HTTP request sent, awaiting response... 200 OK
2 Server: Netscape-Enterprise/4.1
3 Date: Wed, 06 Mar 2002 12:07:42 GMT
4 Last-modified: Wed, 06 Mar 2002 12:07:43 GMT
5 Expires: Wed, 06 Mar 2002 12:08:43 GMT
6 Cache-control: private,max-age=60
7 Content-type: text/html
8 Connection: close

> > I don't understand the analogy to POST. POST is very different
> from GET in
> > it's semantics (it's not just a distinction in marshalling).
> >
>
> Why is POST different in semantics from GET?  Since we are confining
> ourselves to discussing generated output, I see them as just having
> different ways of marshalling arguments.

GET gets a representation of the resource (without affecting it). POST
(possibly) modifies the resource.

> > > can make the
> > > virtual URL space return either the generated output, or the
> source, at
> > > least from an implementation point of view.  For example:
> > >
> > > Contents of /mydir:
> > >
> > >    foo.jsp
> > >    bar.html
> > >    bar.gif
> > >
> > > In case 1, let's say I make /scripts/src/<real-path> a
> virtual URL space
> > > that returns the source of any executables (for DAV clients), so where
> GET
> > > /mydir/foo.jsp will return the generated output,
> > > /scripts/src/mydir/foo.jsp
> > > will return the source.  dav:source on /mydir/foo.jsp will point to
> > > /scripts/src/mydir/foo.jsp.  The DAV client doesn't actually want to
> mess
> > > with /mydir/foo.jsp since it is generated, has bogus live property
> values,
> > > doesn't respond to things like PUT, and is generally not very
> > > helpful.  Now
> > > I'm forced into a separate tree, which is certainly annoying from a
> >
> > That's something the client should do for you.
> >
>
> It can't do this easily, because it doesn't know the general rule
> of mapping
> the generated output URL to the source, it just knows the answer
> on a URL by
> URL basis.

Maybe this indicates that the protocol should allow a server to say that all
source resources for the output resources in a collection sit in some
specific other collection. I admit that if this is the case and the server
indicates this, thing for clients would be much easier.

> > > In case 2, let's say I make /script/output/<real-path> a virtual URL
> space
> > > that returns the generated output for the executables.  Now the DAV
> client
> > > is happy, since the server space looks like the space on the client.
> > > Unfortunately, all of the <href> tags in any HTML output are
> > > screwed up vs.
> > > the local copy (e.g if bar.html wanted a link to foo.jsp, or vice
> versa),
> > > since neither a relative nor an absolute URL will work, and the
> > > URLs needed
> > > are server implementation-dependent.
> > >
>
> Julian, you have suggested that case #2 is the preferred solution
> (creating
> the virtual URL space for the generated resources), but you haven't
> addressed the linking problem between the static & generated content.  You
> have not given a complete suggestion.

I'm not sure that I understand the issue. If you're talking about broken
links -- can't you use relative URI references instead?

> > > The reality of the world is that people want to be able to move their
> > > content creation environment back and forth between local filesystems
> and
> > > DAV servers.  Any usage of dav:source would break that abstraction,
> since
> > > local filesystems aren't capable of implementing virtual URL
> spaces.  I
> >
> > Again, that's not true. You can author the *source* URL space just like
> > always. All you need the DAV:source property for is to discover where
> those
> > files sit.
> >
>
> Are you suggesting that when dragging & dropping some Web Folders
> down from
> a server, the client should examine the dav:source properties,
> and copy down
> the entities returned by GET on those URLs those represent,
> rather than the
> entities returned by the original URLs?  While this could be done, it
> certainly breaks backwards compatability with almost all existing DAV
> clients.  Plus, it could result in multiple copies of the source after the
> download.

I think this is what the spec says. I would expect that the result of
COPYing is the same as GETting and PUTting to a new location. I certainly
wouldn't expect to have the source code in one case and it's output in the
other case.

> > > think these are the practical reasons why nobody has implemented
> > > dav:source
> > > in the past 3 years.
> >
> > The DAV:source property is underspecified in the spec, that's the reason
> > it's not implemented. That doesn't make the concept itself bad.
>
> What part of dav:source is underspecified?  I think it is pretty
> clear from
> the spec.  It just doesn't solve the problem in a usable way.

See
<http://lists.w3.org/Archives/Public/w3c-dist-auth/2001OctDec/0119.html>.

> > > * separating the URL space of generated output and source causes
> usability
> > > problems, making life awkward either for the DAV client, or
> for the HTML
> > > linking
> >
> > I don't agree.
> >
>
> That's because you aren't considering backwards compatability.

With...?

> > > I think the argument for a GETSRC method becomes clear if you mentally
> > > rename GET to LIMITED-VERSION-OF-POST.
> >
> > GET and POST are fundamentelly different things.
> >
>
> Why are they so fundamentally different?

Because of their definition in the HTTP protocol?

> > >   Response:  Actually, RFC2616 says: "An HTTP/1.1 server
> SHOULD include
> a
> > > Vary header field with any cacheable response that is subject to
> > > server-driven negotiation."  So, clearly, Microsoft is in compliance
> with
> > > the HTTP specification since this is not a MUST requirement.
> >
> > SHOULD (as defined in RFC2119):
> >
> > "This word, or the adjective "RECOMMENDED", mean that there
> >    may exist valid reasons in particular circumstances to ignore a
> >    particular item, but the full implications must be understood and
> >    carefully weighed before choosing a different course."
> >
> > To still claim conformance, one would have to specify these
> valid reasons.
> > Can you think of any?
> >
>
> I'm sure MSFT could come up with some valid
> backwards-compatability reasons
> or something.  The point is that an HTTP/1.1 client cannot expect anything
> with SHOULD to always be the case

Which means that implementing Translate without setting the Translate header
*will* break caching in proxies.


> > > Here's another way to look at the categorization of response entities
> for
> > > executables:
> > >
> > > 1) the source
> > > 2) generated output that does not vary based on input arguments
> > > from client
> > > 3) generated output varying with input arguments from client
> encoded in
> > > query string
> > > 4) generated output varying with input arguments from client
> encoded in
> > > request headers
> > > 5) generated output varying with server-side state (like
> information in
> a
> > > database)
> > >
> > > I don't see why case #2 and case #3 (in Julian's example) should
> > > be awarded
> > > its own status as a "real dav resource" when cases #4 and #5 are not.
> >
> > Because resources are *identified* by their URLs. If the URL is
> the same,
> > it's *by definition* the same resource.
> >
>
> Now you are arguing circularly.  You say that because the source and
> generated output are different resources, they should have different URLs.

Yes.

> Now you are saying that because they have different URLs, they
> are different
> resources.

Did I say that?

- Resources (RFC 2616):

"A network data object or service that can be identified by a URI, as
defined in section 3.2. Resources may be
available in multiple representations (e.g. multiple languages, data
formats, size, and resolutions) or vary in
other ways."

- URI (RFC 2616)

"As far as HTTP is concerned, Uniform Resource Identifiers are simply
formatted strings which identify--via name, location,
or any other characteristic--a resource."


So if two things have the same URI, they *must* be the same resource (by RFC
terminology). If the URIs differ, they *may* be different resources.

> My line of argument is that having a separate URL space for
> generated output
> >from the underlying executable doesn't work well because:
>
>   * the generated "resources" won't have the correct values for many live
> properties (other than dav:source!!!), and will probably (your words) not
> support dead properties

Not having values is something different from not having "correct" ones. I
simply don't see a problem here.

>   * the generated "resources" will only respond to the GET
> method, not other
> WebDAV methods

Define "respond". They may not *implement* or *support* some other methods,
right.

>   * creating a virtual URL space for generated outputs (as you suggest)
> makes the <href> linking unworkable when doing the 1-1 sitecopy semantics
> between local filesystems & servers that is common practice today, thus
> making the separate URL space idea unusable if backwards compatability is
> maintained.

I think it is *also* common practice to replicate the source URL spaces (and
not to touch the output URL spaces at all). In which case there's no
problem.

>   * just because the response body is different based on the Translate:
> header isn't really a different argument than having a different response
> body by varying the authentication header.  By this argument,
> each possible
> response body that can be altered by varying a header is a different
> resource, which clearly does not make sense.

I didn't say that. I asked the question: "do you consider the (ASP) source
code and it's output the same resource or not"? If they are, they can share
the same URL. If they aren't, they MUST not. HTTP (RFC2616) says they
aren't.

Varying the entity returned based on the time of day, the logged in user,
it's accept-language header and similar things is something completely
different from just returning the source code.

> dav:source is an implementation suggestion that requires separate URL
> spaces.  GETSRC/Translate are various ways of handling the
> generated output
> as the same URL.  My argument is against the separate URL space, not the
> dav:source header as specified by RFC2518, which most people seem to agree
> is broken (you believe it is underspecified, and I believe it is the wrong
> architecture).
>
> Julian, fundamentally your response is self-contradictory.  At one point,
> you say:
>
> 1) "the set of outputs (that can be generated by a GET on a URL) is a
> different resource than the source resource".

Yes.

> but later you say that:
>
> 2) "Because resources are *identified* by their URLs. If the URL is the
> same, it's *by definition* the same resource".
>
> If a set of generated outputs is generated differently by varying
> parameters
> in the query string, statement #1 would imply that the entire set of valid
> outputs is one resource.  Statement #2 implies that this set is a bunch of

No, it doesn't. If you vary parameters in the query string, you are using
*different* URLs. Statement 1 talks about the set of outputs generated by
GETs on a (*specific*) URL.

> different resources, since their URL is different.  You are using a
> different number of resources per executable at different points
> to reply to
> different points in my argument.  I'm saying there is one
> resource, used for
> the script & generated output.  At first you are proposing 2, and
> later you
> propose n, where n is the number of possible combinations of legal input
> querystring values.  Which is it?

No, I didn't.

What I'm saying is that:

- A URL identifies a resource. GETs on that URL return different
representations of the same resource.

- If the URLs differ (by varying the query string), they *may* identify
different resources (but they don't have to). For instance, I'd consider

	http://server/foo?test=1 and http://server/foo?test=2 different resources,
but

	http://server/foo?a=1&b=2 and http://server/foo?b=2&a=1 the same resource
(but to do this, you'll have to have special knowledge about the treatment
of query parameters)


> Julian's other argument is that you might want to have existing clients be
> able to edit the source just by pointing to some URL.  Well, without
> understanding dav:source they won't know what the URL is to use
> for the next
> GET method.  So, clearly clients will need to be upgraded either to handle
> Translate:/GETSRC or to handle the dav:source header.  Existing clients
> won't cut it, so I don't by this reasoning, anyway.

Yes, clients and servers will need to be upgraded. Did I say otherwise?

Julian
Received on Wednesday, 6 March 2002 07:47:19 UTC