- From: Jim Whitehead <ejw@cse.ucsc.edu>
- Date: Wed, 31 Oct 2001 11:53:41 -0800
- To: "WebDAV" <w3c-dist-auth@w3.org>
Accidentally caught by the spam filter. I have added "Kantz@wegalink.de" to the accept2 list, so future emails from this address will go straight through.

- Jim

-----Original Message-----
From: Eckhard Kantz [mailto:Kantz@wegalink.de]
Sent: Wednesday, October 31, 2001 5:15 AM
To: WebDAV
Subject: [Moderator Action] Re: Ideas: GETSRC & MULTIPUT

...sorry for the length of my comment:

RFC 2518 is most likely the biggest step so far towards balancing the data transfer between Internet clients and servers, which has flowed mainly from servers to clients ever since HTTP was introduced. This is great progress. The only detail that seems to be missing is how to get access to the source of synthetic web pages, so that all the needs that exist in this context can be satisfied.

The more and the longer I try to understand the basic problem behind this discussion, the more I come to the conclusion that this cannot be the whole truth, and that it might be necessary to look at it from a higher position, maybe the well-known 60,000-foot view. From there it becomes obvious that each URL is actually understood as a source of information, as a unique resource, and finally as a unique resource locator that is used to retrieve all of that information. In this sense everything worked fine in the past, since a server had full control over the information returned for a request against a URL in the server's namespace. The only influence a client had on the presentation of the output was, at most, to specify some parameters that the server could use or ignore.

This has changed dramatically with WebDAV. Now the client is said to be the supervisor: the client should be able to specify exactly what the server returns in response to a request. The basic expectation is symmetry, that is, at least everything that can be PUT to the server can also be GET from the server (a short sketch of this round trip follows below). Unfortunately, there is only this one URL for a given collection (!!) of information, which is usually presented to an end user by a browser application. The question is how to control the different outputs at the protocol level.

Actually, what output concepts are needed?

(1) The conventional GET output, which nowadays is most often dynamic
(2) The one script that is responsible for generating the dynamic output, in case this is the only source file
(3) A collection of multiple source components (script, database cells, maybe also included pictures, audio, video, ...)

Additionally one could think of the following:

(4) Generate output in a certain format that the client would like to work on (e.g. html, doc, pdf, txt, ...), in case the server is capable of presenting the information in that format, either from preprocessed files or by converting files on the fly
(5) Generate output dependent on the capabilities of the presentation device (screen 1024x768, screen 320x200, color/black-and-white, WAP, ...)
(6) Generate output dependent on time, in order to get past presentations of a web page (e.g. stock market, sports results, weather forecast, ...), which should not be a major problem for any server that normally keeps those past data in its database over a certain period of time anyway

It seems that all of those wishes could be fulfilled immediately if a given URL could explicitly be opened as a collection rather than as a file.
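To make the symmetry expectation mentioned above concrete, here is a minimal sketch in Python, assuming a hypothetical WebDAV server at dav.example.com; the path and content are illustrative only. What is PUT to a URL should come back on a GET of the same URL, and for a dynamically generated page this is exactly where the round trip breaks today, which is what the collection view proposed here is meant to address.

import http.client

conn = http.client.HTTPConnection("dav.example.com")

# PUT an authored document to the server (path and body are illustrative).
body = b"<html><body>Hello, WebDAV</body></html>"
conn.request("PUT", "/foo/bar", body=body,
             headers={"Content-Type": "text/html"})
put_resp = conn.getresponse()
put_resp.read()                     # drain the body so the connection can be reused
print("PUT:", put_resp.status)      # e.g. 201 Created or 204 No Content

# Symmetry expectation: a GET of the same URL should return what was PUT.
# For a dynamically generated page the GET returns the rendered output
# instead of the source, which is the gap discussed in this mail.
conn.request("GET", "/foo/bar")
get_resp = conn.getresponse()
print("GET:", get_resp.status, get_resp.read()[:40])
conn.close()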
In that case, depending on what the server is capable of delivering to clients, the output could be as follows:

GET /foo/bar -> would return the dynamic output as usual

<get-as-collection> /foo/bar -> would show all components that the server is willing to expose to clients, e.g.:

/foo/bar/jsp-script -> the Java Server Pages script that generates /foo/bar
/foo/bar/doc -> the content of /foo/bar as an MS Word document
/foo/bar/pdf -> the content of /foo/bar as a PDF document
/foo/bar/database/field1 -> the content of database field 1 that is used to generate /foo/bar dynamically
/foo/bar/database/field2 -> the content of database field 2 that is used to generate /foo/bar dynamically
/foo/bar/logo.jpg -> the logo.jpg picture that is contained in the /foo/bar page
/foo/bar/welcome.wav -> the sound file that is played when the page is opened in a browser
/foo/bar/wap -> the content of /foo/bar, but customized to the capabilities of a WAP display
/foo/bar/2001-10-30 -> the content of /foo/bar, but generated with past data from October 30th
/foo/bar/2001-10-30/13:05 -> the content of /foo/bar, but generated with past data from October 30th at 1:05 p.m.
/foo/bar/2001-10-30/13:05/pdf -> the content of /foo/bar, but generated with past data from October 30th at 1:05 p.m. and converted to a PDF file

All of this would be feasible with the current 2518 if there were a possibility to open a URL explicitly as a collection rather than as a file. In that case all other methods like PUT, PROPPATCH, ... would work as usual, since separate URLs are available for each component of a dynamically generated HTML page. Maybe some of those components do not make sense to PUT back to the server, e.g. historical data or an output format that was generated on the fly.

Fortunately, 2518 offers the PROPFIND method to obtain all resource names of a collection as well as their properties. The only thing to do would be to let servers expose dynamically generated URLs for all components of a URL during a PROPFIND with depth=1, and to handle PUT and PROPPATCH against those dynamic URLs (a rough client-side sketch of such an exchange follows below).

Actually, I am not sure whether this is the best solution one could think of. It should not be overlooked that the URL namespace is used in two ways. However, since the server controls the namespace from its root downwards anyway, this might be acceptable. On the other hand, there might be better proposals that lead to the desired goal of reading and writing the components of a web resource in a specifiable format.

Eckhard
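The PROPFIND/PUT exchange sketched in the mail, expressed as Python client code against the same hypothetical dav.example.com server: the PROPFIND with Depth: 1 and the PUT are ordinary RFC 2518 requests; the only assumed, non-standard part is the proposed server behaviour of listing dynamically generated component URLs (such as /foo/bar/jsp-script) as members of the collection and accepting writes against them.

import http.client
import xml.etree.ElementTree as ET

conn = http.client.HTTPConnection("dav.example.com")

# PROPFIND with Depth: 1 -- a standard RFC 2518 request; the proposal is
# only that the server include dynamically generated members such as
# /foo/bar/jsp-script or /foo/bar/database/field1 in the Multi-Status body.
propfind_body = (b'<?xml version="1.0" encoding="utf-8"?>'
                 b'<D:propfind xmlns:D="DAV:"><D:allprop/></D:propfind>')
conn.request("PROPFIND", "/foo/bar", body=propfind_body,
             headers={"Depth": "1", "Content-Type": "text/xml"})
resp = conn.getresponse()
multistatus = resp.read()           # 207 Multi-Status, one <D:response> per member
for href in ET.fromstring(multistatus).iter("{DAV:}href"):
    print(href.text)                # e.g. /foo/bar/jsp-script, /foo/bar/pdf, ...

# PUT against one of the dynamic member URLs -- again an ordinary PUT;
# only the server-side mapping back to the underlying script is new.
# The payload is purely illustrative.
script = b'<%@ page contentType="text/html" %>\n<html>...</html>\n'
conn.request("PUT", "/foo/bar/jsp-script", body=script,
             headers={"Content-Type": "text/plain"})
put_resp = conn.getresponse()
put_resp.read()
print(put_resp.status)              # e.g. 204 No Content if the server accepts the edit
conn.close()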
Received on Wednesday, 31 October 2001 14:58:26 UTC