# RE: Prelim. DAV spec.

From: Yaron Goland <yarong@microsoft.com>
Date: Wed, 30 Oct 1996 23:20:12 -0800
Message-ID: <c=US%a=_%p=msft%l=RED-44-MSG-961031072012Z-5867@mail4.microsoft.com>
To: "'Daniel W. Connolly'" <connolly@beach.w3.org>, "'Jim Whitehead'" <ejw@rome.ICS.UCI.EDU>



-----Original Message-----
From:	Daniel W. Connolly [SMTP:connolly@beach.w3.org]
Sent:	Saturday, October 26, 1996 12:16 PM
Cc:	w3c-dist-auth@w3.org
Subject:	Re: Prelim. DAV spec.

In message <9610251749.aa07659@paris.ics.uci.edu>, Jim Whitehead writes:
><draft-ietf-webdav-v1-spec-00>                         October 25, 1996

>1.2 Terminology
>
>Unless otherwise noted below, the use of terminology in this document is
>consistent with the definitions of terms given in [HTTP11].
>
>check in
>     A Check In is a declaration that the client no longer intends to edit
a
>     representation(s).

Of "entity" and "representation": pick one and stick with it.

But in this case, I think you meant resource, i.e. the gizmo associated
with a URL.

No we mean representation(s), meaning one or more than one representations
of a resource. Where representation is defined as in HTTP 1.1.

representation
An entity included with a response that is subject to content
negotiation, as described in section 12. There may exist multiple
representations associated with a particular response status.

The key here is that we are not just talking about an entity, we are
talking about a content negotiated entity. I have removed all references to
entity and replaced them with representation.

Of edit and update, pick one.

So I think it should say "... no longer intends to update a resource."

Update is a loaded term. I prefer edit. I have modified the draft to use
edit.

>history
>     The history of a URI is a list of all the versions of the URI along
>     with related information.

"Versions of a URI"? URIs are immutable. They're just strings, kinda
like large integers.

I suggest:
The history of a URI U* is a list of URIs Ui where each Ui refers
to some version of the U* resource, plus some attribues related to
each version.

This goes into the same bag as Destroy which is now defined as:
destroy
To destroy a resource is to request that the resource be permanently
removed from storage. This differs from delete in that some versioning
systems handle delete as a request to no longer make the specified resource
editable.
I have also changed history to:
history
The history of a resource is a list of all the versions of the resource
along with related information.
Of course each version is also a resource. However I don't want to define
version as it would be nothing but an academic exercise.

>merge
>     A merge is the process whereby a resource represented by one URI is
>     combined with a resource represented by a second URI. Merges can
occur
>     at the client or the server.

@@hmmm...

merge
A merge is the process whereby information from one or more resources is
used to produce a new resource that represents the content of the component
resources. Merges can occur at the client or the server.

BTW, I know you can nit pick the merge definition. Please don't. I really
don't want to spend an hour finding just the right adjective.

>no-modify lock
>     A no-modify lock prevents a locked resource from being altered until
>     all no-modify locks are released.

s/altered/updated/. Be consistent.

Fair enough, all the "alter" have been changed to "edit".

>notify request
>     A notify request instructs the recipient to send update information
>     regarding the progress of a request.

If you use update in the sense that I'm suggesting, don't use it for
this purpose as well. I suggest: s/update/status/.

notify request
A notify request instructs the recipient to send information regarding the
progress of a request.

>server diff
>     A server diff is a mechanism whereby the server compares two or more
>     representations, and sends the client a message containing a summary
of
>     the differences between the entities.

s/representations/entities/.

server diff
A server diff is a mechanism whereby the server compares two or more
representations and sends the client a message containing the differences.

>2. Attributes
>
>It is often necessary to record meta data about a resource. The natural
>place to store such meta data is in HTTP headers however this presents a
>problem. HTTP headers, as defined in [HTTP11], are used either for
content
>description or communication. While communication headers can be
retrieved
>with the Option method, content description headers can not.

Yes, but one can not specify which content headers will be transmitted.
Where we to use content headers for attributes and were we to use HEAD to
discover them, then we would have to transmit all the content headers with
each HEAD request. As the number of attributes will be very large and as
the size of each attribute is likely to also be large, this is not
acceptable.

I have changed the paragraph to read:
It is often necessary to record meta data about a resource. The natural
place to store such meta data is in HTTP headers however this presents a
problem. HTTP headers, as defined in [HTTP11], are used either for content
description or communication. Neither of these header types were designed
for remote setting and on-demand retrieval.

>neither of these headers were designed for remote setting and on-demand
>retrieval.

I disagree. Jigsaw supports setting and on-demand retrieval of

How? Does it allow me to selectively specify which header I want? Will it
let me perform a method on a header?

>What is needed is a third type of header, a header which provides
>information about the resource's nature, not its content or transmission
>state. This type of header will be referred to as an attribute header.

I suggest you merget the concept of attribute header into the

That is unfortunately not practical. The semi-practical solution to the
attributes problem is to define a new entity-header "attributes" whose
value is a URL. That URL then points to a file which contains a whole list
of attributes, each of which contains a value, which could potentially also
be a URI. If I wanted to find out what the name of the author of a document
is I would send a HEAD request, then do a GET on the attribute URI. I would
then read the response and see if the Author attribute is present. If so I
would either read the associated value or, if the value is a URI, do a GET
on the URI to get the Author information. This means that at a minimum
every request for attribute information would involve at least two HTTP
file. In all other respects this new attribute means is as powerful as the
one currently proposed. The problem is the two requests. That is too high

>2.1 Attribute Syntax
>
>To support requests for attributes the definition of a URI must be
altered
>as follows:
>
>URI = ( absoluteURI | relativeURI ) ["<" Attribute ">"] ["#" fragment]
>Attribute = field-name ; See section 4.2 of [HTTP11]

Interesting... can attributes be links as well? In a draft I'm working on:

ABSOLUTELY!!! This has been one of Jim's basic requirements since day one.
He wants N-Ary links. We have a small disagreement on syntax but we are in
firm agreement regarding semantics.

W3C Note @@Date
This version:
(unpublished)
$Id: NOTE-link.html,v 1.5 1996/10/20 20:49:49 connolly Exp$

I use the notation R(A,B) to refer to any number of similar structures:

e.g.		stylesheet(http://www.w3.org/, http://www.w3.org/houststyle.css)
e.g.		Expires(http://www.w3.org/TR/WD-xxx, "Wed Oct 19 15:34:54 1996")
attribute-name(URL, attribute-value)
e.g.		hair-color(http://www.w3.org/People/Connolly/, "brown")
two-place-predicate(arg1, arg2)
e.g.		member(http://www.w3.org/People/Connolly/,
http://www.w3.org/People/)

This URI notation combines R and A into one. So I might denote the above
examples by:
http://www.w3.org/<stylesheet> = http://www.w3.org/houststyle.css
http://www.w3.org/TR/WD-xxx<Expires> = "Wed Oct 19 15:34:54 1996"
http://www.w3.org/People/Connolly/<hair-color> = "brown"
http://www.w3.org/People/Connolly/<member> = http://www.w3.org/People/

The last example reminds me of an issue: Header fields and link
relationship are usually 1-1, but not always: the member relationship
is obviously many-many. I think HTTP header fields are multiply valued;
e.g. Accept: x,y is the same as Accept: x Accept: y. I can't think
of any multiply valued entity headers though.

This brings us to N-Ary Links. When one does a request on an attribute
header the response is in the entity-body. As such the value can be
ANYTHING. In the case of N-Ary links the response would be something like:
I say "like" because Jim and I still need to talk. The idea is that you
would have a bunch of URIs, each of which would explain its particular
relationship. If all the URIs have the same relationship then
Random-Headers1 can be used to describe it. If one wants to specify special
aspects of the N-Ary relationship then one can use Random-HeadersN. It is
totally generic. Furthermore I added a paragraph explaining why I put in
the pseudo-hierarchy:
By convention an attribute request which ends in a "_" and which does not
resolve to a specific attribute name SHOULD be treated as a request for a
list of all attributes in that hierarchy.

BTW, you should come out with a list of all the types of links you think we
will need. That way we can include them with the document.

>Thus attributes can now be used in any context where a normal URI would
be
>used.

Nifty. Hmmm... I wonder if this could be used as a general URL
syntactic composition mechanism. That issue was raised at the DOMC
workshop. (see http://www.w3.org/pub/WWW/OOP/).

I read the page but I'm not sure what you mean. So I'll bite... "What is a
Syntactic Composition Mechanism?"

>When a resource is copied, moved, or otherwise manipulated, its
attributes
>are equally effected.

s/effected/affected/.

Does this mean I have to give back my college degree? Oh wait, I'm an
engineer, we aren't supposed to be able to write well. =)

prefixes
>should be registered with a central authority.

:-{

We hate this. But I don't see any alternative in this case, at least right
now.

Oh there is an alternative but you won't like it. GUIDs!!!!!!!! Now,
doesn't IANA sound down right warm and cuddly? =)

>2.2 Standard Attributes
>
>AttributeDirectory
>     The attribute "AttributeDirectory" returns a list of all attribute

Subject to access control? Seems like it could be sensitive information.

But of course. All URLs are subject to access controls and

> To retrieve a list of attribute headers
>     associated with the URL http:\\foo\bar one would send a GET request

s,\\foo\bar,//foo/bar, you Windows freak! ;-)

Oh NO!!! You don't understand. I come from a long history of UNIX! I guess
this means, yes.. I'm afraid it does. I HAVE BEEN ASSIMILATED!!!!!!!

>     with a request-URI of \bar<AttributeDirectory>, where Host would
equal
>     foo.

>     associated with this resource. A SiteMap representation SHOULD be
>     available.
>     [TBD: Review the SiteMap format and figure out tag formats to define

subordinate
to attributes, rather than being at the same level. That rubs me the wrong
way.

For example, to get the stylesheet of a document at U, I'd rather GET
U<stylesheet>
than GET U<link> and parse the results, looking fro a stylesheet link.

IWhat stops you from using GET U<stylesheet>? It can either return a URL
which points to the style sheet or it can return the stylesheet itself
are attributes.

>Source
>     The exact contents of the resource as stored, without any processing
by
>     the server (e.g. without processing of server-side includes).

This should return the address of the source, if you ask me. It's just a
typed

I agree:
Source
The URI of the resource as stored, without any processing by the server
(e.g. without processing of server-side includes).

>3. Lock/Unlock
>
>Locks come in three types, write, read, and no-modify. Logically a write

s/types,/types:/

Fixed.

>lock and a read lock can co-exist on a single resource. This means that
one
>set of clients can alter the resource and another set are the only ones
>allowed to read it. This may seem silly but is actually used in Orange
book
>compliant environments. A write lock and a no-modify lock can not be used

Make "This may seem..." parenthetical. And do you have a citation for
the orange book?

Why should it be made parenthetical? Also the best reference I currently
have for the Orange book is [ORANGE] DoD 5200.28-STD, "Department of
Defense Trusted Computer System Evaluation Criteria", December, 1985. Do
government standards have authors?

>Locks are assigned to a subset of the representations available for a
>resource. If the lock only applies to a single representation then the
lock
>may be further restricted to only a particular range of the
representation.

This implies that there are "representations" that both the client and
the server can refer to, but not by name. Seems to me that in the
same way you consider attributes addressable, "representations" should
be considered addressable. In fact, it seems to me that in the HTTP

For example, if /foo is available in GIF and PNG, and the server ever
needs to export that fact to the client (e.g. for caching purposes),
it should make up names for them (for use in the Alternates: header,
for example). By convention, the server would probably pick /foo.gif
and /foo.png.

The bottom line: I suggest
A lock is assigned to a set of resources; one of the
resources in the set is the _primary_ resource;
the others are _alternates_. All resources in the set,
including the primary, are _variants_ of the primary.
In the case of zero alternates, the lock
may be further restricted to only a particular range
of the resource.

Using the above example, a lock could be assigned to the set:
{/foo, /foo.gif, and /foo.png}

with /foo as the representative. (Note that /foo.gif could be used
as the representative as well. It doesn't matter.) This lock
couldn't be a byte-range lock. But one could assign a lock
to the set
{/foo.gif}
and restrict that lock to a byterange. It's sensible (aka
well-defined) for a client to ask for a byterange lock on {/foo},
but depending on how a server manages the content, it might
result in a runtime error.

We already provide this type of semantics but instead of using the exported
headers but this isn't common practice. If it becomes common practice then
the clients are free to discover the exported headers and place their locks
on those. Until then we need to provide the content-negotation mechanism.

>Locks may be taken out either in exclusive or shared mode.

Add "shared mode" to the terminology section.

SIR, YES SIR!
shared mode
Shared mode modifies a lock request such that the lock may be shared
between multiple requestors.

> In shared mode
>anyone with proper access may take out a lock. In exclusive mode only the
>user(s)

What term is used in the HTTP spec for thingies that participate in
authentication?  They're called principals in a lot of the security
literature. But "user" is OK, as long as it's used consistently.

I don't like using the term user because sometimes the request may come
from an automated process.

It seems like you use client sometimes and user others. The notion of
client is definitely different from the notion of user/principal:
consider the case of the AOL proxy: it's one client making requests on
behalf of zillions of users/principals.

Sigh... I will change user and client to principal if you provide a
definition of principal for the terminology section.

> who originally took out the lock may alter the lock. However a new
>user can be added to an exclusive lock if the addition is performed by
one
>of the locks current owners.

The word "current" gives me the willies. Unless you're in the context
of discussing a particular timeframe, strike it.

I changed the language to "However a new user can be added to an exclusive
lock if the holder of the lock token performs the addition."

>If an entire resource is write locked and a lock owner deletes the
resource
>then the write lock remains. So long as the write lock remains the URI
can
>not be reused.

"reused" in what sense? I think you can be more precise by saying "can
not be updated".

I changed it from reused to edited.

>Locks also have time outs associated with them. If no time out value is
>associated with a lock then the lock will never time out. Otherwise the
lock
>will expire if a number of seconds equal to the time out value passes
>without the resource being accessed by a lock owner.

seconds by whos clock? We're in a distributed system here. If you stick
to the time-frame-of-reference of the server, I think you'll be safe.
The implication is that clients can't exactly determine when a lock
times out. But I think that's just a fact of life.

Yup.

>Finally, locks may be taken out for multiple clients in a single request.

That seems fuzzy to me: a request by definition involves exactly one
client. How do the other clients get involved? Or do you mean
multiple users?

>The Lock_Owners field allows for tokens to be used to identify multiple
>clients who are considered owners of the lock.

Ah... looks like you mean users. Be consistent.

As I said, I will fix client/user if you provide a definition of principal.
=)

>Locks will be implemented using POST.

Nifty.

Finally, someone who agrees with this position. Please go talk to Roy
Fielding for me.

>4. Name Space Manipulation
>
>4.1 Copy
>
>A copy performs a byte-for-byte duplication of a resource, making it
>available at both the original and new location in the URI namespace.
There
>is no guarantee that the result of a GET on the URL of the resource copy
>will be identical to a GET on the original resource. For example, copying
a
>script to a new location will often remove it from its intended
environment,
>and cause it to either not work, or produce erroneous output.

It seems to me that a smart server could detect this sort of thing. If
so, should it
(1) copy it anyway, cuz that's in the spirit of this spec, or
(2) signal an error to the client that requested the copy,
in order to prevent this sort of frotzed situation.

I think a smart server SHOULD to (2) if it can, and that we should give
that hint in the spec. I also think we should hint at whether a COPY
is indented to be a deep copy or a shallow copy, or explicitly say
that it may vary from resource to resource.

In short, I think the intent is to create a new resource that acts like
the old one, but doesn't share state with it (i.e. is updated
independently).

Well (2) should be handled in the return message. I hearby nominate you to
write that part of the spec, all in favor say "AYE". "AYE!" Well that takes
care of that. As for deep copy, shallow copy, please be more specific. I am
only familiar with that term in reference to recursive copying.

Gotta go... more comments when I get a chance to read the rest.

Dan

Promises promises... =)

Thanks for the great comments, even if it did take me two hours to make it