RE: Prelim. DAV spec.

-----Original Message-----
From:	Daniel W. Connolly [SMTP:connolly@beach.w3.org]
Sent:	Saturday, October 26, 1996 12:16 PM
To:	Jim Whitehead
Cc:	w3c-dist-auth@w3.org
Subject:	Re: Prelim. DAV spec.


Great stuff... comments as I read it.

In message <9610251749.aa07659@paris.ics.uci.edu>, Jim Whitehead writes:
><draft-ietf-webdav-v1-spec-00>                         October 25, 1996

>1.2 Terminology
>
>Unless otherwise noted below, the use of terminology in this document is
>consistent with the definitions of terms given in [HTTP11].
>
>check in
>     A Check In is a declaration that the client no longer intends to edit a
>     representation(s).

Of "entity" and "representation": pick one and stick with it.

But in this case, I think you meant resource, i.e. the gizmo associated
with a URL.

No, we mean representation(s), meaning one or more representations of a
resource, where representation is defined as in HTTP 1.1.

representation
  An entity included with a response that is subject to content
  negotiation, as described in section 12. There may exist multiple
  representations associated with a particular response status.

The key here is that we are not just talking about an entity, we are 
talking about a content negotiated entity. I have removed all references to 
entity and replaced them with representation.

Of edit and update, pick one.

So I think it should say "... no longer intends to update a resource."

Update is a loaded term. I prefer edit. I have modified the draft to use 
edit.


>history
>     The history of a URI is a list of all the versions of the URI along
>     with related information.

"Versions of a URI"? URIs are immutable. They're just strings, kinda
like large integers.

I suggest:
	The history of a URI U* is a list of URIs Ui where each Ui refers
	to some version of the U* resource, plus some attributes related to
	each version.

This goes into the same bag as Destroy which is now defined as:
destroy
To destroy a resource is to request that the resource be permanently 
removed from storage. This differs from delete in that some versioning 
systems handle delete as a request to no longer make the specified resource 
editable.
I have also changed history to:
history
The history of a resource is a list of all the versions of the resource 
along with related information.
Of course each version is also a resource. However I don't want to define 
version as it would be nothing but an academic exercise.


>merge
>     A merge is the process whereby a resource represented by one URI is
>     combined with a resource represented by a second URI. Merges can occur
>     at the client or the server.

@@hmmm...

merge
A merge is the process whereby information from one or more resources is 
used to produce a new resource that represents the content of the component 
resources. Merges can occur at the client or the server.

BTW, I know you can nitpick the merge definition. Please don't. I really 
don't want to spend an hour finding just the right adjective.

>no-modify lock
>     A no-modify lock prevents a locked resource from being altered until
>     all no-modify locks are released.

s/altered/updated/. Be consistent.

Fair enough, all instances of "alter" have been changed to "edit".

>notify request
>     A notify request instructs the recipient to send update information
>     regarding the progress of a request.

If you use update in the sense that I'm suggesting, don't use it for
this purpose as well. I suggest: s/update/status/.

notify request
A notify request instructs the recipient to send information regarding the 
progress of a request.

>server diff
>     A server diff is a mechanism whereby the server compares two or more
>     representations, and sends the client a message containing a summary of
>     the differences between the entities.

s/representations/entities/.

server diff
A server diff is a mechanism whereby the server compares two or more 
representations and sends the client a message containing the differences. 


>2. Attributes
>
>It is often necessary to record meta data about a resource. The natural
>place to store such meta data is in HTTP headers however this presents a
>problem. HTTP headers, as defined in [HTTP11], are used either for content
>description or communication. While communication headers can be retrieved
>with the Option method, content description headers can not.

Content description headers can be accessed with HEAD.

Yes, but one can not specify which content headers will be transmitted.
Were we to use content headers for attributes and HEAD to discover them,
we would have to transmit all the content headers with each HEAD request.
As the number of attributes will be very large, and the size of each
attribute is likely to be large as well, this is not acceptable.

I have changed the paragraph to read:
It is often necessary to record meta data about a resource. The natural 
place to store such meta data is in HTTP headers however this presents a 
problem. HTTP headers, as defined in [HTTP11], are used either for content 
description or communication. Neither of these header types were designed 
for remote setting and on-demand retrieval.


> In addition
>neither of these headers were designed for remote setting and on-demand
>retrieval.

I disagree. Jigsaw supports setting and on-demand retrieval of
entity headers, for example.

How? Does it allow me to selectively specify which header I want? Will it 
let me perform a method on a header?

>What is needed is a third type of header, a header which provides
>information about the resource's nature, not its content or transmission
>state. This type of header will be referred to as an attribute header.

I suggest you merge the concept of attribute header into the
existing HTTP entity header concept.

That is unfortunately not practical. The semi-practical solution to the 
attributes problem is to define a new entity-header "attributes" whose 
value is a URL. That URL then points to a file which contains a whole list 
of attributes, each of which contains a value, which could potentially also 
be a URI. If I wanted to find out what the name of the author of a document 
is I would send a HEAD request, then do a GET on the attribute URI. I would 
then read the response and see if the Author attribute is present. If so I 
would either read the associated value or, if the value is a URI, do a GET 
on the URI to get the Author information. This means that every request
for attribute information would involve at least two HTTP requests and
would require downloading a potentially very large file. In all other
respects this new attribute mechanism is as powerful as the one currently
proposed. The problem is the two requests. That is too high an overhead
to pay.
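
To make the round-trip argument concrete, here is a minimal Python sketch
that counts requests for both approaches. Everything in it is an
assumption made for illustration: the in-memory fake server, the
example.org URLs, the "Attributes" header name and the helper functions
do not come from the draft.

```python
# Hypothetical in-memory "server": maps (method, url) to a response.
# The URLs, the "Attributes" header and the values are illustrative only.
FAKE_SERVER = {
    ("HEAD", "http://example.org/doc"): {
        "Attributes": "http://example.org/doc.attrs",
    },
    ("GET", "http://example.org/doc.attrs"): {
        "Author": "http://example.org/people/jane",
        "Title": "Example",
    },
    ("GET", "http://example.org/people/jane"): "Jane Doe",
    ("GET", "http://example.org/doc<Author>"): "Jane Doe",
}


class CountingClient:
    """Looks responses up in the fake server and counts each request."""

    def __init__(self, server):
        self.server = server
        self.requests = 0

    def request(self, method, url):
        self.requests += 1
        return self.server[(method, url)]


def author_via_attributes_header(client, url):
    """Indirect scheme: HEAD, then GET the attribute file, then maybe GET a URI."""
    headers = client.request("HEAD", url)
    attributes = client.request("GET", headers["Attributes"])
    value = attributes.get("Author")
    if value is not None and value.startswith("http://"):
        # The attribute value is itself a URI, so one more request is needed.
        value = client.request("GET", value)
    return value


def author_via_attribute_uri(client, url):
    """Proposed scheme: a single GET on the attribute URI."""
    return client.request("GET", url + "<Author>")
```

With this fake data the indirect scheme issues three requests and the
attribute URI issues one, which is the overhead being objected to.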

>2.1 Attribute Syntax
>
>To support requests for attributes the definition of a URI must be altered
>as follows:
>
>URI = ( absoluteURI | relativeURI ) ["<" Attribute ">"] ["#" fragment]
>Attribute = field-name ; See section 4.2 of [HTTP11]

Interesting... can attributes be links as well? In a draft I'm working on:

ABSOLUTELY!!! This has been one of Jim's basic requirements since day one. 
He wants N-Ary links. We have a small disagreement on syntax but we are in 
firm agreement regarding semantics.

	Describing and Linking Web Resources
		W3C Note @@Date
	This version:
		(unpublished)
		$Id: NOTE-link.html,v 1.5 1996/10/20 20:49:49 connolly Exp $
	Latest version:
		http://www.w3.org/pub/WWW/Architecture/NOTE-link

I use the notation R(A,B) to refer to any number of similar structures:

	link-relationship(source-anchor-address, target-anchor-address)
e.g.		stylesheet(http://www.w3.org/, http://www.w3.org/houststyle.css)
	header-field-name(URL, header-field-value)
e.g.		Expires(http://www.w3.org/TR/WD-xxx, "Wed Oct 19 15:34:54 1996")
	attribute-name(URL, attribute-value)
e.g.		hair-color(http://www.w3.org/People/Connolly/, "brown")
	two-place-predicate(arg1, arg2)
e.g.		member(http://www.w3.org/People/Connolly/, http://www.w3.org/People/)

This URI notation combines R and A into one. So I might denote the above 
examples by:
	http://www.w3.org/<stylesheet> = http://www.w3.org/houststyle.css
	http://www.w3.org/TR/WD-xxx<Expires> = "Wed Oct 19 15:34:54 1996"
	http://www.w3.org/People/Connolly/<hair-color> = "brown"
	http://www.w3.org/People/Connolly/<member> = http://www.w3.org/People/

The last example reminds me of an issue: Header fields and link
relationship are usually 1-1, but not always: the member relationship
is obviously many-many. I think HTTP header fields are multiply valued;
e.g. Accept: x,y is the same as Accept: x Accept: y. I can't think
of any multiply valued entity headers though.
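
As a rough illustration of the notation above, the following Python
sketch splits a string of the form URI ["<" Attribute ">"] ["#" fragment]
into its parts. The function name, the result type and the regular
expression are assumptions for illustration; in particular the pattern
accepts any run of characters other than "<", ">" and "#" as an attribute
name, which is looser than the HTTP field-name rule the draft refers to.

```python
import re
from typing import NamedTuple, Optional

# Loose pattern for: URI ["<" Attribute ">"] ["#" fragment]
# Assumption: base URI and attribute name contain no "<", ">" or "#".
_ATTRIBUTE_URI = re.compile(
    r"^(?P<base>[^<>#]*)(?:<(?P<attr>[^<>#]+)>)?(?:#(?P<frag>.*))?$"
)


class AttributeURI(NamedTuple):
    base: str
    attribute: Optional[str]
    fragment: Optional[str]


def parse_attribute_uri(text: str) -> AttributeURI:
    """Split an attribute URI into base URI, attribute name and fragment."""
    match = _ATTRIBUTE_URI.match(text)
    if match is None:
        raise ValueError(f"not a valid attribute URI: {text!r}")
    return AttributeURI(match.group("base"), match.group("attr"), match.group("frag"))


print(parse_attribute_uri("http://www.w3.org/<stylesheet>"))
print(parse_attribute_uri("http://www.w3.org/People/Connolly/<hair-color>"))
```

A string without an attribute part parses with attribute set to None, so
ordinary URIs pass through unchanged.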

This brings us to N-Ary Links. When one does a request on an attribute 
header the response is in the entity-body. As such the value can be 
ANYTHING. In the case of N-Ary links the response would be something like:
Random-Headers1 #(URI Random-HeadersN)
I say "like" because Jim and I still need to talk. The idea is that you 
would have a bunch of URIs, each of which would explain its particular 
relationship. If all the URIs have the same relationship then 
Random-Headers1 can be used to describe it. If one wants to specify special 
aspects of the N-Ary relationship then one can use Random-HeadersN. It is 
totally generic. Furthermore I added a paragraph explaining why I put in 
the pseudo-hierarchy:
By convention an attribute request which ends in a "_" and which does not 
resolve to a specific attribute name SHOULD be treated as a request for a 
list of all attributes in that hierarchy.

BTW, you should come out with a list of all the types of links you think we 
will need. That way we can include them with the document.


>Thus attributes can now be used in any context where a normal URI would be
>used.

Nifty. Hmmm... I wonder if this could be used as a general URL
syntactic composition mechanism. That issue was raised at the DOMC
workshop. (see http://www.w3.org/pub/WWW/OOP/).

I read the page but I'm not sure what you mean. So I'll bite... "What is a 
Syntactic Composition Mechanism?"


>When a resource is copied, moved, or otherwise manipulated, its attributes
>are equally effected.

s/effected/affected/.

Does this mean I have to give back my college degree? Oh wait, I'm an 
engineer, we aren't supposed to be able to write well. =)

>In order to prevent name space collisions both headers and header prefixes
>should be registered with a central authority.

:-{

We hate this. But I don't see any alternative in this case, at least right 
now.

Oh there is an alternative but you won't like it. GUIDs!!!!!!!! Now, 
doesn't IANA sound downright warm and cuddly? =)

>2.2 Standard Attributes
>
>AttributeDirectory
>     The attribute "AttributeDirectory" returns a list of all attribute
>     headers on a resource.

Subject to access control? Seems like it could be sensitive information.

But of course. All URLs are subject to access controls and 
attribute-headers are retrieved using URLs.

> To retrieve a list of attribute headers
>     associated with the URL http:\\foo\bar one would send a GET request

s,\\foo\bar,//foo/bar, you Windows freak! ;-)

Oh NO!!! You don't understand. I come from a long history of UNIX! I guess 
this means, yes.. I'm afraid it does. I HAVE BEEN ASSIMILATED!!!!!!!

>     with a request-URI of \bar<AttributeDirectory>, where Host would equal
>     foo.

>Link
>     This attribute header contains information about resources that are
>     associated with this resource. A SiteMap representation SHOULD be
>     available.
>     [TBD: Review the SiteMap format and figure out tag formats to define
>     source links.]

Hmmm... this answers my question above: in this design, links are
subordinate to attributes, rather than being at the same level. That rubs
me the wrong way.

For example, to get the stylesheet of a document at U, I'd rather GET
U<stylesheet> than GET U<link> and parse the results, looking for a
stylesheet link.

What stops you from using GET U<stylesheet>? It can either return a URL 
which points to the style sheet or it can return the stylesheet itself 
(depending upon how the attribute is defined). Attributes are links, links 
are attributes.

>Source
>     The exact contents of the resource as stored, without any processing by
>     the server (e.g. without processing of server-side includes).

This should return the address of the source, if you ask me. It's just a
typed link, ala stylesheet.

I agree:
Source
The URI of the resource as stored, without any processing by the server 
(e.g. without processing of server-side includes).


>3. Lock/Unlock
>
>Locks come in three types, write, read, and no-modify. Logically a write

s/types,/types:/

Fixed.

>lock and a read lock can co-exist on a single resource. This means that one
>set of clients can alter the resource and another set are the only ones
>allowed to read it. This may seem silly but is actually used in Orange book
>compliant environments. A write lock and a no-modify lock can not be used

Make "This may seem..." parenthetical. And do you have a citation for
the orange book?

Why should it be made parenthetical? Also the best reference I currently 
have for the Orange book is [ORANGE] DoD 5200.28-STD, "Department of 
Defense Trusted Computer System Evaluation Criteria", December, 1985. Do 
government standards have authors?


>Locks are assigned to a subset of the representations available for a
>resource. If the lock only applies to a single representation then the lock
>may be further restricted to only a particular range of the representation.

This implies that there are "representations" that both the client and
the server can refer to, but not by name. Seems to me that in the
same way you consider attributes addressable, "representations" should
be considered addressable. In fact, it seems to me that in the HTTP
spec, representations _are_ considered addressable.

For example, if /foo is available in GIF and PNG, and the server ever
needs to export that fact to the client (e.g. for caching purposes),
it should make up names for them (for use in the Alternates: header,
for example). By convention, the server would probably pick /foo.gif
and /foo.png.

The bottom line: I suggest
	A lock is assigned to a set of resources; one of the
	resources in the set is the _primary_ resource;
	the others are _alternates_. All resources in the set,
	including the primary, are _variants_ of the primary.
	In the case of zero alternates, the lock
	may be further restricted to only a particular range
	of the resource.

Using the above example, a lock could be assigned to the set:
	{/foo, /foo.gif, and /foo.png}

with /foo as the representative. (Note that /foo.gif could be used
as the representative as well. It doesn't matter.) This lock
couldn't be a byte-range lock. But one could assign a lock
to the set
	{/foo.gif}
and restrict that lock to a byterange. It's sensible (aka
well-defined) for a client to ask for a byterange lock on {/foo},
but depending on how a server manages the content, it might
result in a runtime error.

We already provide this type of semantics but instead of using the exported 
names we use the headers. I understand your point regarding exported 
headers but this isn't common practice. If it becomes common practice then 
the clients are free to discover the exported headers and place their locks 
on those. Until then we need to provide the content-negotiation mechanism.
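
The suggested wording (a lock on a set of resources with one primary, and
a byte range allowed only when there are zero alternates) could be
modelled roughly as below. This is a sketch of that suggestion only, not
of the header-based mechanism in the draft; the class name, field names
and the (first, last) byte-range tuple are assumptions.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Lock:
    """Hypothetical model of a lock assigned to a set of resources."""

    primary: str
    alternates: FrozenSet[str] = frozenset()
    # Assumed encoding of a range: (first_byte, last_byte).
    byte_range: Optional[Tuple[int, int]] = None

    def __post_init__(self):
        # A range restriction only makes sense for a single resource.
        if self.byte_range is not None and self.alternates:
            raise ValueError("a byte-range lock requires zero alternates")

    @property
    def variants(self) -> FrozenSet[str]:
        """All resources in the set, including the primary."""
        return frozenset({self.primary}) | self.alternates


whole = Lock("/foo", frozenset({"/foo.gif", "/foo.png"}))
partial = Lock("/foo.gif", byte_range=(0, 99))
print(sorted(whole.variants), partial.byte_range)
```

Asking for a byte range on a set with alternates raises ValueError here.
The runtime error mentioned above for a range lock on {/foo} depends on
how the server manages its content and is not modelled.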


>Locks may be taken out either in exclusive or shared mode.

Add "shared mode" to the terminology section.

SIR, YES SIR!
shared mode
Shared mode modifies a lock request such that the lock may be shared 
between multiple requestors.



> In shared mode
>anyone with proper access may take out a lock. In exclusive mode only the
>user(s)

What term is used in the HTTP spec for thingies that participate in
authentication?  They're called principals in a lot of the security
literature. But "user" is OK, as long as it's used consistently.

I don't like using the term user because sometimes the request may come 
from an automated process.

It seems like you use client sometimes and user others. The notion of
client is definitely different from the notion of user/principal:
consider the case of the AOL proxy: it's one client making requests on
behalf of zillions of users/principals.

Sigh... I will change user and client to principal if you provide a 
definition of principal for the terminology section.

> who originally took out the lock may alter the lock. However a new
>user can be added to an exclusive lock if the addition is performed by one
>of the lock's current owners.

The word "current" gives me the willies. Unless you're in the context
of discussing a particular timeframe, strike it.

I changed the language to "However a new user can be added to an exclusive 
lock if the holder of the lock token performs the addition."

>If an entire resource is write locked and a lock owner deletes the resource
>then the write lock remains. So long as the write lock remains the URI can
>not be reused.

"reused" in what sense? I think you can be more precise by saying "can
not be updated".

I changed it from reused to edited.

>Locks also have time outs associated with them. If no time out value is
>associated with a lock then the lock will never time out. Otherwise the lock
>will expire if a number of seconds equal to the time out value passes
>without the resource being accessed by a lock owner.

seconds by whose clock? We're in a distributed system here. If you stick
to the time-frame-of-reference of the server, I think you'll be safe.
The implication is that clients can't exactly determine when a lock
times out. But I think that's just a fact of life.

Yup.
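
A minimal Python sketch of the timeout rule as read here: the lock
expires once the timeout has elapsed, on the server's clock, without an
access by a lock owner, and a lock with no timeout never expires. The
class and method names are assumptions, and the injectable clock exists
only so the behaviour can be exercised without waiting.

```python
import time


class TimedLock:
    """Hypothetical server-side lock with an inactivity timeout."""

    def __init__(self, timeout_seconds=None, clock=time.monotonic):
        # timeout_seconds=None models a lock with no time out value.
        self.timeout_seconds = timeout_seconds
        self.clock = clock  # always the server's clock, never the client's
        self.last_access = clock()

    def touch(self):
        """Record an access to the resource by a lock owner."""
        self.last_access = self.clock()

    def expired(self):
        """True once the timeout has passed without an owner access."""
        if self.timeout_seconds is None:
            return False
        return self.clock() - self.last_access >= self.timeout_seconds


# Drive the lock with a fake clock instead of real time.
now = [0.0]
lock = TimedLock(timeout_seconds=10, clock=lambda: now[0])
now[0] = 9.0
lock.touch()          # owner access at t=9 restarts the countdown
now[0] = 18.0
print(lock.expired())
```

Because only the server's clock is consulted, a client can estimate but
not exactly determine when its lock lapses.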

>Finally, locks may be taken out for multiple clients in a single request.

That seems fuzzy to me: a request by definition involves exactly one
client. How do the other clients get involved? Or do you mean
multiple users?

>The Lock_Owners field allows for tokens to be used to identify multiple
>clients who are considered owners of the lock.

Ah... looks like you mean users. Be consistent.

As I said, I will fix client/user if you provide a definition of principal. 
=)

>Locks will be implemented using POST.

Nifty.

Finally, someone who agrees with this position. Please go talk to Roy 
Fielding for me.

>4. Name Space Manipulation
>
>4.1 Copy
>
>A copy performs a byte-for-byte duplication of a resource, making it
>available at both the original and new location in the URI namespace. There
>is no guarantee that the result of a GET on the URL of the resource copy
>will be identical to a GET on the original resource. For example, copying a
>script to a new location will often remove it from its intended environment,
>and cause it to either not work, or produce erroneous output.

It seems to me that a smart server could detect this sort of thing. If
so, should it
	(1) copy it anyway, cuz that's in the spirit of this spec, or
	(2) signal an error to the client that requested the copy,
		in order to prevent this sort of frotzed situation.

I think a smart server SHOULD do (2) if it can, and that we should give
that hint in the spec. I also think we should hint at whether a COPY
is intended to be a deep copy or a shallow copy, or explicitly say
that it may vary from resource to resource.

In short, I think the intent is to create a new resource that acts like
the old one, but doesn't share state with it (i.e. is updated independently).

Well (2) should be handled in the return message. I hereby nominate you to 
write that part of the spec, all in favor say "AYE". "AYE!" Well that takes 
care of that. As for deep copy, shallow copy, please be more specific. I am 
only familiar with that term in reference to recursive copying.

Gotta go... more comments when I get a chance to read the rest.

Dan

Promises promises... =)

Thanks for the great comments, even if it did take me two hours to make it 
through. Please also get me the rest of your comments ASAP. I would like 
them to be in the next release.

				Yaron

Received on Thursday, 31 October 1996 02:20:12 UTC