Requirements for Collections

I am hoping to submit an internet draft on requirements for collections, to
ground the discussion at the LA IETF meeting.  Since the cutoff date for
internet drafts for that meeting is March 13, I'd like to have some
discussion of possible requirements on the mailing list before then.

To start that discussion, here is a re-send of the suggested requirements I
sent out last week, together with summaries of discussions I've had about
them with Jim Whitehead and Jim Davis.

The requirements are intended to cover both external membership in
collections and ordered collections.  I'm using different terminology here
from what we are accustomed to.  Instead of "internal members," I talk
about "direct members."  Instead of "external members," I talk about
"members-by-reference."  This is because the requirements stated here allow
a collection to contain a resource as a direct member, and also to contain
a reference to that same resource. 

MEMBERS-BY-REFERENCE

The point of supporting members-by-reference is to enable multiple
collections to share the same members:

	Without each collection having to keep a physical copy of the member
	With changes made in one place guaranteed to be seen everywhere

Principles and Requirements

1. A resource is a direct member of only one collection.  

2. A resource may be referenced by many collections.

3. Operating on a reference to a resource does not produce any changes in
the resource itself.

4. Operating on a resource does not produce any changes in any
references to it.  (Except that servers may, if they wish, maintain
referential integrity by updating references when their target resources
move or are deleted.)

5. Maintaining referential integrity is not required.

6. Preventing cycles is not required.

7. It is possible to add a member-by-reference to a collection.

8. It is possible to remove a member-by-reference from a collection.

9. It is possible for a member-by-reference to carry its own properties,
distinct from those of the resource it refers to.

Rationale: There are properties like "who added this resource to this
collection" and "when was this resource added to this collection" that
don't belong to the target resource, since it may be a member (by
reference) of many collections.  Instead, they belong to the
member-by-reference.

JIM WHITEHEAD: This requirement suggests a data model where a reference is
a special type of resource, where its state is the reference (the URL), but
which can have its own properties.

JUDY:  Yes.  This suggests that requirements 9 and 13 conflict.
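
A minimal sketch of the data model Jim describes, in Python.  The class
and field names (Resource, Reference, target_url) and the example URLs
and property names are illustrative assumptions, not proposed protocol
elements.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    url: str
    content: bytes = b""
    properties: dict = field(default_factory=dict)


@dataclass
class Reference:
    # The reference's state is just the URL of the resource it points to.
    target_url: str
    # Properties of the membership itself, kept apart from the target's.
    properties: dict = field(default_factory=dict)


article = Resource("http://example.com/library/article", b"text",
                   {"author": "A. Writer"})
ref_in_cs101 = Reference(article.url, {"added-by": "professor-1"})
ref_in_cs201 = Reference(article.url, {"added-by": "professor-2"})

# Setting a property on one reference touches neither the target
# resource nor the other reference.
ref_in_cs101.properties["added-on"] = "1998-02-20"
```

Both references point at the same URL, yet each carries its own
"added-by" value, which is the situation the rationale for 9 describes.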

10. A listing of the members of a collection should show both the direct
members and the members-by-reference of the collection.

11. For any collection member, it is possible to find out whether it is a
direct member or a member-by-reference.

12. It is possible for the same resource to be a member of a single
collection multiple times.

Rationale: The cases I can think of where this might be useful are all
cases of ordered collections.  So perhaps another way to supply the needed
functionality is to allow a resource to be a member of a single collection
only once, but allow it to appear multiple times in a single ordering of
the collection.  Anyhow, here's a case: A collection contains the readings
for a course.  The professor wants the students to read a particular
article once at the beginning of the semester, and then to re-read it at
the end.  Here's another: A collection contains the pages of a document,
one of which is a graph that needs to appear multiple times in the document.

13. Members-by-reference are not required to have names (URLs) relative to
the collection.

Rationale: Some legacy applications implement membership-by-reference
without assigning names to the references in the collection.

JIM WHITEHEAD:  Add the corollary requirement that members-by-reference may
have different names than the URL they reference (i.e., an in-collection
name, separate from the referred-to name).

14. It is possible for the same resource to be a member of a single
collection directly and by reference.

Rationale: See the rationale for 12.

15. By default, operations on members-by-reference affect the reference,
not the resource it refers to.  But wherever it makes sense, an option is
provided to let the client request that the operation apply to the target
resource.

Rationale: For simplicity and clarity, let the default behavior always be
the same.  For simplicity of implementation, let the default behavior be
NOT to resolve references.  In addition, there are some operations for
which resolving references does not make sense:  DELETE, COPY, MOVE.  So
let the default behavior be the one that makes sense for those operations.

(Alternatively, it's tempting to divide operations into three groups that
have different default behaviors.  For DELETE, MOVE, and COPY we never want
to resolve the references.  There's not even an option to resolve
references for them.  For listing / setting properties, and for retrieving
and updating content, most often the intent is probably to operate on the
target resource, so dereferencing might be the default behavior with an
option to operate on the reference instead.  For listing the members of a
collection and for LOCK, there's no clear reason to favor resolving
references or not, so maybe require the client to specify whether to
resolve references.)

JIM DAVIS: It might be best never to follow any references.  There will be
complexities around security if the target is on a different server.  Whose
credentials should be used in such a case?  Anyhow, if we allow references
to carry properties, it would be possible to store any properties likely to
be of interest to clients (including perhaps copies of the target
resource's properties) on the reference, so that the client would only have
to follow the reference to retrieve content.

JIM WHITEHEAD: Oy, this gets complicated.  Are there any general principles
which guide the case-by-case analysis?

Some might be:
- for operations with a source and a destination (MOVE, COPY), there is 
never an option to have the operation apply to the referred-to resource
- for operations which only have a single operand, there is an option to 
have the operation apply to the referred-to resource (your DELETE doesn't 
follow this, though, and LOCK is a different case as well, applying to the 
referred-to resource and the reference)

JUDY: Intuitively, I would expect DELETE, MOVE, and COPY not to touch the
referred-to resource.  I don't know why. Intuitively, I would expect GET,
PUT, PROPPATCH, and PROPFIND to affect the referred-to resource rather than
the reference.  I think this is because the whole reason for the reference
being in the collection is to give end users access to the referred-to
object through that collection.  End users will not want even to know
whether any particular resource is there by reference or directly.  Only
administrators care about that stuff. I don't know what I would expect for
LOCK.
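
To make the comparison easier, here is a rough Python sketch of
requirement 15 as stated (not of the alternatives discussed above).
The per-method entries follow the "Implications of 15" spelled out
later in this message.  The names RESOLVE_OPTION and affected are my
own assumptions, and UNLOCK is left out because it simply undoes
whatever the lock did.

```python
# Whether the "resolve the reference" option exists for each method,
# per the implications of requirement 15 listed in this message.
RESOLVE_OPTION = {
    "DELETE": False,
    "COPY": False,
    "MOVE": False,
    "PROPFIND": True,
    "PROPPATCH": True,
    "LOCK": True,
    "GET": True,
    "PUT": True,
}


def affected(method, resolve=False):
    """Return what a request on a member-by-reference operates on."""
    allowed = RESOLVE_OPTION[method]  # KeyError for methods not covered
    if not resolve:
        # The default is always the same: operate on the reference.
        return {"reference"}
    if not allowed:
        raise ValueError(method + " has no option to resolve references")
    if method == "LOCK":
        # LOCK is the exception: it locks the target IN ADDITION TO
        # the reference rather than instead of it.
        return {"reference", "target"}
    return {"target"}
```

For example, affected("GET") gives {"reference"}, while
affected("GET", resolve=True) gives {"target"}.  Reversing the default
for GET and PUT, as some of the comments suggest, would only change
the first return statement for those two methods.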

16. Given a reference in a collection, it must be possible to obtain the
URL of the resource it references.  

Rationale: This will allow clients to resolve references themselves in
order to operate on the resources they point to.

----------------- 

The implications of 15 for existing operations that interact with
members-by-reference in some way are:

	DELETE on a collection results in all its direct members being deleted,
and in all the references being deleted, but has no effect on the resources
referenced.  There is NO option to have the delete apply to the target
resources.

	DELETE on a member-by-reference deletes the reference, not the target
resource.  There is NO option to have the delete apply to the target resource.

	COPY on a collection results in all its direct members being copied, and
in all the references being copied, but has no effect on the resources
referenced.  There is NO option to have the copy apply to the target
resources.

	COPY on a member-by-reference copies the reference, not the target
resource.  There is NO option to have the copy apply to the target resource.

	MOVE on a collection results in all its direct members being moved, and in
all the references being moved, but has no effect on the resources
referenced.  There is NO option to have the move apply to the target
resources.

	MOVE on a member-by-reference moves the reference, not the target
resource.  There is NO option to have the move apply to the target resource.

	A PROPFIND, for whatever depth is specified, should treat direct members
and members-by-reference the same.  If Depth = 0, it shows properties only
of the collection.  If Depth = 1, it shows the properties of both direct
members and references that are children of the collection.  There IS an
option to show the properties of the target resources rather than of the
references.  If Depth = infinity, by default PROPFIND does not resolve
references; so the hierarchy it produces will not include referenced
collections and their children.  However, if the option to resolve
references is turned on, the hierarchy does include referenced collections
and their children, and the properties displayed are the properties of the
target resources rather than the properties of the references themselves.

	PROPPATCH on a member-by-reference by default modifies the properties of
the reference.  But an option is available to resolve the reference and
modify the properties of the resource it points to.

	LOCK with Depth = 0 on a collection prevents adding / removing any
members, direct or by-reference.  LOCK with Depth = infinity by default
locks the collection, all of its direct members, and all of the references
in it, recursively.  It does not affect the target resources that the
references point to.  There IS an option to resolve the references and
apply the lock to the resources they point to IN ADDITION TO LOCKING THE
REFERENCES THEMSELVES.  (This breaks the consistency that's been maintained
so far, where the option to resolve references causes the target resource
to be affected INSTEAD OF the reference.)

	UNLOCK does whatever it takes to undo the lock specified by the lock token.

	GET on a member-by-reference should return the content of the reference
(might be the URL of the target resource, depending on how
membership-by-reference gets implemented).  There IS an option to resolve
the reference and return the content of the target resource.

	PUT on a member-by-reference should replace the content of the reference.
There IS an option to resolve the reference and replace the content of the
target resource.

JIM WHITEHEAD:  For GET and PUT, my inclination is to reverse the
semantics, having the reference be followed by default.
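
The PROPFIND behavior described above for Depth = 0, 1, and infinity
can be sketched as a traversal.  This is only an illustration under
several assumptions of mine: everything lives in one in-memory store
keyed by URL, a reference is resolved one hop only, and the traversal
keeps a "seen" set so that cycles (which requirement 6 allows) do not
cause endless recursion.  The function name and the store layout are
hypothetical.

```python
def propfind_urls(store, url, depth, resolve=False, _seen=None):
    """Return the URLs whose properties a PROPFIND would report.

    depth is 0, 1, or the string "infinity".
    """
    seen = set() if _seen is None else _seen
    node = store[url]
    if resolve and node["kind"] == "reference":
        # Report the target resource instead of the reference.
        url = node["target"]
        node = store[url]
    if url in seen:
        return []  # already reported; also breaks reference cycles
    seen.add(url)
    result = [url]
    if node["kind"] == "collection" and depth != 0:
        child_depth = 0 if depth == 1 else depth
        for child in node["members"]:
            result += propfind_urls(store, child, child_depth, resolve, seen)
    return result


store = {
    "/c/": {"kind": "collection", "members": ["/c/a", "/c/ref"]},
    "/c/a": {"kind": "resource"},
    "/c/ref": {"kind": "reference", "target": "/d/"},
    "/d/": {"kind": "collection", "members": ["/d/b", "/d/back"]},
    "/d/b": {"kind": "resource"},
    "/d/back": {"kind": "reference", "target": "/c/"},  # forms a cycle
}

unresolved = propfind_urls(store, "/c/", "infinity")
resolved = propfind_urls(store, "/c/", "infinity", resolve=True)
```

Without the option, the hierarchy stops at the reference /c/ref.  With
it, the referenced collection /d/ and its children are included, and
the reference back to /c/ is skipped because /c/ was already reported.
One side effect of the "seen" set is that a resource reachable twice is
reported only once, which may or may not be what we want given
requirement 12.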

-------------------

ORDERED COLLECTIONS

JIM DAVIS:  We don't need ordered collections.  We already have properties,
which can have lists as values.  So clients can maintain any ordering they
need using DAV properties.

JUDY: DAV properties are certainly an obvious implementation of ordering.
But if we choose that approach, we need to define a well-known property
with standard syntax to represent orderings.  This is what will allow
different applications on different clients to understand and use each
others' orderings, and the server to use the ordering when responding to a
PROPFIND.

JIM WHITEHEAD:  You forgot to state your assumption that the server will
use the ordering of a collection when returning the members of a collection
with PROPFIND.

The collections specification will deal primarily with arbitrary orderings
that are not sorts based on property values.  The DASL specification will
presumably address sorting based on property values.  We should coordinate
with them to make sure that it is possible to save the result of "SELECT *
FROM /collection1/ ORDER BY author" as an ordering of the collection
/collection1/.

1. It is possible to order the members of a collection in an arbitrary way,
not based on property values.

Rationale:  Consider a collection of course readings for Computer Science
101.  Two different professors teach this course, and each prefers to have
the students do the readings in a different order.  This collection needs
two different orderings, neither based on any properties of the readings,
but just on what the professors think makes sense.

2. Direct members and members-by-reference may be intermixed in a single
ordering.

3. It is not required that all collection members be included in an ordering.

Rationale: In 1 above, one of the professors only assigns a subset of the
readings.  An alternative to this requirement would be to create two
separate collections (one of them has only members-by-reference), one of
which is a subset of the other.

4. It is possible to impose multiple orderings on the same collection.

Rationale: See 1 above.  An alternative to this requirement would be to
create two separate collections (one of them has only
members-by-reference), each with a single ordering.

5. A collection member may be included in an ordering more than once.

Rationale:  The professor may want the students to read an article early in
the course, and re-read it near the end.

6. Orderings are server-maintained, and cannot be directly accessed by clients.

JIM WHITEHEAD: It's not clear what this means.  It may just be trying to
exclude the client-maintained property implementation, but in that case we
need to understand the rationale.  It may be related to the expectation
that the server will use the ordering when responding to PROPFIND. It may
be stating that the server should maintain the integrity of the ordering,
e.g., by removing members which are deleted, adding members which are PUT,
COPY'ed, MOVE'ed, etc.

JUDY: I'm definitely assuming that the server will use a collection
ordering when responding to a PROPFIND on the collection.  But it can do
this even if the orderings are client-maintained properties, provided that
we standardize the name of the ordering property, and its syntax.  

I'm really just expressing a gut level anxiety about the integrity of the
orderings, and the feeling that it might be better not to allow clients to
manipulate the orderings directly.  On one extreme, we might force clients
to use methods to ask the server to change orderings.  On the other extreme
we might make orderings a dead property on collections, entirely
client-maintained.  In between, we might make orderings live properties, to
some extent maintained or enforced by the server.

JIM DAVIS: Having the server maintain the orderings would protect the
orderings a little, but not enough to be worth the added complexity to the
specification.  Keep it simple, and let the orderings be client-maintained.

7. It is possible to add a member (direct or by-reference) before or after
a given URI in an ordering.

8. It is possible to add a member (direct or by-reference) at a certain
sequential position in an ordering.
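
As a rough illustration of 7 and 8, here is a Python sketch that treats
an ordering as a plain list of member names.  The function names are
hypothetical, and the short strings stand in for member URIs.  Note
that when a member appears more than once (requirement 5), this sketch
anchors on its first occurrence, which is a simplification and not a
proposal.

```python
def add_at(ordering, position, member):
    """Requirement 8: add a member at a sequential position."""
    ordering.insert(position, member)


def add_before(ordering, anchor, member):
    """Requirement 7: add a member just before the anchor member."""
    # list.index raises ValueError if the anchor is not in the ordering.
    ordering.insert(ordering.index(anchor), member)


def add_after(ordering, anchor, member):
    """Requirement 7: add a member just after the anchor member."""
    ordering.insert(ordering.index(anchor) + 1, member)


readings = ["intro", "paper-a", "paper-b"]
add_after(readings, "paper-b", "intro")    # re-read at the end of the course
add_before(readings, "paper-a", "notes")
add_at(readings, 0, "syllabus")
```

Only names are moved around here.  No resource content is involved,
which is also the point of requirement 9 below.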

9. It is possible to modify an ordering without pulling any resources
through the client.

JIM WHITEHEAD: Explain what this is about.

JUDY: If ordering is a property editable by a client, there is no problem
here.  To change the order, you just do a PROPPATCH.  If ordering is a
read-only property or clients use methods to manipulate orderings, I'm
suggesting there needs to be a REORDER method.  What we don't want is for a
client to have to GET a resource, then DELETE it from the collection, then
PUT it to the collection in the desired position in order to change the
ordering.

10. Defined ordering schemas?  Or at least a standard for defining ordering
schemas?

JIM WHITEHEAD: Explain what this is about.

JUDY: I'm trying to capture something about making the semantics of any
ordering discoverable.  If somebody other than the creator of the
collection adds a member, it would be nice for her to be able to figure out
where it makes sense to put it in the collection's ordering(s).

ADDITIONAL REQUIREMENTS FROM JIM WHITEHEAD:

11. It is possible to reorder all members of a collection with a single
request.

12. By default, members added to a collection without specifying a position
are added to the end of the order.

13. Deleting a resource removes it from the order.  So a DELETE followed
by a PUT will not result in the same ordering as before the DELETE.

JUDY:  Note that 12 and 13 assume some amount of server maintenance of
orderings.
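
To show the kind of server maintenance 12 and 13 imply, here is a small
Python sketch.  The class OrderedCollection and its put and delete
methods are hypothetical stand-ins for whatever the server does on PUT
and DELETE.  Leaving the position unchanged when an existing member is
overwritten is my own assumption.

```python
class OrderedCollection:
    def __init__(self):
        self.order = []

    def put(self, name, position=None):
        if name in self.order:
            return  # assumption: overwriting keeps the existing position
        if position is None:
            # Requirement 12: no position given, so add to the end.
            self.order.append(name)
        else:
            self.order.insert(position, name)

    def delete(self, name):
        # Requirement 13: deleting removes every occurrence from the order.
        self.order = [m for m in self.order if m != name]


coll = OrderedCollection()
for name in ("a", "b", "c"):
    coll.put(name)
before = list(coll.order)

coll.delete("a")
coll.put("a")   # comes back at the end, not in its old position
after = list(coll.order)
```

Here "a" starts out first and ends up last, so the DELETE followed by
the PUT does not restore the original ordering.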

ISSUES FROM JIM WHITEHEAD:

Is ordering a quality of the collection, or of the resource?  For example,
if I lock a resource, but not its containing collection, can I change the
position of that resource in the collection's ordering?






Name:		Judith A. Slein
E-Mail:		slein@wrc.xerox.com
Phone:  	(716) 422-5169
Fax:		(716) 422-2938

Xerox Corporation
Mail Stop 105-50C
800 Phillips Road
Webster, NY 14580

Received on Friday, 20 February 1998 15:58:10 UTC