Requirements for External Members and Ordered Collections

I'd like to take up Jim's excellent suggestion that we think about
requirements for the collections specification, intended to cover both
external membership in collections and ordered collections.  Here are some
suggestions for requirements, divided into two groups.  The first group
concerns external members of collections.  The second group concerns
ordered collections.

I'm using different terminology here from what we are used to.  Instead of
"internal members," I talk about "direct members."  Instead of "external
members," I talk about "members-by-reference."  This is because the
requirements stated here allow a collection to contain a resource as a
direct member, and also to contain a reference to that same resource. 

MEMBERS-BY-REFERENCE

The point of supporting members-by-reference is to enable multiple
collections to share the same members

	Without each having to keep a physical copy of the member
	Changes in one place are guaranteed to be seen everywhere

Principles and Requirements

1. A resource is a direct member of only one collection.  

2. A resource may be referenced by many collections.

3. Operating on a reference to a resource does not produce any changes in
the resource itself.

4. Operating on a resource has does not produce any changes in any
references to it.  (Except that servers may, if they wish, maintain
referential integrity by updating references when their target resources
move or are deleted.)

5. Maintaining referential integrity is not required.

6. Preventing cycles is not required.

7. It is possible to add a member-by-reference to a collection.

8. It is possible to remove a member-by-reference from a collection.

9. It is possible for a member-by-reference to carry its own properties,
distinct from those of the resource it refers to.

Rationale: There are properties like "who added this resource to this
collection" and "when was this resource added to this collection" that
don't belong to the target resource, since it may be a member (by
reference) of many collections.  Instead, they belong to the
member-by-reference.

10. A listing of the members of a collection should show both the direct
members and the members-by-reference of the collection.

11. For any collection member, it is possible to find out whether it is a
direct member or a member-by-reference.

12. It is possible for the same resource to be a member of a single
collection multiple times.

Rationale: The cases I can think of where this might be useful are all
cases of ordered collections.  So perhaps another way to supply the needed
functionality is to allow a resource to be a member of a single collection
only once, but allow it to appear multiple times in a single ordering of
the collection.  Anyhow, here's a case: A collection contains the readings
for a course.  The professor wants the students to read a particular
article once at the beginning of the semester, and then to re-read it at
the end.  Here's another: A collection contains the pages of a document,
one of which is a graph that needs to appear multiple times in the document.

13. Members by reference are not required to have names (URLs) relative to
the collection.

Rationale: Legacy applications that implement membership-by-reference
without assigning names to the references in the collection.

14. It is possible for the same resource to be a member of a single
collection directly and by reference.

Rationale: See the rationale for 12.

15. By default, operations on members-by-reference affect the reference,
not the resource it refers to.  But wherever it makes sense, an option is
provided to let the client request that the operation apply to the target
resource.

Rationale: For simplicity and clarity, let the default behavior always be
the same.  For simplicity of implementation, let the default behavior be
NOT to resolve references.  In addition, there are some operations for
which resolving references does not make sense:  DELETE, COPY, MOVE.  So
let the default behavior be the one that makes sense for those operations.

(Alternatively, it's tempting to divide operations into three groups that
have different default behaviors.  For DELETE, MOVE, and COPY we never want
to resolve the references.  There's not even an option to resolve
references for them.  For listing / setting properties, and for retrieving
and updating content, most often the intent is probably to operate on the
target resource, so dereferencing might be the default behavior with an
option to operate on the reference instead.  For listing the members of a
collection and for LOCK, there's no clear reason to favor resolving
references or not, so maybe require the client to specify whether to
resolve references.)

16. Given a reference in a collection, it must be possible to obtain the
URL of the resource it references.  

Rationale: This will allow clients to resolve references themselves in
order to operate on the resources they point to.

----------------- 

Implications of 15, applied to existing operations that interact with
members-by-reference in some way are:

	DELETE on a collection results in all its direct members being deleted,
and in all the references being deleted, but has no effect on the resources
referenced.  There is NO option to have the delete apply to the target
resources.

	DELETE on a member-by-reference deletes the reference, not the target
resource.  There is NO option to have the delete apply to the target resource.

	COPY on a collection results in all its direct members being copied, and
in all the references being copied, but has no effect on the resources
referenced.  There is NO option to have the copy apply to the target
resources.

	COPY on a member-by-reference copies the reference, not the target
resource.  There is NO option to have the copy apply to the target resource.

	MOVE on a collection results in all its direct members being moved, and in
all the references being moved, but has no effect on the resources
referenced.  There is NO option to have the move apply to the target
resources.

	MOVE on a member-by-reference moves the reference, not the target
resource.  There is NO option to have the move apply to the target resource.

	A PROPFIND, for whatever depth is specified, should treat direct members
and members-by-reference the same.  If Depth = 0, it shows properties only
of the collection.  If Depth = 1, it shows the properties of both direct
members and references that are children of the collection.  There IS an
option to show the properties of the target resources rather than of the
references.  If Depth = infinity, by default PROPFIND does not resolve
references; so the hierarchy it produces will not include referenced
collections and their children.  However, if the option to resolve
references is turned on, the hierarchy does include referenced collections
and their children, and the properties displayed are the properties of the
target resources rather than the properties of the references themselves.

	PROPPATCH on a member by reference by default modifies the properties of
the reference.  But an option is available to resolve the reference and
modify the properties of the resource it points to.

	LOCK with Depth = 0 on a collection prevents adding / removing any
members, direct or by-reference.  LOCK with Depth = infinity by default
locks the collection, all of its direct members, and all of the references
in it, recursively.  It does not affect the target resources that the
references point to.  There IS an option to resolve the references and
apply the lock to the resources they point to IN ADDITION TO LOCKING THE
REFERENCES THEMSELVES.  (This breaks the consistency that's been maintained
so far, where the option to resolve references causes the target resource
to be affected INSTEAD OF the reference.)

	UNLOCK

	GET on a member-by-reference should return the content of the reference
(might be the URL of the target resource, depending on how
membership-by-reference gets implemented).  There IS an option to resolve
the reference and return the content of the target resource.

	PUT on a member-by-reference should replace the content of the reference.
There IS an option to resolve the reference and replace the content of the
target resource.

-------------------

ORDERED COLLECTIONS

Deal primarily with arbitrary orderings that are not sorts based on
property values.  The DASL specification will presumably address sorting
based on property values.  We should coordinate with them to make sure that
it is possible to save the result of "SELECT * FROM /collection1/ ORDER BY
author" as an ordering of the collection /collection1/.

1. It is possible to order the members of a collection in an arbitrary way,
not based on property values.

Rationale:  Consider a collection of course readings for Computer Science
101.  Two different professors teach this course, and each prefers to have
the students do the readings in a different order.  This collection needs
two different orderings, neither based on any properties of the readings,
but just on what the professors think makes sense.

2. Internal and external members may be intermixed in a single ordering

3. It is not required that all collection members be included in an ordering

Rationale: In 1 above, one of the professors only assigns a subset of the
readings.  An alternative to this requirement would be to create two
separate collections (one of them has only members-by-reference), one of
which is a subset of the other.

4. It is possible to impose multiple orderings of the same collection

Rationale: See 1 above.  An alternative to this requirement would be to
create two separate collections (one of them has only
members-by-reference), each with a single ordering.

5. A collection member may be included in an ordering more than once

Rationale:  The professor may want the students to read an article early in
the course, and re-read it near the end.

6. Orderings are server-maintained, and cannot be directly accessed by clients

7. It is possible to add (internal or external) member before / after a
given URI in an ordering

8. It is possible to add (internal or external) member at a certain
sequential position in an ordering

9. It is possible to modify an ordering without pulling any resources
through the client.

10. Defined ordering schemas?  Or at least a standard for defining ordering
schemas?



Name:		Judith A. Slein
E-Mail:		slein@wrc.xerox.com
Phone:  	(716) 422-5169
Fax:		(716) 422-2938

Xerox Corporation
Mail Stop 105-50C
800 Phillips Road
Webster, NY 14580

Received on Thursday, 12 February 1998 17:31:50 UTC