Date: Wed, 5 May 1999 22:53:35 -0400
Message-Id: <9905060253.AA07899@tantalum>
From: "Geoffrey M. Clemm" <gclemm@tantalum.atria.com>
To: sv@hunchuen.crystaliz.com
Cc: ietf-dav-versioning@w3.org
In-Reply-To: <004e01be971d$ab25a790$d0acddcf@crystaliz.com>
Subject: Re: Repository -- do we need it?

   From: "Sankar Virdhagriswaran" <sv@hunchuen.crystaliz.com>

   > I believe that "collections" are the only namespace service providers.
   > Activities, configurations, and workspaces group things, but not for
   > the purpose of giving them names.

   Why not? (see below where I expand on the concept of my twist on
   repository)

Collections, activities, configurations, and workspaces all refer to
the same set of revisions. If more than one of them tries to give the
same revision a name, it is inevitable that the name one of them gives
it will conflict with the name another gives it. Which one wins?

In contrast, if only one resource type is responsible for naming
(i.e. the collection), then it is unambiguous what the name is, and
where to get the name from. You go to the collection containing a
resource, and that tells you one segment of the name. You go to the
collection containing that collection, and that gives you the
preceding segment, and so on until you get to the root of the URL
tree.

   ... However, for those cases where this type of caching will
   actually create problems and for those cases where the workspace
   concept actually gets in the way of designing appropriate clients,
   I would like to have the ability to not use workspaces to perform
   this kind of selection.

Can you describe situations where caching would create problems, and
situations where the workspace concept would get in the way of
designing appropriate clients?

A workspace in this context is just a resource that can contain a
"revision selection rule".
The only difference between not using a workspace and using a
workspace is that in the former case you pass the
revision-selection-rule in a header, while in the latter case you
PROPPATCH the revision-selection-rule into a workspace and then pass
the name of the workspace in the header. Other than occasionally
doing two method calls instead of one, how does this get in the way
of designing a client?

   > What is "an implementation that is based on an artifactual system"?

   Basically, each new state of any versioned entity gets a new,
   globally unique id. In other words, each state is considered to be
   an 'artifact' (in the archeological sense of that word) - i.e., it
   is immutable. Version names, etc. are all overlaid on top of this
   unique-id scheme as a 'namespace'.

This is certainly an implementation that will be supported by the
protocol (since many of us use exactly that implementation, or one
very like it). If it isn't, please do let us know so that we can fix
it!

   > I'd have to see more specifically what "extending a repository to
   > be a set of namespaces" means, but if it means that a repository performs
   > all the functions of a workspace, a configuration, and an activity, then
   > that would be a very different proposal.

   Yup ;-). Just to explain this notion of a collection of namespaces
   and the notion of a namespace service provider, let me use the
   example of the Java Naming and Directory Interface (JNDI). JNDI
   provides a common API for querying and navigating different
   (typically graph-structured) namespaces, which can be federated.
   JNDI also has a way of plugging in particular implementations of
   this API as 'service providers'. Service providers have been
   implemented for navigating file systems, CORBA name space service
   objects, LDAP directories, etc. One can imagine implementing such a
   service provider for configurations, activities, and workspaces.
   In particular, workspaces are similar to LDAP service providers
   because LDAP service providers actually have to support
   sophisticated querying on the data they maintain.

Actually, I believe it is a "collection" that is equivalent to a JNDI
directory, not a workspace. A workspace is just a mechanism for
mapping a versioned-resource to a revision or working-resource of that
versioned-resource. A server uses the workspace in conjunction with
an initial versioned-resource to successively map segments of a
hierarchical name to versioned-collection revisions and then finally
to a versioned-resource revision. So it is not the workspace, but
rather the versioned-collection revisions that provide the name
mapping. The workspace just provides the version-selection service.

   I have not been tracking the advanced collections spec. development
   as closely as you have been. So, maybe I am way off the mark.
   However, I thought the advanced collections spec. as it evolves
   looked more and more like JNDI (given the discussion about
   resources and how resources are actually mapped to different
   things, and given the discussion about bind and unbind). So, I was
   making the suggestion about 'implementation' in the sense of JNDI
   service providers.

Yes, advanced collections are very much like JNDI directories.
(Although interestingly enough, JNDI only lets you apply properties to
the directories, not to non-directory resources registered in those
directories).

   That is, in my mind, the advanced collection spec. would provide a
   general set of protocols to create/modify/delete collections and to
   navigate them. One could then 'implement service providers' that
   implement this protocol for the different types of specific
   collections we care about (compound document collections,
   configurations, activities, etc.). Hope this helps in clarifying.
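
The name-resolution walk described above (a workspace selecting a
revision at each step, with versioned-collection revisions supplying
the names) can be sketched roughly as follows. This is only an
illustrative model, not anything defined by the protocol: the class
names Revision, VersionedResource, and Workspace, the resolve
function, and the idea of a selection rule as a plain callable are all
assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Revision:
    """One immutable state of a versioned resource (hypothetical model)."""
    label: str
    content: Optional[str] = None
    # Only collection revisions have members: segment name -> versioned resource.
    members: Dict[str, "VersionedResource"] = field(default_factory=dict)


@dataclass
class VersionedResource:
    """A versioned resource, modeled here as an ordered list of revisions."""
    revisions: List[Revision]


# A revision selection rule picks one revision of a versioned resource.
SelectionRule = Callable[[VersionedResource], Revision]


@dataclass
class Workspace:
    """Holds only a selection rule; it assigns no names itself."""
    rule: SelectionRule

    def select(self, resource: VersionedResource) -> Revision:
        return self.rule(resource)


def resolve(workspace: Workspace, root: VersionedResource, path: str) -> Revision:
    """Map each path segment through the selected collection revision."""
    revision = workspace.select(root)
    for segment in filter(None, path.split("/")):
        # The collection revision supplies the name binding,
        child = revision.members[segment]
        # and the workspace only chooses which revision of the child to use.
        revision = workspace.select(child)
    return revision


# Usage: the same name resolves to different revisions under different rules.
doc = VersionedResource([Revision("doc-r1", "old"), Revision("doc-r2", "new")])
src = VersionedResource([Revision("src-r1", members={"doc.txt": doc})])
root = VersionedResource([Revision("root-r1", members={"src": src})])
latest = Workspace(lambda r: r.revisions[-1])
first = Workspace(lambda r: r.revisions[0])
```

In this sketch, swapping the workspace changes which revision is
reached but never changes the names, which are held entirely by the
collection revisions.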

Yes, for resources for which "add-member", "move-member",
"delete-member", and "share-member" all are required, using the
advanced collection protocol is the only sensible thing to do. But my
position is that these methods should *not* be required for specifying
the revisions selected by a configuration, so modeling a configuration
as a collection of revisions would not allow the server to make the
implementation choices essential for efficient configuration creation.

Let's take a specific example. Suppose I were a branch-based server
implementor. One of the most efficient ways to implement a "snapshot"
operation is to just store the time-of-day and current branch as
immutable properties of a snapshot resource. This works great if the
only operation I need to support is "snapshot the contents of this
workspace" (I can get the current branch from the workspace RSR, and
the time-of-day from a system clock). But if I need to support
"add/move/delete" configuration member, I'm out of luck. I don't want
to be out of luck (:-).

   > This means that the
   > server will need a separate configuration implementation, at which
   > point any methods required by the collection protocol that are
   > not really needed for the configuration protocol are actually a
   > *burden* on the server implementor, not a benefit.

   With this I don't agree. In general what you say is true -
   generality has burdens on specific implementations.

I only care about specific cases where it would cause a problem. In
general, I support the re-use of the collection protocol wherever
possible. (That's the basis for my "property-collection" proposal,
i.e. whenever you have a property whose value acts like a collection,
just use the collection protocol as the means to update it.)

   However, there are advantages to going with the approach I am
   proposing. Client writers have to learn only one 'regime'. We found
   this to be very useful when we did our implementation.
   Our client writers (who were not as sophisticated as our server
   writers) had to learn only one way of creating/modifying/navigating
   different namespaces. Also, APIs such as JNDI, the CORBA
   collections spec., and the Java 1.2 (i.e., Java 2) collections API
   actually show different ways of achieving our objectives.

Yes, that all sounds right to me.

   Imagine the other case. I need to educate our client writers with
   the basic DAV protocol methods, the advanced collections protocol
   methods, the configuration (collection) protocol methods, the
   activities (collection) protocol methods, the workspace
   (collections) protocol methods (and DASL). Folks won't be able to
   swallow all the subtle differences between each of these methods.

Yes, if there were a complex set of methods associated with each of
these, that would be a mess for exactly the reasons you state. But
I'm just concerned with one specific case (i.e. a configuration) for
which supporting the collection protocol would make certain desirable
implementations infeasible.

   Even after implementing a system such as the one that we are
   developing in DAV, I (personally) cannot keep all the different
   variations in my mind. This is partly due to terminology and
   partly due to spending only part time on the WEB-DAV activity.
   Still, I hope you see my point.

Yes, I completely agree with you in principle, i.e. layer above
existing protocol whenever possible. It's only for very specific
cases (or here, just one very specific case) where a particular
existing protocol is inappropriate for a particular resource type.

   thanks for listening

As always, thank *you* for your time and interest!

Cheers,
Geoff
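
The branch-based "snapshot" example discussed earlier in this message
can be illustrated with a minimal sketch. It assumes a hypothetical
in-memory store in which each resource has a time-ordered list of
revisions per branch; the names Snapshot, BranchStore, check_in, and
select are invented for the illustration and do not come from any
server or from the protocol drafts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Snapshot:
    """A configuration stored as two immutable properties, not a member list."""
    branch: str
    taken_at: float  # time-of-day of the snapshot, in seconds


class BranchStore:
    """Hypothetical branch-based store: revisions kept per (resource, branch)."""

    def __init__(self) -> None:
        self._history: Dict[Tuple[str, str], List[Tuple[float, str]]] = {}

    def check_in(self, resource: str, branch: str, when: float, revision: str) -> None:
        # Assumes check-ins arrive in increasing time order on each branch.
        self._history.setdefault((resource, branch), []).append((when, revision))

    def snapshot(self, branch: str, now: float) -> Snapshot:
        # Constant-time: nothing is copied and no member list is built.
        return Snapshot(branch=branch, taken_at=now)

    def select(self, snapshot: Snapshot, resource: str) -> Optional[str]:
        """Return the latest revision on the snapshot's branch at or before its time."""
        selected = None
        for when, revision in self._history.get((resource, snapshot.branch), []):
            if when <= snapshot.taken_at:
                selected = revision
        return selected


# Usage: a later check-in does not change what the earlier snapshot selects.
store = BranchStore()
store.check_in("/src/a.c", "main", 1.0, "a.c@1")
snap = store.snapshot("main", 2.0)
store.check_in("/src/a.c", "main", 3.0, "a.c@2")
```

Because the snapshot in this sketch holds no explicit member list,
there is nothing for an add/move/delete member operation to act on,
which is the difficulty the example in the message points at.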