Date: Wed, 5 May 1999 10:11:26 -0400
Message-Id: <9905051411.AA07654@tantalum>
From: "Geoffrey M. Clemm" <gclemm@tantalum.atria.com>
To: sv@hunchuen.crystaliz.com
Cc: ietf-dav-versioning@w3.org
In-Reply-To: <007501be963d$36673ee0$d0acddcf@crystaliz.com>
Subject: Re: Repository -- do we need it?

   From: "Sankar Virdhagriswaran" <sv@hunchuen.crystaliz.com>

   > A workspace provides a "version-selection" mechanism that
   > maps from versioned-resources to revisions and working-resources. A
   > repository holds the versioned-resources, activities, configurations,
   > and workspaces.
   ...

   Ignoring versioned resources for the moment, I think of activities,
   configurations, and workspaces as 'namespace service providers'. They
   provide particular kinds of organization on top of 'data'.

I believe that "collections" are the only namespace service providers. Activities, configurations, and workspaces group things, but not for the purpose of giving them names.

   If we allow clients to access the repository through these 'namespace
   service providers', in an orthogonal fashion, then we may be able to
   achieve the orthogonality I was hoping for. For example, a sophisticated
   client can use the configuration namespace service provider and the
   activities namespace service provider to implement a change-set based
   consistency management system and can ignore workspaces.

The resources being defined are designed to be orthogonal, but not redundant. So activities, configurations, and workspaces are not different flavors of the same thing, but rather very different things that serve different purposes.

A workspace creates a namespace within which appropriate revisions appear with appropriate names.

An activity defines a "logical change" to a set of resources. It is semantically a set of change arcs, but for convenience we represent it as the set of revisions that are the destination of those arcs.

A configuration defines a set of revisions for historical recreation.
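
As a rough illustration of the three definitions above, here is a minimal Python sketch. Every class, field, and URL in it is hypothetical and chosen only for illustration; none of them comes from the draft protocol. It only shows that the three resources hold different things: a workspace holds a mapping used for version selection, an activity holds the destination revisions of its change arcs, and a configuration holds a fixed set of revisions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    resource: str  # URL of the versioned resource (hypothetical)
    rev_id: int    # position in that resource's history (hypothetical)


@dataclass
class Activity:
    """A logical change, stored as the destination revisions of its change arcs."""
    name: str
    revisions: set = field(default_factory=set)


@dataclass
class Configuration:
    """A fixed set of revisions, kept so that a past state can be recreated."""
    name: str
    revisions: frozenset = frozenset()


@dataclass
class Workspace:
    """Version selection: maps each versioned resource to one revision."""
    name: str
    selection: dict = field(default_factory=dict)

    def select(self, resource):
        return self.selection[resource]


# Two revisions of the same (hypothetical) versioned resource.
r1 = Revision("/src/main.c", 1)
r2 = Revision("/src/main.c", 2)

fix = Activity("fix-bug", {r2})                         # the change that produced r2
release = Configuration("release-1", frozenset({r1}))  # historical snapshot
ws = Workspace("dev-ws", {"/src/main.c": r2})          # what this client sees
```

In this sketch the same revision can appear in an activity, a configuration, and a workspace at once, which is one way to read "orthogonal, but not redundant": the objects overlap in what they refer to, not in what they are for.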
If you carefully limit the set of methods that each of these objects is required to perform, the server can choose efficient implementations of them. If you blur the distinction between them by requiring that they each perform the function that the others provide, a server can no longer perform the optimizations that are essential if this protocol is going to be useful for large-scale applications.

In particular, a "workspace" is currently proposed as the artifact the server can use to do efficient version selection on groups of resources. The fact that a client uses a workspace for a series of requests is the key characteristic that allows a server to cache information for re-use between requests. A server can choose to just implement a very lightweight workspace, and not bother to do any caching, but unless the protocol requires this workspace argument to operations, a server *cannot* effectively optimize for large-scale applications by caching information between requests.

   Similarly, (as Geoff mentions) a simple browsing client can just allow
   'managers' to browse the various namespaces for administrative purposes
   or build a way of navigating related changes (i.e., activities) and user
   tasks (i.e., workspaces) for what is often referred to as 'change
   tracking and management'.

Yes, the metadata being proposed (especially activities) were carefully designed to facilitate key out-of-scope tasks like "change tracking".

   NOTE: some argue that even versioned resources should be implemented as
   based on an 'artifactual' system.

What is "an implementation that is based on an artifactual system"?

   If that implementation strategy was chosen by some of the implementers,
   even versioned-resources data in the repository becomes a namespace.

Versioned resources have URLs, and revisions have URLs, so they are visible in the URL namespace. But this is a protocol commitment, not an implementation strategy. Probably I misunderstood your point.
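
The caching argument made earlier in this message (a workspace named on every request gives the server a key under which to keep version-selection results) can be sketched as follows. This is a toy model under stated assumptions, not a description of any real server: the `Server` class, its `get` method, the selection rules, and the URLs are all hypothetical.

```python
class Server:
    """Toy server that caches version selection per (workspace, resource)."""

    def __init__(self, histories, rules):
        self.histories = histories  # resource URL -> list of revision ids
        self.rules = rules          # workspace name -> selection rule (callable)
        self._cache = {}            # (workspace, resource) -> selected revision id
        self.evaluations = 0        # how often a selection rule actually ran

    def get(self, workspace, resource):
        key = (workspace, resource)
        if key not in self._cache:
            # Cache miss: evaluate the workspace's rule over the history.
            self.evaluations += 1
            self._cache[key] = self.rules[workspace](self.histories[resource])
        return self._cache[key]


# Hypothetical data: one resource with three revisions, two workspaces.
server = Server(
    histories={"/doc.txt": [1, 2, 3]},
    rules={"ws-latest": max, "ws-oldest": min},
)

first = server.get("ws-latest", "/doc.txt")   # rule evaluated
second = server.get("ws-latest", "/doc.txt")  # served from the cache
other = server.get("ws-oldest", "/doc.txt")   # different workspace, new evaluation
```

Without the workspace argument there would be no stable key for `_cache`, so each request would have to carry and re-evaluate its own selection rule. The sketch ignores cache invalidation when new revisions are created.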
   From watching the discussion in the DAV list I have a feeling that only
   a few implementers think this way about versioned resources, therefore I
   mention this point in a note rather than to substantiate the main point
   of supporting having a repository specification just based on
   orthogonality and extensibility.

Just to reinforce my earlier point, orthogonality does not imply that you can have just one without the other. Sleeping and eating are orthogonal issues, but that doesn't mean that you can choose to just sleep and never eat, or vice versa.

   I think that my proposal is different from Geoff's (am I right, Geoff?).
   I think he wanted to have a repository (as in a database) specified as
   part of the protocol for mostly 'administrative' uses. I think I am
   actually extending his idea to have the repository as a set of
   namespaces that can be used by DELTA-V clients in an orthogonal fashion
   to perform their function, not just use it for administrative UI
   purposes.

I'd have to see more specifically what "extending a repository to be a set of namespaces" means, but if it means that a repository performs all the functions of a workspace, a configuration, and an activity, then that would be a very different proposal.

   PS: Also, if advanced collections are specified correctly, these
   'namespace service providers' can be implemented using advanced
   collections. Our implementation is based on such an architecture.

We need to be a bit careful with the term "implementation". I believe you are talking about how the protocol is layered above existing protocols, and I definitely agree that collections (and especially, advanced collections) are the constructs that should be used to define the namespace protocol. But we need to be very careful to distinguish how the protocol is layered above existing protocols from how the protocol is implemented on a server.
In particular, the chance that a server designed to scale will be able to reuse its general collection implementation for a more sophisticated construct like a "configuration" is vanishingly small. This means that the server will need a separate configuration implementation, at which point any methods required by the collection protocol that are not really needed for the configuration protocol are actually a *burden* on the server implementor, not a benefit.

Cheers,
Geoff