Comments on Requirements (was -- Re: Mtg. follow-up) from Judith Slein on 1996-10-04 (w3c-dist-auth@w3.org from October to December 1996)

From: Judith Slein <slein@wrc.xerox.com>
Date: Fri, 4 Oct 1996 12:20:44 PDT
To: Jim Whitehead <ejw@ics.uci.edu>
Cc: w3c-dist-auth@w3.org
Message-Id: <2.2.32.19961004192044.0035e64c@pop-server.wrc.xerox.com>

Here are the more detailed remarks that I promised you, on the requirements
from a document management point of view:

The Introduction to the requirements paper talks about interoperability of
tools.  To me this means I want my client application to be able to express
its requests to any vendor's Document Management Service on the Web in the
same way and expect the requests to be understood.  Similarly, the responses
coming back should be expressed in the same way.  This is not the case
today, where every vendor provides a Web interface, but they all use
different HTTP methods, URL decoration, etc., to achieve the same things.
So in some cases what's needed is recommendations for common practices,
guidelines for how to use HTTP for certain purposes.  In other cases the
functionality just isn't in HTTP, and needs to be added.

A different sort of goal I have for the Web is that it should be able to
provide a lot of the functionality that currently can only be provided using
commercial Document Management Services.  For example, let's provide
containers directly on the Web, and then people won't have to buy Documentum
or PCDocs or anything else to get that functionality.

There are some areas that I think are still missing from the requirements,
primarily containment and search.  

For search, I think what needs to be in HTTP is a standard way of
communicating a query and a result set.  Others will deal with such issues
as standard queryable attributes, standardizing search engines, etc.  But we
also need to be aware of each other's work.

As for containment, you may have been thinking that requirements 12 and 13
about URL hierarchies cover this, but I really think containers are
something different.  Containers are a way of grouping documents logically,
completely independent of where they are stored.  (I know, URLs are a step
above physical storage location, but not as far as I want to be.)  So for
example, a corporation may have several research labs.  Each lab wants to
keep control of its own papers, and so keeps them on its own Web server.
However, they all want a consolidated library of papers for browsing and
searching.  So they create a Materials Science collection that contains
(references) materials from all the labs, and a Digital Imaging collection
that contains (references) materials from all the labs, etc.  Each of these
collections can be browsed and searched independently, and users need not
know anything about where the papers actually reside.  The same paper may
appear in several collections.  It needs to be possible to create and delete
collections, add objects to collections and remove objects from collections,
view the contents of a collection, nest collections inside each other, and
set authorizations at the collection level.  In short, all the same operations
as are needed for URL hierarchies -- it's just that I think containers are
farther removed from physical storage than URLs are.
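
As a rough illustration of the container operations listed above, here is a
minimal Python sketch.  It is not a proposal for how HTTP should express
them.  The class name Collection, the "authorized" set of user names, and the
example.com URLs are all assumptions made up for the example; members are
held by reference (URL), so the papers themselves stay on each lab's server.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """A logical grouping of references, independent of storage location."""
    name: str
    members: list = field(default_factory=list)   # URLs or nested Collections
    authorized: set = field(default_factory=set)  # hypothetical: users who may edit

    def _check(self, user):
        # Authorization is set at the collection level, not per document.
        if user not in self.authorized:
            raise PermissionError(f"{user} may not modify {self.name}")

    def add(self, member, user):
        """Add a URL or a nested collection (by reference)."""
        self._check(user)
        if member not in self.members:
            self.members.append(member)

    def remove(self, member, user):
        """Remove a reference; the referenced document itself is untouched."""
        self._check(user)
        self.members.remove(member)

    def contents(self):
        """List every referenced URL once, descending into nested collections.

        Assumes collections are not nested cyclically.
        """
        urls = []
        for member in self.members:
            found = member.contents() if isinstance(member, Collection) else [member]
            for url in found:
                if url not in urls:
                    urls.append(url)
        return urls


# The same paper, stored on one lab's server, appears in two collections.
paper = "http://lab-a.example.com/papers/coatings.html"
materials = Collection("Materials Science", authorized={"librarian"})
imaging = Collection("Digital Imaging", authorized={"librarian"})
materials.add(paper, "librarian")
imaging.add(paper, "librarian")
```

Deleting a collection in this sketch just means dropping the object or
removing it from its parent; nothing happens to the documents it references.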

There are two sorts of situations to think about, one where the containers
are part of a commercial Document Management System accessible via the Web,
and the other where they are directly part of the Web.

We also need to think about compound documents that are not HTML documents:
say, a Word master document and its component parts, which may have OLE
links to graphics, etc.
        -- Access to format interpreters in order to locate / retrieve /
           insert / replace component parts
        -- Referential integrity
        -- Versioning:  What happens to the parent's version id if a direct
           child changes?  If a referenced child changes?
        -- Access control at the component level
        -- If I lock the parent, do I lock all the children?  All the
           referenced children?  If I lock a child, do I lock the parent?
        -- How to ensure that a copy gets all the pieces

Ora's scenarios also pointed up for me another requirement:  there needs to
be a way to ensure that when one variant gets updated, the changes get
reflected in all the others.  We could say that this is the client's
problem.  But if conversion / translation services were available on the Web
or in a document management system on the Web, it would be preferable to let
the updates happen without the client having to worry about them.

In general, we need ways to request services from server-side filters
(format converters), interpreters, language translators, OCR services, etc.
These are often embedded in document management systems, but may also be
independent services.

Was link management / referential integrity ever here?  It seems preferable
to take this burden off the client.  Sophisticated document management
systems that work with compound documents do provide mechanisms for ensuring
referential integrity.

Some more detailed comments on the requirements paper:

In the second paragraph of the Introduction, what's behind the Web may be
either a file system or a document management system.

2. Relationships:  Other uses of relationships include attaching annotations
to documents, attaching print instructions (duplex, staple, portrait, etc.)
to documents, and creating compound documents with components of any format.

3/4/5.  So the difference between a Write Lock that doesn't allow anyone to
modify a resource and a No-Modify lock on that resource is only that there
can be more than one No-Modify lock on it at the same time?  

In 4, you say that no-modify locks say that the contents of a resource
should not be modified until the READ lock is released.  Is this a typo?  It
seems to me no-modify should be independent of read.
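
To make the question about 3/4/5 concrete, here is a small Python sketch of
the reading being asked about: a write lock is exclusive, while any number of
no-modify locks may coexist on the same resource.  The LockTable class and
its method names are hypothetical, and the rule that the two lock types
exclude each other is an assumption of this sketch, not something the
requirements paper is known to state.

```python
class LockTable:
    """Toy lock manager: exclusive write locks, shareable no-modify locks."""

    def __init__(self):
        self.write = {}      # resource -> the single owner of its write lock
        self.no_modify = {}  # resource -> set of owners holding no-modify locks

    def acquire_write(self, resource, owner):
        # Assumption: a write lock is refused while any other lock exists.
        if resource in self.write or self.no_modify.get(resource):
            return False
        self.write[resource] = owner
        return True

    def acquire_no_modify(self, resource, owner):
        # Many holders are allowed, but not alongside a write lock (assumption).
        if resource in self.write:
            return False
        self.no_modify.setdefault(resource, set()).add(owner)
        return True

    def release_write(self, resource, owner):
        if self.write.get(resource) == owner:
            del self.write[resource]

    def release_no_modify(self, resource, owner):
        self.no_modify.get(resource, set()).discard(owner)
```

Note that nothing here depends on a read lock, which is the point of the
question about 4.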

9.  "complimentary" should be "complementary", I think.

12.  In a hierarchy listing request, it should be possible to specify which
attributes to return.  Even if all we care about is file system attributes,
we may want to see other things than name, media type, and last modified
date.  And then, of course, there may be other attributes available as well.
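
A sketch of what attribute selection in a hierarchy listing could look like,
in Python for illustration only.  The function list_hierarchy, the in-memory
"resources" table, and the attribute names are all hypothetical; the defaults
are simply the three attributes mentioned above.

```python
DEFAULT_ATTRIBUTES = ("name", "media_type", "last_modified")


def list_hierarchy(resources, prefix, attributes=DEFAULT_ATTRIBUTES):
    """Return {url: {attribute: value}} for every resource under prefix.

    Only the requested attributes are returned; an attribute that a
    resource does not have is reported as None.
    """
    listing = {}
    for url in sorted(resources):
        if url.startswith(prefix):
            metadata = resources[url]
            listing[url] = {name: metadata.get(name) for name in attributes}
    return listing


# Hypothetical metadata, as a server might hold it.
resources = {
    "/papers/a.html": {"name": "a.html", "media_type": "text/html",
                       "last_modified": "1996-10-01", "author": "Slein"},
    "/papers/b.doc": {"name": "b.doc", "media_type": "application/msword",
                      "last_modified": "1996-09-27"},
    "/other/c.html": {"name": "c.html", "media_type": "text/html",
                      "last_modified": "1996-08-15"},
}

# Ask for a non-default attribute alongside the name.
by_author = list_hierarchy(resources, "/papers/", ["name", "author"])
```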

Treating MOVE as identical to RENAME may not be adequate in the case of
moving to a different server.  You may want to end up with the resource
physically residing on the second server.  Or did we agree to exclude that
case?  We also need to be sure that we can move URL hierarchies and version
series as atomic operations even between servers.

>At 03:39 PM 9/27/96 PDT, Jim Whitehead wrote:

>>So, do you feel the current requirements adequately meet the needs of
>>providing a moderately full-functioned front-end to document management?
>>I'm somewhat curious, since you were the only document management person at
>>the meeting.
>>
>>- Jim
>>

Don't forget Steve Carter of Novell, another document management specialist.

--Judy
Name:		Judith A. Slein
E-Mail:		slein@wrc.xerox.com
Phone:  	8*222-5169
Fax:		(716) 265-7133
MailStop:	128-29E
Received on Friday, 4 October 1996 15:19:49 UTC