- From: Saveen Reddy (Exchange) <saveenr@Exchange.Microsoft.com>
- Date: Thu, 8 Jan 1998 10:12:23 -0800
- To: www-webdav-dasl@w3.org, "'Jim Davis'" <jdavis@parc.xerox.com>
> ----------
> From: Jim Davis [SMTP:jdavis@parc.xerox.com]
> Sent: Tuesday, January 06, 1998 4:12 PM
> To: Saveen Reddy (Exchange); www-webdav-dasl@w3.org
> Cc: w3c-dist-auth@w3.org
> Subject: comments on Requirements Draft of Nov 19, 1997
>
> Saveen, this is a terrific beginning of the requirements. I agree with
> your overall sense of scope, particularly the things you excluded.
>
> I must also say that there are a lot of requirements in this document
> whose rationale (or perhaps semantics) I do not understand. Can you
> enlighten me?
>
> Variants (3.1.4). What WebDAV group is working on "mechanisms ... to
> use when submitting variants to the server"? I must have missed this.

This is the result of my awkward phrasing. Let me try saying this another
way ... Section 5.10 (entitled "Variants") of the WebDAV requirements says
"Detailed requirements for variants will be developed in a separate
document." So, I understand there is no separate variants document at the
moment, but I wanted to make explicit that whatever the model for
submitting variants turns out to be, DASL should work with it.

> Regular Expressions (3.1.6) By "must" do you mean that every DASL
> server MUST support regex? If so, why? This seems too expensive to me.
> As far as I know, the large search engines (e.g. Verity) do not support
> regex.
>
> Likewise for NEAR (3.1.7) Again, why is this mandatory?

I believe DASL must *allow* servers to perform such queries (this is what
supporting multiple search syntaxes would allow), and DASL should provide
guidelines on what it means to do, for instance, a regex match on property
values (there are, I believe, some internationalization issues lurking
there). But I'm not convinced every server MUST support those operations
as part of the minimum DASL capabilities.

> Result Record Definition (3.2.1). I can certainly see the value of
> supporting this - it improves performance by cutting round trips.
> (otherwise, you do a SEARCH to get the list of resources that match,
> then a PROPFIND for each one.)
> But is this the only reason, and should it be mandatory?

The performance in terms of round trips and bytes on the wire is certainly
*the* big factor in pushing this to be a mandatory feature. As the number
of properties set on objects increases, the impact just gets worse and
worse. From my experience with systems that support properties, asking for
a specific set of properties is such a common operation that systems are
very likely to support this kind of feature anyway.

> Paged Search Results (3.2.3). I don't at all see why we need this. If
> the search results are returned in chunked Transfer Encoding, then the
> search engine can start returning results as soon as the first match
> occurs, and the client can certainly start displaying them as they
> arrive. Or perhaps I don't know what this means. I hope it does not
> mean that the server has to store the state of searches in progress, as
> in Z39.50.

Chunked transfer encoding and paged results are related in the sense that
they both break up data over multiple messages, but there are some
differences. I haven't gone through all the scenarios where paged table
results are going to be used, but in general I think we see here that the
client desires a lot of control that chunked encoding does not give:

- The client should be able to specify an exact number of records to be
received.
- The client might need to specify which region of the complete result set
it wants, and this could happen completely out of order -- "give me the
middle of the results, then the first, then the last".
- Clients may also want relative positioning -- "give me ten records
starting from the last 80% of the result set".

And you are right: because multiple request messages from the client
concern the same result set, the server and client need to have some form
of state mechanism.
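To make concrete the kind of control I have in mind (and why plain chunked
encoding does not provide it), here is a rough sketch in Python. It is
purely illustrative -- the function names and the list of resource names
standing in for a result set are my invention, not anything from a draft:

```python
# Illustrative sketch only: models the kinds of positioning a paging
# client might ask for. A server streaming one chunked response can
# only deliver results front-to-back; these requests can arrive in
# any order, which is why some server-side state is implied.

def page_absolute(results, start, count):
    """An exact number of records from an absolute region of the set."""
    return results[start:start + count]

def page_relative(results, fraction, count):
    """Records starting from a relative position in the set."""
    start = int(len(results) * fraction)
    return results[start:start + count]

results = [f"/docs/r{i}" for i in range(100)]

# Out-of-order access: middle, then first, then last.
middle = page_absolute(results, 45, 10)
first = page_absolute(results, 0, 10)
last = page_absolute(results, 90, 10)

# "Ten records starting from the last 80% of the result set."
tail = page_relative(results, 0.80, 10)
```

The point is only that the client chooses the window, repeatedly and in
any order -- exactly the state a single streamed response never requires
the server to keep.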
I don't think paged table results are something a server MUST support, and
I don't think the DASL protocol specification has to define this (maybe a
separate draft, if ever), but whatever a DASL response looks like, it
should be possible at some time in the future to page the results back
out. Whether paged table results are being used or not, chunked encoding
is still going to be valuable.

> Search Scope 3.3.1 - is a search scope a collection? Why do we need
> this? It's a performance improvement, so one does not have to issue N
> searches?

Yes, it is generally going to be a collection. (I suppose someone could
try to search a single file, but that's certainly not that interesting.)
The ability to name multiple collections is (IMO) basically a performance
improvement. I think it's a MUST that DASL show how to do a "distributed"
search -- one that spans multiple collections. But in the end the answer
may just be, as you point out, that the client must issue N searches.

> Search Depth 3.3.2 - by "container" do you mean "collection"?

Yes.

> Extensible Query Syntax (3.4.2) - I am leery of this. Where does this
> requirement come from? I challenge it in two ways
>
> 1 - it's not needed, because generic query syntaxes are sufficient.
> Consider, for example, the DMA (Document Management Alliance) API. It
> provides only one syntax, albeit a powerful one, unless by "extensible"
> you mean that the client can discover the list of searchable properties
> and operators allowed on each one. For this kind of discovery, you
> should look at DMA, which provides means of describing the operators,
> the required operands for each, the datatypes supported, the default
> values, and so on. I think that RDF is actually expressive enough to
> express all this.

By "extensible" I meant "discoverable".
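As an illustration of the fallback case above -- the client issuing N
searches itself -- here is a toy sketch in Python. Everything in it is
invented for illustration; `search_collection` merely stands in for
whatever per-collection request DASL ends up defining:

```python
# Toy sketch: a client performing a "distributed" search by issuing
# one search per collection and merging the results. A multi-collection
# scope would let the server do this in a single request instead of N.

def search_collection(collection, predicate, store):
    """One search against a single collection (one round trip)."""
    return [(collection, name)
            for name, props in store[collection].items()
            if predicate(props)]

def distributed_search(collections, predicate, store):
    """N collections => N client-issued searches, results merged."""
    matches = []
    for collection in collections:
        matches.extend(search_collection(collection, predicate, store))
    return matches

# Invented in-memory stand-in for two server collections.
store = {
    "/reports/": {"q1.doc": {"author": "davis"},
                  "q2.doc": {"author": "reddy"}},
    "/memos/": {"jan.txt": {"author": "davis"}},
}

hits = distributed_search(["/reports/", "/memos/"],
                          lambda p: p.get("author") == "davis", store)
```

The merge itself is trivial; the cost is the N round trips, which is why
naming multiple collections in one scope reads as a performance feature.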
> 2 - it's not sufficient - I find it hard to believe that a client C can
> do discovery on server S and generate an effective query using a
> *syntax* it did not previously know, without the intervention of a
> human. If all you are trying to say is that server S should be allowed
> to provide proprietary search interfaces so that client C (from the
> same vendor) can work with it, that's neither difficult nor worthy of
> the spec.

I definitely agree with you that it's pretty darn unlikely for a client to
discover and use any syntax of which it didn't already know. Do we both
agree that discoverability is worthy of the spec?

> If you have something different in mind, please correct my
> misunderstanding.
>
> Internationalization 3.7. I strongly agree. I've certainly run into
> problems in searching against non-ascii data, e.g. names in German or
> Greek. (Did you think I would have only disagreements?)
>
> Finally, something needs to be said about full text variants - when a
> document is stored with N variants, it's not clear to me which one(s) a
> full text search applies to. With some indexing systems, you don't get
> to choose.

I believe that ideally this works by the criteria on the search. For
example, "give me all the resources that contain the word 'Maenner' and
that have Content-Language = 'DE'". In this scenario, a system that cannot
qualify its search by that criterion should fail the request. Actually,
Jim, you point out what is probably our most vexing problem ... DASL is
going to have to live with the searching capabilities of storages, and
sometimes capabilities present in one storage just are not going to be
there in others.

> Likewise, if versioning is still in WebDAV (or should I say 'WebDA'?)
> then there must be some interaction with search.

Same problem (multiplied by 10, yikes!) here.

> I look forward to discussing these and becoming more enlightened.
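To sketch what I mean by qualifying a full-text search with criteria, here
is a toy example in Python. The property names, the variant store layout,
and the failure behavior are all my invention, purely for illustration:

```python
# Hypothetical sketch: full-text search over variants, qualified by
# Content-Language. A storage that cannot qualify by language should
# fail the request rather than silently search the wrong variant.

def search_variants(variants, word, language):
    """Return ids of variants containing `word` in the given language."""
    hits = []
    for vid, meta in variants.items():
        if "content-language" not in meta:
            # This storage cannot qualify the search by language.
            raise ValueError(f"variant {vid} cannot be qualified by language")
        if meta["content-language"] == language and word in meta["body"]:
            hits.append(vid)
    return hits

# Two variants of one document, one German and one English.
variants = {
    "v-de": {"content-language": "de", "body": "Maenner und Frauen"},
    "v-en": {"content-language": "en", "body": "men and women"},
}

search_variants(variants, "Maenner", "de")  # matches only the German variant
```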
>
> PS we should carry followup discussion on www-webdav-dasl only, I
> think.

Thanks, Saveen
Received on Thursday, 8 January 1998 13:12:45 UTC