- From: Jim Whitehead <ejw@ics.uci.edu>
- Date: Mon, 26 Apr 1999 14:14:04 -0700
- To: www-webdav-dasl@w3.org
Well, I have completed a review of the DASL protocol specification, something I've been meaning to do for a long time, and I've attached my comments to this message.

Before getting into the meat of the comments, I wanted to thank the authors of this specification, as well as Alex, for doing a super job taking the spec. this far. As-is, the specification defines a very useful search facility that will greatly enhance the capabilities of clients interacting with WebDAV servers. Searching is a very difficult topic, and the deep experience of the authoring team in searching definitely shows in this draft. I'm really glad you've taken it this far, and I look forward to working with you by giving more feedback in the future until the spec. ships.

However, like any complex specification, especially one which is fitting into the HTTP/DAV world, this specification has some areas which, if addressed, will tremendously improve the protocol. So, I submit these comments in the spirit of improving the DASL protocol specification.

Since some of these issues may require further discussion, I ask that when you reply, please don't quote the entire message in your reply, and please divide the responses into "issue-sized" chunks. Changing the subject line to something more meaningful to each issue would also help.

OK, here goes:

Comments on draft-reddy-dasl-protocol-04.txt (Nov. 18, 1999):

---------
Comments:
---------

* Section 2.2.2 states that the server MUST recognize a text/xml request, and may understand requests transported in other content types. This section should reference RFC 2376 (XML Media Types) as giving correct guidance on packaging XML. It should also make it a MUST for servers to understand application/xml as well, since it is possible that text/xml may be deprecated in the future, and since both text/xml and application/xml are supported by RFC 2376.

* This specification essentially defines a new type of Web resource, of type "search arbiter".
This raises a number of questions regarding how this kind of resource interacts with existing HTTP methods. I would expect to see a section which goes through and details the interactions between HTTP and WebDAV methods and search arbiters. For example, it seems reasonable to me to allow a search arbiter to potentially reply to GET (perhaps with a human-meaningful description of the capabilities of the arbiter), and for this GET response to potentially be authorable using PUT, and locked using LOCK. However, I wouldn't expect COPY, MOVE, or DELETE to work, although I would expect PROPPATCH and PROPFIND to work OK. Another issue is what kind of resource type a search arbiter returns in the resourcetype property (I'd expect a <searcharbiter/> element).

* How does a search arbiter respond to searches, if the search arbiter URI is within a search scope? The answer to this is related to the answer to whether a search arbiter has its own properties, body, etc.

* Section 2.5 states that the 507 (Insufficient Storage) status code should be returned when SEARCH produces more responses than the server is willing to immediately return. A 5xx status code isn't appropriate for this case, since the response does have valid search results, indicating that the client correctly submitted a search, and this search was successfully performed by the server, even if it isn't returning all search results. I recommend defining a new status code for this case, 208 (Partial Results).

* On the topic of partial search results, DASL currently has no way for a client to request the next chunk of a set of search results. Since *every* search service I've interacted with on the Internet has a feature for returning the next set of search results, I really would expect this feature to be in DASL. An explanation for why this feature isn't present should be in the protocol specification if it is not going to be supported.
* I would expect the SEARCH method to return a 102 (Processing) response code if the server is taking a long time (over N seconds, for smallish N) to perform the search.

* Can a SEARCH be redirected by a 301/302 response? I see no reason not to, unless it would expose privacy concerns. I know there are facilities in place for arbiter redirection, but it still could occur that a SEARCH would get issued to a URL that responds with a 301 or a 302. If SEARCH can be redirected, does it make sense for arbiter redirection to be handled by the 301/302 mechanism too?

* Is the response from SEARCH cacheable? The spec. is silent on this point.

* How does a DAV client discover which search arbiter can be used to search a portion of the DAV namespace? At present, the specification seems to imply two things: (a) that "/" might be a typical arbiter, and (b) that other arbiters can exist and you can get redirected to them. If this issue isn't addressed in the specification, it might lead to clients having hard-coded search arbiter locations, thus forcing servers to put an arbiter at those locations or be non-interoperable. Or, it will require clients to be configured with the search arbiter location, which also seems bad. It seems far better to have a predefined mechanism which clients can use to discover the location of the search arbiter. One simple mechanism would be to define a property on each collection (but not each resource) which gives the location(s) of appropriate arbiters.

* How would a client use the results from a query schema discovery? Is the expectation that a client will first perform a QSD before they issue their first query against a given scope? A section discussing this topic would be helpful.

* In Section 5.2, is it an error that a search can only have a single scope, or is it intentional that a search only have a single scope?
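To illustrate the collection-property discovery idea raised above, here is a rough client-side sketch in Python. The property name `searcharbiters`, the example.com URLs, and the sample PROPFIND response are all hypothetical; the draft defines no such property, and this only shows how a client might pull arbiter locations out of a 207 Multi-Status body if one existed.

```python
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # ElementTree writes namespaced tags as {namespace-uri}local

# Hypothetical multistatus body returned by PROPFIND on a collection.
SAMPLE_RESPONSE = """<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>http://example.com/docs/</D:href>
    <D:propstat>
      <D:prop>
        <D:searcharbiters>
          <D:href>http://example.com/search/</D:href>
          <D:href>http://example.com/</D:href>
        </D:searcharbiters>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>
"""


def arbiter_locations(multistatus_xml: str) -> list[str]:
    """Return the arbiter URLs listed in the hypothetical property."""
    root = ET.fromstring(multistatus_xml)
    locations = []
    # Walk every occurrence of the (assumed) property in document order.
    for prop in root.iter(DAV + "searcharbiters"):
        for href in prop.findall(DAV + "href"):
            locations.append(href.text.strip())
    return locations


print(arbiter_locations(SAMPLE_RESPONSE))
```

A client following this approach would need no hard-coded or configured arbiter location; it would ask the collection it wants to search and use whichever arbiter the server lists.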
* In section 5.4.1, allowing relative URIs doesn't seem to be particularly compelling, since a search arbiter would not, I expect, be in the same part of the namespace as the content being searched.

* Section 5.6 should have a separate section, with a separate heading, for the description of ascending and descending. I had a hard time finding these descriptions without this section heading.

* In Section 5.10, what is a literal value? Also, exactly how does the xml:space attribute affect DAV:literal? I think this should be spelled out. Also, can a client always put a wildcard pattern (from Section 5.12.1) inside a literal element, or can a client only use the wildcard for a literal inside a DAV:like? If the latter, then perhaps some element other than DAV:literal should be used, since it seems to be bad practise to have the semantics of an element vary depending on whether it is enclosed by another element.

* The BNF for a wildcard permits the entry of "</d:literal>" which would confuse parsers. Also, the BNF sequence for text should use characters instead of octets, to better handle multi-octet character set representations (like UTF-16).

* Section 5.9: A non-native English speaker might not map "lt" to less than, "lte" to less than or equal, etc. These need to be spelled out -- for example, do they only apply to numbers? If so, what is a number? Since gt and lt are used by the sort orders ascending and descending, it also appears they apply to strings as well. I suspect their definitions will not be trivial once i18n is considered.

* Section 5.13 should discuss the case where a server receives a query in UTF-16, but the resources being searched are stored in UCS-2, UTF-8, etc. Seems that canonicalizing to UCS-4 internally (at least logically) might be a way to go. At the very least, using XML means that this could happen, and server implementors should be made aware that this can happen, and perhaps given some guidance on how to address the issue.
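To make the Section 5.13 point concrete, here is a minimal Python sketch of comparing a query string and a stored value that arrive in different encodings by decoding both to Unicode code points first. The function name `same_text` and the sample word are assumptions for illustration only; nothing here is defined by the DASL draft, and the sketch does not address Unicode normalization or collation.

```python
def same_text(query_bytes: bytes, query_encoding: str,
              stored_bytes: bytes, stored_encoding: str) -> bool:
    """Compare two byte strings after decoding each to Unicode code points."""
    # Decoding maps each encoding back to the same abstract characters,
    # which plays the role of the logical UCS-4 form suggested above.
    query_text = query_bytes.decode(query_encoding)
    stored_text = stored_bytes.decode(stored_encoding)
    return query_text == stored_text


# Hypothetical sample: the same word sent as UTF-16 and stored as UTF-8.
word = "r\u00e9sum\u00e9"
query = word.encode("utf-16")   # Python's utf-16 codec prepends a byte order mark
stored = word.encode("utf-8")

print(query == stored)                              # comparing raw octets
print(same_text(query, "utf-16", stored, "utf-8"))  # comparing decoded text
```

The octet-level comparison fails even though the text is identical, which is why a server that compares stored bytes against query bytes directly would miss matches.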
* Section 5.16: this section needs to give guidance on how case sensitivity is handled in non-Latin character sets.

* Section 5.18: I'm assuming that the reason DAV:iscollection exists is because doing a search for DAV:resourcetype equal to DAV:collection would be too expensive. Perhaps this should be mentioned in Section 5.18.

* I had a hard time following the discussion in section 5.19.2. Perhaps it would help if the paragraph started by stating at a high level what a DAV:propdesc does, in terms of how a client would use it. I didn't find the first sentence to be that helpful.

* Perhaps instead of the huge URN in Section 5.19.3, a shorter URI could be used, such as "DAV:/DASL/datatypes/". While WebDAV has not been too concerned with message size, using a GUID URN doesn't appear to be justified here, and it sure is looong.

* The types in 5.19.3 are underspecified. Some areas which need improvement:
  - It needs to be made explicit that these types will appear as XML elements, and every XML element should have a DTD entry for it in the spec.
  - There should be a BNF description for each data type. This is especially necessary for the float and datetime types.
  - A string should probably be a triple of contents, character set encoding, and natural language (hmm, well perhaps the character set encoding doesn't have to be listed here, but natural language should be present.)

* I think this specification would be greatly strengthened by adding a few examples which perform queries from the scenarios document. Some I would be interested in seeing:
  - Scenario 2.2.3, "Finding a specific resource by author and date range"
  - Scenario 2.2.4, "Finding a specific resource using both content and property search"
  - I'd also be interested in an example where a search was submitted in a non-Latin character set, and the results come back sorted according to the rules of a non-Latin character set.

* The Internationalization Considerations section can use some improvement.
Here are a few issues which need to be addressed:
  - The DASL spec. needs to make some policy statement about sort order in non-Latin character sets, if only to give server implementors some kind of hint as to how they should handle this case. There must be some books/standards available which address this issue, so they should be mentioned and referenced.
  - Some text on handling of character sets would be helpful. For example, I suspect DASL wants to limit the valid character set encodings to just ISO 10646 variants. This allows all character sets to be mapped back to their canonical ISO 10646 values. This section should explicitly note that a query might be submitted in a different character set than the properties or content of the resource.
  - How do string equality and the language tag interact? It isn't OK to just fail a search if the language tags are different, since a search submitted in en-us might match an en-uk string.
  - In a submitted query, where is it valid to have an xml:lang attribute?

* In the Security Considerations section, the XML security considerations should be copied in from the WebDAV specification.

* The Security Considerations section should explicitly mention that there might be privacy risks associated with queries, especially queries which require a user to first authenticate themselves. For example, you might not want someone else to know you're searching for patents on X.

---------------
Minor comments:
---------------

The dashed lists in sections 1.1 and 1.5 are not indented enough.
The references for WebDAV, XML Namespaces, and DASL Requirements all need to be updated.
Section 2.2.1, first sentence: remove "per se" -- the sentence is clear without these Latin words.
In Section 2.4.2, the response resource should have a trailing slash after "siamsiam.com".
Section 2.6.6 defines the redirectarbiter element, but doesn't specify that its contents must be a URL.
Section 3, first sentence: "by a resource" --> "by a search arbiter resource"
Section 3.2: the Coded-URL production is defined in Section 9.4 of RFC 2518.
Section 3.3: Since the DAV:basicsearch must be supported by all implementations of SEARCH, the example in 3.3 should list the DAV:basicsearch URI in one of the DASL headers.
Section 4.1.1: The example should be complete, instead of having a natural language forward pointer to section 5.19.9 within the basicsearchschema element.
Section 5.1, second paragraph: Perhaps a dashed list giving the element name and its contents would be easier to read.
Section 5.3 should explicitly state that the result record is a set of properties.
Section 5.4: for completeness, describe the semantics of DAV:depth of 0.
Section 5.4: there is a spurious "8.5.1" at the end of the paragraph.
Section 5.6: ANSI SQL should be added to the list of references, putting it in a "Non-normative References" section.
Section 5.11: "on a resource on a resource" --> "on a resource"
Section 5.19.2: "provide a hints" --> "provide hints"
Security Considerations: "Server should prepare" --> "A server should prepare"
Section 14: you may wish to update Jim Davis' contact information.

- Jim
Received on Monday, 26 April 1999 17:32:19 UTC