- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Wed, 2 Jul 2003 14:20:44 +0300
- To: "ext Paul Prescod" <paul@prescod.net>
- Cc: <www-tag@w3.org>
----- Original Message -----
From: "ext Paul Prescod" <paul@prescod.net>
To: "Patrick Stickler" <patrick.stickler@nokia.com>
Cc: <www-tag@w3.org>
Sent: 02 July, 2003 02:30
Subject: Re: URIQA

> Patrick Stickler wrote:
> >
> > ...
> >
> > This works for GET, but not for PUT or DELETE.
> >
> > An early stage of URIQA development actually defined special MIME types
> > for concise bounded descriptions (in an attempt to try to accomplish what
> > was needed without any extensions to the present Web architecture) -- but
> > ambiguity arises when performing e.g. a PUT because the behavior of the
> > server differs depending on whether the input is a
> > representation or description.
>
> I'm not sure that the definition of PUT is that clear. Let's say you
> have a resource representing a "white paper" with representations in
> XML, HTML and PDF. Does a PUT to PDF necessarily obliterate the XML? Or
> does it just replace the PDF rendition?

Ultimately, it's up to the server to determine which representation, if any, or all, is replaced by a new representation being PUT onto the server. In that regard, you are correct that it is somewhat unclear what a given server might do with regard to multiple representations of a resource.

But that is beside the point (or perhaps precisely the point of URIQA ;-)

The key issue is that when PUTting knowledge, as opposed to a representation, one is adding to a single body of knowledge, not entirely replacing that body of knowledge with the input.

The fact that one might also interact with a description of a resource as a kind of representation, using traditional Web methods, is simply an added extra, but not central to the fundamental SW behavior defined by URIQA.

So whether the server supports conneg or not, whether the server is able to maintain multiple representations or not, if the server is dealing with representations, the server is *monolithically* replacing one representation with another.
It's the issue of monolithic replacement versus modification that matters here.

> The safest thing is to have a separate URI for the PDF rendition and PUT
> that.

I agree. But again, I think you are missing the essential problem. I'll try to be clearer.

Let's say we use PUT to update entire descriptions of resources managed individually as RDF/XML instances -- where the entire body of knowledge known about that resource is contained in that RDF/XML instance.

Let's use as the URI of the resource

   http://example.com/someResource

and of the RDF/XML instance containing the concise bounded description of that resource

   http://example.com/someResource.rdf

We also define a special MIME type

   application/rdf+xml+uriqa

corresponding to a URIQA Concise Bounded Description encoded in RDF/XML.

So, any of the following should allow us to store a new revision of the complete description contained in an RDF/XML instance.

   PUT http://example.com/someResource.rdf HTTP/1.1

   PUT http://example.com/someResource.rdf HTTP/1.1
   Content-Type: application/rdf+xml

   PUT http://example.com/someResource.rdf HTTP/1.1
   Content-Type: application/rdf+xml+uriqa

and using conneg, with the necessary MIME type to suffix bindings, the following also accomplish the same

   PUT http://example.com/someResource HTTP/1.1
   Content-Type: application/rdf+xml

   PUT http://example.com/someResource HTTP/1.1
   Content-Type: application/rdf+xml+uriqa

Note that either application/rdf+xml or application/rdf+xml+uriqa is valid since the input content conforms to both MIME types, the latter being a specialization of the former, just as many XML encodings with distinct MIME types are specializations of text/xml and all use the suffix '.xml'.

Since conneg can be used in conjunction with the more general URI denoting the resource in question rather than its representation, the MIME type cannot serve as a flag to indicate the shift in behavior between dealing with representations versus descriptions.
Or at best, the conneg model has to be changed to give special meaning to certain MIME types so that it doesn't get in the way of the SW behavior (hardly a good idea).

Now, we also have some knowledge about the RDF/XML instance itself such as the owner, title, creation date, status, etc. How do we indicate PUTting knowledge about the representation, if that knowledge is also encoded as the same MIME type as the representation? I.e. does

   PUT http://example.com/someResource.rdf HTTP/1.1
   Content-Type: application/rdf+xml+uriqa

mean to completely replace the presently stored representation with a new version or to update the body of knowledge about the representation with the statements in the input?

The server can't know. It's completely ambiguous.

What if our input only adds a single statement to the total description of the RDF/XML instance? We'd end up losing all the other knowledge about that instance!

Now, you might say, just use yet another URI to denote the description about the representation which is a description about the resource. E.g.

   http://example.com/someResource.rdf.rdf

A major problem with that (and there are several others) is that it still precludes partial modification of any given description and forces an application to first (a) lock the body of knowledge, (b) check out the full body of knowledge about a resource, (c) modify the body of knowledge accordingly, (d) commit the new complete body of knowledge to the server, and (e) unlock the body of knowledge.

While this might work for some trivial scenarios, there are many where it does not work, particularly when access to the complete body of knowledge is multileveled, where not all users have access to all knowledge but still must be able to add/modify/delete that portion of knowledge that they do have access rights to.
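
The two readings of that PUT can be made concrete with a small sketch. It is only an illustration under stated assumptions, not part of URIQA or its reference implementation: a body of knowledge is modelled as a Python set of (subject, predicate, object) statements, and the function names `put_representation` and `put_description`, as well as the `ex:` property names, are hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical toy model: one statement is a (subject, predicate, object) triple.
Statement = Tuple[str, str, str]
Store = Dict[str, Set[Statement]]

DOC = "http://example.com/someResource.rdf"


def put_representation(store: Store, uri: str, statements: Set[Statement]) -> None:
    """Traditional PUT: monolithically replace whatever is stored under uri."""
    store[uri] = set(statements)


def put_description(store: Store, uri: str, statements: Set[Statement]) -> None:
    """Description-mode PUT: add the input to the existing body of knowledge."""
    store.setdefault(uri, set()).update(statements)


def example_store() -> Store:
    """Existing knowledge about the RDF/XML instance (invented values)."""
    return {DOC: {(DOC, "ex:owner", "Patrick"), (DOC, "ex:title", "Some Resource")}}


# The input adds a single statement about the instance.
NEW = {(DOC, "ex:status", "draft")}

replaced = example_store()
put_representation(replaced, DOC, NEW)  # owner and title are lost

merged = example_store()
put_description(merged, DOC, NEW)       # owner and title survive

print(len(replaced[DOC]), len(merged[DOC]))  # prints: 1 3
```

The request line and Content-Type are identical in both cases; only something outside the MIME type can tell the server which of the two functions is meant.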
Not to mention that it is typical and advisable practice to capture knowledge about multiple resources in the same RDF/XML instance, yet no one wants to GET an RDF/XML instance describing thousands of resources just to get a description of a single resource.

Only by allowing for both resource-specific as well as partial access/modifications to managed knowledge of resources can anything even closely resembling a global world wide semantic web of knowledge interchange succeed.

Monolithic, file based views of knowledge storage and interaction simply cannot meet the scalability and efficiency needs of the SW, which is why we need a solution such as URIQA which allows one to interact with (frequently partial) bodies of knowledge about specific resources rather than merely files.

Your (and others') suggestion that SW agents interact with knowledge in terms of monolithic files is just as unworkable as suggesting that folks interact with RDBMS data in terms of complete databases, or even complete tables. It just won't work.

With regard to the nature and interaction of content, the fundamental characters of the Web and SW are very different, even if we can get them to share a common infrastructure and set of resource identifiers, and extensions to the current Web architecture are necessary in order to capture and exploit these (complementary) differences effectively.

> As far as DELETE, I don't understand why you would DELETE (as opposed to
> clear) the description of an object. Either the resource exists or it
> does not. If nothing is known about it then it should have an empty
> description. But the description doesn't have to be deleted.

I may want to delete a single particular statement about that resource, but not want to delete the entire body of knowledge known about that resource stored in the repository.

It is similar (though not identical) to deleting a particular element of an XML instance without deleting the entire XML instance.
That comparison is of course grossly imperfect because XML was not designed to allow for such micromanagement of internal content -- though it can be and is done, usually by "cheating" with relational databases ;-) and not with files.

XML is all about static structures, and while one certainly can manipulate subcomponents of an XML instance, one does not easily manage XML encoded knowledge on an element by element basis, such that PUT and GET are operating on individual elements of an XML instance. One must deal with entire XML instances, or at best, fragments.

For RDF, on the other hand, the XML encoding is just a means for interchange, a means to an end, that end being an RDF graph via which one can interact with individual statements or sets of statements about resources in a highly effective manner, including PUTting, GETting, and DELETEing subsets of that graph irrespective of any RDF/XML serializations (files) that might be used to otherwise interchange, archive, or modify that knowledge.

The SW needs to be able to operate in terms of bodies of knowledge, not files, and SW server behavior must provide an efficient means of working with bodies of knowledge rather than files.

And SW agents should be able to consistently interact with bodies of knowledge irrespective of how that knowledge is maintained on a given server. Those descriptions *might* be managed as monolithic RDF/XML instances using GET and PUT to edit them. But they might also be managed via a proprietary database interface specific to that server. The agent should not have to worry about that. It shouldn't have to know how each server stores its descriptions, nor be able to GET, modify, and PUT those server-specific representations in order to interact with resource descriptions.

> > ...
> > To do this, the best solution that I've been able to come up with which
> > requires the least modification to the existing Web architecture is a single
> > header (serving as a flag) which allows us to differentiate between dealing
> > with representations from dealing with descriptions.
>
> This is actually quite a big change to web architecture.

Err. It seems to follow the most politically correct and recommended method of extending the present web architecture, specifically by using headers. In fact, the only reason it uses headers is because of all the heat and resistance I got to my proposals based on a new set of methods. So much for trying to be politically correct...

Still, the extensions needed for SW behavior are far more important and far reaching than most application-specific extensions, and to that end, should be accommodated in ways that would not normally be encouraged for all extensions.

> It will confuse
> vast amounts of technology like caches.

Eh? Do caches discard request headers? If so, then you may have a point. In fact, if caching discards the header distinguishing a request relating to a representation from a request relating to a description, then I would consider the header approach presently taken by URIQA to be unworkable.

Note that that does *not* mean URIQA is unworkable. The URIQA model is not just the addition of a new header, but a model for SW enabled server behavior, and the "flag" by which the server is triggered to process a request in terms of descriptions rather than representations is a minor point in the overall model.

My preferred solution has long been to use three new methods, MGET, MPUT, and MDELETE, rather than the header approach, where the 'M' prefix serves the same role as the header, acting as a flag to indicate the SW behavior of the operation. I.e.
the following would be semantically equivalent:

   MGET    = GET    + URI-Resolution-Mode: Description
   MPUT    = PUT    + URI-Resolution-Mode: Description
   MDELETE = DELETE + URI-Resolution-Mode: Description

The key benefit of these new methods over the header approach is that a client has more reliable feedback about whether the server is or is not SW enabled. E.g. if a server doesn't understand the MPUT method, it barfs. Whereas if it doesn't understand the URI-Resolution-Mode: Description header sent along with PUT, it might completely replace a representation with the description input rather than update the description of the resource.

And even though the URIQA spec requires a SW enabled server to return a header indicating that it understood the request in SW terms, realization of the erroneous action of a non-SW enabled server only comes after the act, and there still remains the question of whether the server actually is SW enabled, did the right thing, but simply failed to include the header in the response indicating that all is well.

The header approach is not an example of optimal engineering design, and IMO seems more like a hack than a proper extension of the Web architecture. But it does work. If certain folks weren't so thoroughly opposed to new methods, I would have adopted the M* methods for URIQA. The header approach is an entirely political compromise.

I've come to terms with the header approach based on the view that anyone who has the right to PUT to and DELETE from a server is known to the server and typically bound to certain usage constraints, and also will know the server and whether it is or is not SW enabled, so in practice it should not be a huge problem, just an inconvenience and an ugly aspect of the header based solution.

Do caches also discard the request method? If not, then IMO that would constitute a deciding argument in favor of the use of the new methods rather than the header approach.
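
The difference in failure behavior between the two flags can be sketched as follows. This is a toy model under stated assumptions, not any real server: `legacy_server` is a hypothetical handler with no URIQA support that, as HTTP servers generally do, ignores request headers it does not recognize and answers an unrecognized method with 501 (Not Implemented). The stored strings are placeholders.

```python
from typing import Dict, Tuple

KNOWN_METHODS = {"GET", "PUT", "DELETE"}


def legacy_server(store: Dict[str, str], method: str, uri: str,
                  headers: Dict[str, str], body: str = "") -> Tuple[int, str]:
    """Hypothetical server with no URIQA support.

    Unknown methods are rejected; unknown headers are silently ignored.
    """
    if method not in KNOWN_METHODS:
        return 501, "Not Implemented"
    if method == "PUT":
        store[uri] = body  # monolithic replacement; headers are never consulted
        return 200, "OK"
    if method == "DELETE":
        store.pop(uri, None)
        return 200, "OK"
    return (200, store[uri]) if uri in store else (404, "Not Found")


URI = "http://example.com/someResource"
FLAG = {"URI-Resolution-Mode": "Description"}
DESCRIPTION = "<rdf:RDF>one statement</rdf:RDF>"

# Header approach: the request "succeeds" and the representation is overwritten.
store_a = {URI: "<html>the representation</html>"}
status_header = legacy_server(store_a, "PUT", URI, FLAG, DESCRIPTION)

# Method approach: the request fails loudly and nothing is touched.
store_b = {URI: "<html>the representation</html>"}
status_method = legacy_server(store_b, "MPUT", URI, {}, DESCRIPTION)

print(status_header[0], status_method[0])  # prints: 200 501
```

With the header, the client only learns that something went wrong by noticing that the confirming response header is absent, after the representation has already been replaced.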
If caches discard both headers and the request method, then clearly caching will be a major obstacle to tight integration of Web and SW behavior which will have to be addressed. And if that is the case, then I think that storing the method of the request would be a much smaller and more elegant fix than having to store all (or even worse, specific) headers.

> > ...
> > I agree, and if you look at the behavior of the reference implementation of
> > URIQA (and this is also noted in the URIQA spec) all concise bounded
> > descriptions are distinct resources in their own right, and are given
> > distinct URIs.
>
> Then that's where you should PUT.

No. This still does not allow for interaction with subsets of knowledge.

A URI specific to a description or representation also doesn't work for GET by itself, because a client won't know what URI denotes a particular description until it is returned. If you don't know what the URI of the description is, how can you GET it?

If all I have is a URI http://example.com/someResource that denotes, er, some resource, and I want a description of it, how do I find out what the URI is that denotes the description of that resource? And should I need to?

A key assertion at the heart of the URIQA model is the following:

   A resource is denoted by a URI, and that URI should be all that a client needs to obtain either a representation or a description of that resource, in a single server request.

A client should not have to first execute a HEAD or GET to obtain the URI of a description, in order to then do a subsequent GET to obtain that description. Doing so imposes a two-step process for the most fundamental operation on the SW and makes SW agents second class citizens of the Web. The Web/SW architecture, if it is to share a common core/foundation, must provide for one-step access to resource descriptions.
--

My own view of how the Web and SW architectures interrelate is as follows:

The traditional Web can be seen as a set of representations, interrelated by resource references.

The SW can be seen as a set of descriptions, interrelated by resource references.

Their intersection is the set of URIs which denote resources, for which there are both representations and descriptions.

HTTP+URIQA provides a standardized means to provide global access to both representations and descriptions using a common infrastructure based solely on URIs in a URI scheme that is meaningful to HTTP servers.

To that end, the complementary methods GET/MGET, PUT/MPUT, and DELETE/MDELETE more clearly and elegantly reflect how the same protocol can be used to navigate the Web and SW respectively. The header approach tends to hide this fundamental distinction.

> > This also allows one to use PUT/DELETE to interact with those distinct
> > representations, including the use of conneg, in a traditional fashion.
>
> So you've solved the problem without the need for those headers!

No. I haven't, as the above explanations hopefully now make clear.

PUT/DELETE on their own do not provide a means to interchange/manage bodies of knowledge about a resource which may be subsets of the complete body of knowledge about that resource maintained by a given server.

We need something extra to do that.

Patrick

patrick.stickler@nokia.com
Received on Wednesday, 2 July 2003 07:21:11 UTC