- From: <Patrick.Stickler@nokia.com>
- Date: Tue, 4 Feb 2003 13:02:02 +0200
- To: <paul@prescod.net>, <sandro@w3.org>, <www-tag@w3.org>
> > Google or a few well-run directory services would provide
> > documentation links, and could actually lead to better updated &
> > maintained documentation after the term-coiner has lost interest
> > (and his domain name :-)
>
> Sure, we could depend on registry services if we want the Semantic Web
> to be a centralized rather than decentralized system. Personally, I
> think that that would be to misunderstand the very strength of the Web.

I agree with Paul here. What is needed, as I suggested before, is to
extend the Web architecture to explicitly differentiate access to
representations of resources from access to descriptions of resources,
as these are very different things.

Specifically, to add to HTTP something akin to the following:

New Methods:

   MGET     Retrieve an RDF/XML instance containing all information
            known about the resource
   MPUT     Add all statements expressed in the input RDF/XML instance
            to the resource metadata
   MDELETE  Delete all statements expressed in the input RDF/XML
            instance from the resource metadata

New Response Codes:

   600  Unknown resource
   601  No information available about the known resource
   602  RDF/XML instance containing all information known about the
        resource

Knowledge about a resource is not a representation of that resource
(though this is very hard to defend given the vague and informal
definition of what a representation actually is and can be in terms of
the inherent qualities of the resource), so one must be able to access
this knowledge by some means other than HTTP GET/PUT/POST, all of which
deal specifically with representations.

We can agree that an http: URI denotes a resource -- any resource --
whether or not it has any web-accessible representations.

The HTTP/REST interpretation of that http: URI per GET/PUT/POST is to
interact with and manipulate one or more representations of the
resource.
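To make the proposed semantics concrete, here is a minimal in-memory sketch of how a server might dispatch the new methods and response codes. The stores, function name, and plain-tuple triple format are illustrative assumptions, not part of the proposal; the proposal itself specifies RDF/XML payloads over real HTTP.

```python
# Hypothetical sketch of the proposed method/response-code semantics.
# "representations" is the Web layer, "metadata" the SW layer.

representations = {}  # URI -> representation bytes
metadata = {}         # URI -> set of (subject, predicate, object) triples

def handle(method, uri, triples=None):
    """Return (status, body) per the proposed extended HTTP semantics."""
    known = uri in representations or uri in metadata
    if method == "GET":
        if uri in representations:
            return 200, representations[uri]
        return 404, None                     # no representations available
    if method == "MGET":
        if not known:
            return 600, None                 # unknown resource
        if not metadata.get(uri):
            return 601, None                 # known, but no knowledge
        return 602, metadata[uri]            # all knowledge about the resource
    if method == "MPUT":
        metadata.setdefault(uri, set()).update(triples)
        return 200, None
    if method == "MDELETE":
        metadata.setdefault(uri, set()).difference_update(triples)
        return 200, None
    return 501, None

# A resource with a representation but (as yet) no metadata:
representations["http://example.com/foo"] = b"<html>...</html>"
print(handle("MGET", "http://example.com/foo")[0])   # 601

# Add a statement about it, then retrieve the knowledge:
handle("MPUT", "http://example.com/foo",
       {("http://example.com/foo", "dc:creator", "Paul")})
print(handle("MGET", "http://example.com/foo")[0])   # 602
```

Note how the two layers are independent: a resource may have representations without metadata (601 on MGET) or metadata without representations (404 on GET), exactly as argued below for the 404/601 pairing.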
The (proposed) HTTP/SW interpretation of that URI per the new
MGET/MPUT/MDELETE extensions is to interact with and manipulate
knowledge about the resource.

In neither case does the denotation of the http: URI differ or become
ambiguous. It always denotes the resource. We are simply defining a
means by which HTTP can allow access to either representations or
knowledge, in terms of that resource.

Being able to differentiate between representations of, versus
knowledge about, a given resource resolves the ambiguity which impacts
the SW application of http: URIs per HTTP/REST as it is now defined.

Now, some applications may wish to explicitly identify each
representation, or the body of knowledge about a resource, but again,
that does not affect the denotation of the resource itself by the
http: URI in question.

In fact, obtaining the knowledge about the resource first may very well
help the application determine *which* representation it prefers and
enable it to ask for that representation by name per standard content
negotiation mechanisms. It could also tell it which, if any,
representation can be considered canonical, in the case of digital
resources for which bit-equal copies can be obtained.

Thus:

                     a resource
                         ^
                         |
                     (denotes)
                         |
              http://example.com/foo
                 |               |
     (Web Interpretation)  (SW Interpretation)
                 |               |
                 v               v
        [Representation+]?   [RDF Graph]?

And either the SW interpretation or Web interpretation would be
optional -- as there might exist representations without metadata and
metadata without representations. A 404 response to GET simply means
that no representations are available. A 601 response to MGET would
mean that no knowledge is available.

The above solution also clarifies the role of fragment IDs per either
Web or SW interpretations.
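The knowledge-first content negotiation idea above can be sketched as follows. The predicate names ("ex:hasRepresentation", "dc:format") and the plain-tuple triple format are invented for illustration; the proposal does not prescribe a vocabulary, only that the knowledge is obtainable via MGET before any representation is requested.

```python
# Hypothetical sketch: use knowledge obtained via MGET to choose a
# representation by name, rather than blind content negotiation.

knowledge = {
    ("http://example.com/foo", "ex:hasRepresentation",
     "http://example.com/foo.html"),
    ("http://example.com/foo", "ex:hasRepresentation",
     "http://example.com/foo.pdf"),
    ("http://example.com/foo.html", "dc:format", "text/html"),
    ("http://example.com/foo.pdf", "dc:format", "application/pdf"),
}

def preferred_representation(graph, resource, accept):
    """Pick the representation whose media type appears earliest in
    the agent's preference list."""
    reps = [o for s, p, o in graph
            if s == resource and p == "ex:hasRepresentation"]
    fmt = {s: o for s, p, o in graph if p == "dc:format"}
    ranked = [r for media in accept for r in reps if fmt.get(r) == media]
    return ranked[0] if ranked else None

print(preferred_representation(
    knowledge, "http://example.com/foo",
    ["application/pdf", "text/html"]))   # http://example.com/foo.pdf
```

The agent first MGETs the knowledge, then GETs exactly the representation it wants -- or stops after the MGET if knowledge was all it needed.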
The Web interpretation of a fragment ID is as denoting a subcomponent
of the resource denoted by the URI and, in terms of Web access, as an
internal addressing mechanism specific to the MIME encoding of a
particular representation.

The SW interpretation of a fragment ID is as denoting a subcomponent of
the resource denoted by the URI, with the nature of that subcomponent
and its actual relation to the superordinate resource described in the
metadata associated with, and accessible in terms of, that URIref.

Thus, if http://example.com/myBook denotes a book (a work) and
http://example.com/myBook#chapter1 denotes a logical subcomponent of
that book (a chapter), then the Web interpretation of GET for
http://example.com/myBook#chapter1 is to obtain a representation per
http://example.com/myBook, with the requesting agent focusing on the
internal component of the representation identified by #chapter1, per
the MIME type of the representation. The SW interpretation of MGET for
http://example.com/myBook#chapter1 is to return an RDF/XML instance
containing all statements where http://example.com/myBook#chapter1 is
the subject.

If we are dealing with resources which will never conceivably have any
representation or constitute an addressable component in the
representation of some superordinate resource, then there is no need to
use a URIref, and all will still work fine.

Thus, whether to use fragment IDs and create URIrefs is left up to the
content owner (of both representation and knowledge), and the actual
extended HTTP architecture functions the same either way, since in the
case of GET, HTTP doesn't actually concern itself with the fragment ID,
and in the case of MGET, all URIs (including URIrefs) are opaque.
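The SW interpretation of MGET on a URIref reduces to a simple subject filter over the knowledge base, treating the URIref (fragment and all) as an opaque key. A sketch, with triples shown as plain tuples rather than RDF/XML for brevity:

```python
# Hypothetical sketch: MGET on a URIref returns every statement whose
# subject is that (opaque, possibly fragment-bearing) URI.

graph = {
    ("http://example.com/myBook", "dc:title", "My Book"),
    ("http://example.com/myBook#chapter1", "rdf:type", "ex:Chapter"),
    ("http://example.com/myBook#chapter1", "ex:partOf",
     "http://example.com/myBook"),
}

def mget(graph, uriref):
    """All statements where the URIref is the subject. No fragment
    stripping: the URI is opaque to the SW layer."""
    return {t for t in graph if t[0] == uriref}

for s, p, o in sorted(mget(graph, "http://example.com/myBook#chapter1")):
    print(p, o)
```

Contrast with GET, where the fragment never reaches the server at all and addressing within the representation is the client's job.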
Thus any http: URI can denote any resource, and one can interact with
either representations of that resource or knowledge about that
resource, or both, and the machinery for interacting with
representations and knowledge remains non-centralized, distributed, and
scalable, just as the present Web is.

And current crawlers could be converted into SW crawlers simply by
changing GET to MGET and harvesting the returned knowledge, which would
be precise, rather than having to analyze representations and guess.

Caching machinery would possibly need to be extended/optimized to deal
with partial changes to metadata knowledge, but the overall set of
proven web caching methods should apply (or, alternately, just don't
cache any metadata).

> ... You can't know whether "http://www.prescod.net" refers to
> "Paul's homepage", "The Prescod Family Homepage", "Paul's business",
> "A set of links endorsed by Paul" etc. unless I tell you.

Exactly. Not with the present HTTP architecture, at least.

Yet the above solution would provide you a means to tell us. Just
define that knowledge for the resource on the same server that provides
access to the representations of that resource, and agents can then
inquire about the resource and know exactly what it is and what the
representations portray, and probably which representation, if any, is
optimal for that agent's needs -- if it even needs a representation
after getting all the knowledge about the resource (maybe all the agent
wants is knowledge, not representations).

> ... we can argue about each and every one to figure out WHAT concept
> it is about. Is [1] a web page, a corporation, a kind of car, a
> family of cars, a family of car companies??? Only the owner of the
> URI can answer the question. They can only answer it in a
> machine-processable way with a machine-processable syntax like RDF.
> So why not put up an RDF document that answers the question? And
> heck, why not use that RDFs URI to represent the "thing".

Precisely.
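The GET-to-MGET crawler conversion can be sketched as below. The in-memory "server" dictionary stands in for real MGET responses, and the `ex:knows` predicate is an illustrative assumption; the point is only that the crawl loop is shaped exactly like a Web crawl, but harvests precise statements instead of guessing from representations.

```python
# Hypothetical sketch of an SW crawler: a Web crawler with GET
# swapped for MGET, following URIs found in harvested statements.

from collections import deque

server = {  # URI -> statements an MGET would return
    "http://example.com/a": {("http://example.com/a", "ex:knows",
                              "http://example.com/b")},
    "http://example.com/b": {("http://example.com/b", "ex:knows",
                              "http://example.com/a")},
}

def mget(uri):
    return server.get(uri, set())   # empty set stands in for a 600/601

def crawl(seed):
    """Breadth-first harvest of knowledge, no representation parsing."""
    harvested, queue, seen = set(), deque([seed]), {seed}
    while queue:
        uri = queue.popleft()
        for triple in mget(uri):
            harvested.add(triple)
            for term in (triple[0], triple[2]):
                if term.startswith("http://") and term not in seen:
                    seen.add(term)
                    queue.append(term)
    return harvested

print(len(crawl("http://example.com/a")))   # 2
```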
But to do so in a scalable and intuitive manner requires extending the
functionality of HTTP to explicitly provide for the needs of
SW-relevant knowledge independent of representations. It must be
possible to define and provide that RDF without calling it a
representation, which it is not.

The above solution, by extending HTTP, allows existing Web applications
to remain unchanged, allows the current REST concepts of resource and
representation to remain unchanged, and yet allows SW agents to obtain
the knowledge they need about resources using the very same URIs that
denote those resources, without being confused by the Web-specific
layer of representations.

The URI is the point of intersection between the Web and the Semantic
Web. A URI denotes a resource (any arbitrary resource). The Web
provides access to representations. The SW provides access to
knowledge. The extended HTTP design addresses both equally well,
without confusion or conflict, in a globally distributed,
non-centralized, scalable manner.

Problem solved. Back to the fun stuff...

Patrick

--
Patrick Stickler, Nokia/Finland, (+358 40) 801 9690,
patrick.stickler@nokia.com
Received on Tuesday, 4 February 2003 06:02:08 UTC