- From: Thomas Bandholtz <thomas@bandholtz.info>
- Date: Fri, 30 Apr 2004 22:01:18 +0200
- To: "Miles, AJ (Alistair) " <A.J.Miles@rl.ac.uk>, "'Jan Grant'" <Jan.Grant@bristol.ac.uk>
- Cc: <public-esw-thes@w3.org>, <public-esw@w3.org>
Alistair > One thing a GET request for the thesaurus URI should definitely return is a > description of that thesaurus (i.e. name, version, creators, description of > scope and content etc.) although again whether that should be machine or > human readable is open. Obviously *both* must be possible.We had megabytes of discussion on this in the Topic Map community and elsewhere. The answer is very simple - there are humans, and there are machines (software agents). A service may decide to serve only one of them, but she may decide to serve both. She may even decide to serve several machine protocols or several human readable layouts. The consequence is that a singe URL is not enough. We need pairs of protocol-URL such as HTML -> http://human.blah.org/thesaurus.html WSDL -> http://services.blah.org/thesaurus.wsdl DCMI -> http://dcmi.blah.org/thesaurus.xml etc., etc., these must be explictly *pairs* (protocol -> URL) as the domain name must not contain any significant meaning itself (see RFC URI) Alistair > The other question is, should the request for the thesaurus URI also return > the entire content of the thesaurus? Personally I think no, but again I'm > not sure about that. It never should by default!! A well established thesaurus easily counts 100.000s and more concepts! The requester (be it human or machine) must be able to identify the thesaurus source without downloading the whole thing. I my personal vision, the "whole thing" *never* will be downloaded at once: avoid redunancy, and what the hell are we doing here? --- We are establish means to *link to specific* concepts and make clear where the come from. Downloading and so duplicating a thesaurus is OK in some situations, but this should be regarded as a very special use case. Thomas
Received on Friday, 30 April 2004 16:01:42 UTC