- From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
- Date: Wed, 25 Feb 2004 12:06:03 -0000
- To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Forwarding this to the list: feedback from Stella Dextre Clarke, who is
involved in drafting the new British Standards for thesauri.

> -----Original Message-----
> From: Stella Dextre Clarke [mailto:sdclarke@lukehouse.demon.co.uk]
> Sent: 12 January 2004 09:50
> To: Miles, AJ (Alistair)
> Cc: Leonard Will (Leonard Will); 'Alan Gilchrist'
> Subject: RE: SWAD-Europe Thesaurus activity
>
> Alistair,
> Please accept my apologies for the long delay in replying. First of
> all I was too tied up with other things; then I thought I'd wait until
> after our standards Working Group had its meeting (6 Jan) and send you
> a joint response. We did have that meeting, and the good news is, we
> made a lot of progress with all the corrections BSI has made to our
> drafts, to get them into BSI house style. We now expect the documents
> (i.e. Parts 1 and 2) to emerge in March as Drafts for Public Comment.
>
> The bad news is, Parts 1 and 2 took up most of the day and we did not
> have time for the Group to consider the SWAD papers properly. So I
> will just try to give you a few personal comments on the work in
> progress.
>
> Firstly, it is very impressive to see how much is being done - keep up
> the good work!
>
> Re the SKOS-mapping document, I liked the general approach, which has
> a lot in common with our draft of Part 4 of the standard (this is the
> Part that deals with mappings). Some matters of detail may need
> sorting out. For example, the property "mappingRelation" seems to be
> defined (or at least described) in terms of itself. In our standard,
> by the way, we differentiate between inter-term "mappings" and
> "relationships" by using the former term for relationships between
> terms in different vocabularies. (Thus all mappings are relationships,
> but we try to use the term "relationships" when they apply within one
> vocabulary and "mappings" for cross-vocabulary relationships. What we
> want to avoid is the sort of loose chatter where people talk about a
> mapping when all they mean is a USE/UF relationship inside one
> thesaurus.)
>
> I thought that specifying "more than 50%" or "less than 50%" (in a set
> of indexed resources) as the distinction between major and minor
> matches has the benefit of pragmatism (i.e. I like it in principle)
> but some problems in practice. It can only apply in the context of a
> particular indexed collection, and the benefit is that you get a
> measure of how good the mapping will be for that collection. But a
> problem arises when the collection grows, and something that matched
> for 80% of the resources initially now only matches for 30% of the
> resources. It means you have to make regular checks on all the
> major/minor matches to see if they are still valid - even though the
> concepts themselves have not changed.
>
> Re the SKOS-Core document, this seems to be setting up definitions for
> a series of terms, and I am a little concerned that the
> terms/definitions being established in your group may differ from
> those in our standard, which we hope will be adopted internationally
> (in the longer run). In some cases the definitions are compatible with
> each other; in other cases there is a real difference of usage. For
> example, I am not sure I have understood the difference made in the
> SWAD document between the property "prefLabel" and the property
> "descriptor", since the former property seems to be exactly what our
> standard means by "descriptor", i.e. the unique name by which a
> concept should be labelled. We use the term "non-descriptor" for any
> alternative (non-preferred) name for the same concept. To take one of
> the examples in the SWAD document, "Orange (fruit)" could be a
> descriptor or a non-descriptor, depending on how it is established in
> the thesaurus. Spelling this out a little, in Thesaurus A, we might
> have an entry "Orange (fruit) BT Citrus fruits", indicating that both
> of these terms are descriptors. In Thesaurus B we might have an entry
> "Orange (fruit) USE Oranges", indicating that the former term is a
> non-descriptor. It goes without saying that all the terms in a
> thesaurus, whether descriptors or non-descriptors, have to be unique.
> I was not quite clear, studying the SWAD document, whether
> "descriptor" could also be used for the things that our standard calls
> "non-descriptors" - which would be unfortunate! Sorry I have made
> rather a meal of this example, but I am just wondering how we could
> proceed so that there are no real incompatibilities between the
> terminologies used in the SWAD work, and those in the thesaurus
> standard.
>
> Incidentally, I hope to have a cleaned-up version of our definitions
> in the next few days. Would you like a copy? (The differences between
> them and those in the draft I sent you before are only cosmetic - the
> application of BSI's house style - but still, differences can cause
> problems if one is not aware of them.)
>
> Another thing that concerns me is the class "Facet". The SWAD document
> states that "a concept may be a member of only one facet". I find
> myself split on this one, because I agree that ideally, facets should
> be mutually exclusive. But in practice, many thesauri which claim to
> follow the principles of facet analysis (and this is one of the
> principles) do not always achieve the ideal. Some facets commonly used
> in thesauri include Activities, Agents, Objects, Materials, Organisms,
> Places, Times. Normally, a concept that belongs to one of these facets
> cannot belong to any of the others, because they are such
> fundamentally different things. But in practice, a few concepts can
> occur that it is convenient to assign to more than one facet. For
> example, biotechnology has allowed us to develop some special
> organisms that may be used as materials. Sometimes it is arguable
> whether a given descriptor represents an object or a material. Or a
> material such as a chemical reagent may be thought of as an agent
> (although most agents are people or organisations). You could argue
> that this problem occurs only because the facets have been badly
> chosen in the first place. But I argue (I am a pragmatist) that in the
> real thesauri one encounters in particular contexts, facets may have
> been chosen because they are useful in the given context, and not for
> their theoretical properties. Occasionally, therefore, concepts will
> crop up that have been assigned to more than one facet. What I am
> trying to warn is that, even though the ideal is still as stated
> above, practical applications have to be built in such a way that they
> will not break when the exceptions crop up.
>
> I must stop getting excited about every detail! I should address your
> question about a "standard interface for a thesaurus service". As you
> can see in the draft of Part 2 which I sent you (I hope you did
> receive it?) we do say quite a bit about the functionality required in
> the interfaces for (a) using a thesaurus for retrieval, and (b)
> maintaining a thesaurus. Is that what you mean? We do not specify
> details of the interface - just the functionality, and in quite a
> permissive way, to allow added features. As to the data exchanges that
> support the interface, formats and protocols will be specified in
> Part 5. (We have not done any work on Part 5, but we hope it will
> reflect the contents of Parts 1-4 and borrow heavily from the work
> done by teams such as your own. So the more we can align work across
> the community, the better.)
>
> On another matter, your Links page invites contributions and I
> wondered whether you would like to make reference to the GCL at
> http://www.govtalk.gov.uk/schemasstandards/gcl.asp
> That is the address of the online version. There are copies freely
> available for downloading at
> http://www.govtalk.gov.uk/schemasstandards/gcldocuments.asp
> Strictly speaking, the GCL is a taxonomy rather than a thesaurus, but
> I note that the page uses the term "thesauri" to include quite a lot
> of other vocabularies (e.g. LCSH, DDC) that are not thesauri, so I
> think you could put the GCL in with the other "thesauri".
>
> Please do keep in touch, Alistair, and let us know if you see any
> opportunities for joint action.
>
> Best wishes for 2004,
> Stella
>
> *****************************************************
> Stella Dextre Clarke
> Information Consultant
> Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
> Tel: 01235-833-298
> Fax: 01235-863-298
> SDClarke@LukeHouse.demon.co.uk
> *****************************************************
>
> -----Original Message-----
> From: Miles, AJ (Alistair) [mailto:A.J.Miles@rl.ac.uk]
> Sent: 21 November 2003 11:57
> To: Stella Dextre Clarke (E-mail)
> Subject: SWAD-Europe Thesaurus activity
>
> Hi Stella,
>
> Just to send you an update on the SWAD-Europe thesaurus work. The
> current work is all written up on the web site [1].
>
> The RDF formats for thesaurus data are maturing, and there will be
> some reports in the next month or so covering things like representing
> multilingual data, inter-thesaurus mapping and thesaurus change and
> version control. We're also looking at making interoperability between
> thesauri and web ontologies, taxonomies and other KOS happen.
>
> At the moment we're talking about doing things like defining the RDF
> semantic-relations in relation to published standards (to avoid
> ambiguities with things like 'broader') so it would be good to stay in
> touch with the development of new British standards for thesaurus
> structure.
>
> A last question: we are working on a web service API for a terminology
> service. Does your new standard cover things like a standard interface
> to a thesaurus service?
>
> Yours,
>
> Alistair.
>
> [1] SWAD-Europe Thesaurus Activity
> <http://www.w3c.rl.ac.uk/SWAD/thesaurus.html>
> [2] Semantic Web Advanced Development for Europe project
> <http://www.w3.org/2001/sw/Europe/>
>
> CCLRC - Rutherford Appleton Laboratory
> Building R1 Room 1.60
> Fermi Avenue
> Chilton
> Didcot
> Oxfordshire OX11 0QX
> United Kingdom
>
> Email: a.j.miles@rl.ac.uk
> Telephone: +44 (0)1235 445440
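
Editorial note: the point in the forwarded message about the 50% threshold for major and minor matches can be illustrated with a small sketch. It is not taken from the SKOS-mapping document or from the draft standard. The function name `match_kind`, the resource identifiers, and the reading of the threshold as "share of resources indexed under the source concept that are also indexed under the target concept" are all assumptions made for illustration. The treatment of exactly 50% as minor is also an assumption, since the message does not say.

```python
def match_kind(source_ids, target_ids):
    """Classify a mapping as 'major' or 'minor' for one indexed collection.

    Assumption: the share is the fraction of resources indexed under the
    source concept that are also indexed under the target concept.
    """
    source = set(source_ids)
    if not source:
        raise ValueError("no resources indexed under the source concept")
    share = len(source & set(target_ids)) / len(source)
    # Assumption: exactly 50% counts as minor.
    return ("major" if share > 0.5 else "minor"), share


# Hypothetical collection at the time the mapping was made:
# 4 of the 5 source-concept resources also fall under the target concept.
kind_before, share_before = match_kind({1, 2, 3, 4, 5}, {1, 2, 3, 4})

# The same two concepts after the collection has grown to 20 resources
# under the source concept, of which only 6 fall under the target concept.
kind_after, share_after = match_kind(set(range(1, 21)), {1, 2, 3, 4, 11, 12})

print(kind_before, share_before)
print(kind_after, share_after)
```

In this made-up example the concepts are unchanged, yet the classification flips from major to minor once the collection grows, which is why the message suggests that such matches would need regular re-checking.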
Received on Wednesday, 25 February 2004 07:06:07 UTC