- From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
- Date: Mon, 9 Aug 2004 16:31:00 +0100
- To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Because this is a thorny issue, I've started trying to write up a discussion of the issues and options on the wiki at <http://esw.w3.org/topic/SkosDev/SkosCore/CollectionsAndArrays>.

Here is a copy of the discussion section so far ...

---------------------------------------------------

2. Discussion

To represent the essential features of an 'array' in RDF there are two main options: 'Collections' and 'Containers'.

The examples below reference the following example concepts ...

<rdf:RDF xml:base="http://example.org/">
  <skos:Concept rdf:about="A">
    <skos:prefLabel>armchairs</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="B">
    <skos:prefLabel>ax chairs</skos:prefLabel>
  </skos:Concept>
  <skos:Concept rdf:about="C">
    <skos:prefLabel>back stools</skos:prefLabel>
  </skos:Concept>
</rdf:RDF>

2.1 Option A: Collections

A possible representation of an 'array' using RDF collections is below (assuming standard namespace prefixes) ...

<rdf:RDF xml:base="http://example.org/">
  <skos:Collection>
    <rdfs:label>chairs by form</rdfs:label>
    <skos:members rdf:parseType="Collection">
      <skos:Concept rdf:about="A"/>
      <skos:Concept rdf:about="B"/>
      <skos:Concept rdf:about="C"/>
    </skos:members>
  </skos:Collection>
</rdf:RDF>

2.2 Option B: Containers

A possible representation of an 'array' using RDF containers is below (assuming standard namespace prefixes) ...

<rdf:RDF xml:base="http://example.org/">
  <rdf:Seq>
    <rdfs:label>chairs by form</rdfs:label>
    <rdf:li rdf:resource="A"/>
    <rdf:li rdf:resource="B"/>
    <rdf:li rdf:resource="C"/>
    <rdf:li rdf:resource="D"/>
  </rdf:Seq>
</rdf:RDF>

2.3 Pros and Cons

Collections tend to be preferred over containers for several reasons (see e.g. this email <http://lists.w3.org/Archives/Public/www-rdf-interest/2003Nov/0082.html> and the follow-up on the same thread).
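As a rough illustration of how the two options differ at the triple level, here is a minimal Python sketch. It assumes the usual RDF/XML expansions: rdf:parseType="Collection" becomes a chain of rdf:first/rdf:rest statements ending in rdf:nil, and rdf:li becomes the numbered properties rdf:_1, rdf:_2, and so on. The helper names, the blank node labels (_:cell0, _:seq) and the plain-tuple "graph" are all hypothetical and only for illustration; typing and label triples are left out.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/"

def collection_triples(head, members):
    """Expand a non-empty list into rdf:first/rdf:rest triples ending in rdf:nil."""
    triples = []
    node = head
    for i, member in enumerate(members):
        # The last cell points at rdf:nil; other cells point at a fresh blank node.
        nxt = RDF + "nil" if i == len(members) - 1 else f"_:cell{i + 1}"
        triples.append((node, RDF + "first", member))
        triples.append((node, RDF + "rest", nxt))
        node = nxt
    return triples

def seq_triples(seq, members):
    """Expand container members into numbered triples rdf:_1, rdf:_2, ..."""
    return [(seq, f"{RDF}_{i}", m) for i, m in enumerate(members, start=1)]

def walk_collection(triples, head):
    """Read list members by following rdf:rest, one step per cell, until rdf:nil."""
    index = {(s, p): o for s, p, o in triples}
    members, node = [], head
    while node != RDF + "nil":
        members.append(index[(node, RDF + "first")])
        node = index[(node, RDF + "rest")]
    return members

def seq_members(triples, seq):
    """Read container members by probing rdf:_1, rdf:_2, ... until one is missing."""
    index = {(s, p): o for s, p, o in triples}
    members, i = [], 1
    while (seq, f"{RDF}_{i}") in index:
        members.append(index[(seq, f"{RDF}_{i}")])
        i += 1
    return members

concepts = [EX + "A", EX + "B", EX + "C"]
list_graph = collection_triples("_:cell0", concepts)
seq_graph = seq_triples("_:seq", concepts)
```

In this sketch the collection costs two statements per member and has to be read by walking the chain, while the container costs one statement per member but gives no explicit marker for where the sequence ends.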
(See also David Menedez's email to public-esw-thes@w3.org earlier this year <http://lists.w3.org/Archives/Public/public-esw-thes/2004May/0081.html>)

Here follow some scenarios that might help evaluate which of these options is the better starting point ...

AJM> RDF gurus, if I have got any of this wrong, please correct me.

2.3.1 Scenario: given an array, obtain its members using an RDF query language (e.g. RDQL)

RDF collections are an absolute pain to query. If the length of the list is not known, then one query has to be applied for each of the list members until rdf:nil is met. If there is network latency to factor in for each query, there are obvious practical implications.

An option to overcome this would be to express the length of the list in an additional statement, e.g. ...

<rdf:RDF xml:base="http://example.org/">
  <skos:Collection>
    <rdfs:label>chairs by form</rdfs:label>
    <skos:members rdf:parseType="Collection">
      <skos:Concept rdf:about="A"/>
      <skos:Concept rdf:about="B"/>
      <skos:Concept rdf:about="C"/>
    </skos:members>
    <skos:length rdf:datatype="http://www.w3.org/2001/XMLSchema#int">3</skos:length>
  </skos:Collection>
</rdf:RDF>

... so with the length known, all the members of the list can be obtained in a single RDF query. This might seem a bit silly, but it is an obvious pragmatic solution to a tricky problem.

RDF containers are easier to query, provided that the RDF repository has some basic inferencing capabilities, because the container membership super-property rdfs:member can be used. However, without any inferencing, containers run into the same problem as collections, in that the length must be known a priori in order for the members to be obtained in a single query.

2.3.2 Scenario: given a concept, obtain any arrays of which it is a member using an RDF query language

Where RDF collections have been used to describe arrays, this is impossible to do. A workaround would be to add a statement about the concept, e.g. ...

<rdf:RDF xml:base="http://example.org/">
  <skos:Collection rdf:about="C1">
    <rdfs:label>chairs by form</rdfs:label>
    <skos:members rdf:parseType="Collection">
      <skos:Concept rdf:about="A"/>
      <skos:Concept rdf:about="B"/>
      <skos:Concept rdf:about="C"/>
    </skos:members>
    <skos:length rdf:datatype="http://www.w3.org/2001/XMLSchema#int">3</skos:length>
  </skos:Collection>
  <skos:Concept rdf:about="A">
    <skos:inCollection rdf:resource="C1"/>
  </skos:Concept>
  <!-- ... and so on for other concepts. -->
</rdf:RDF>

The main problem with the hypothetical skos:length and skos:inCollection properties is that they introduce logical dependencies between statements that must be maintained by any programs modifying the structure. In other words, conflicting statements could be accidentally introduced.

Where RDF containers have been used to describe arrays, this is possible to do via rdfs:member, again provided that the repository has some inference capability. If there is no inference it is impossible, unless a workaround such as the one suggested above is used.

---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440

> -----Original Message-----
> From: public-esw-thes-request@w3.org
> [mailto:public-esw-thes-request@w3.org]On Behalf Of Miles, AJ
> (Alistair)
> Sent: 09 August 2004 15:28
> To: 'public-esw-thes@w3.org'
> Subject: [Requirement][SKOS-Core] Arrays of concepts
>
> Hi all,
>
> We made a good start on this issue earlier in the year, here is a write up
> of the specific requirement ...
>
> [see also <http://esw.w3.org/topic/SkosDev/SkosCore/CollectionsAndArrays>]
>
> Many thesauri group small sets of concepts under what's called a 'node
> label' or 'guide term', for example this from the AAT ...
>
> chairs
>   <chairs by form>
>     armchairs
>     ax chairs
>     backstools
>     Barcelona chairs
>     barrel chairs
>     ...
>
> ...
> or this from the English Heritage thesaurus of historic aircraft ...
>
> AIRCRAFT
>   AIRCRAFT <BY FUNCTION>
>     TEST AIRCRAFT
>     FIGHTER
>     BOMBER
>     TRAINER
>     TRANSPORTER
>     RECONNAISSANCE
>     TARGET
>     ARMY COOPERATION
>     TUG
>
> This type of collection of concepts is commonly called an 'array', where the
> array label identifies some 'characteristic of division' for the contents of
> that array.
>
> The consensus seems to be that the node label (i.e. 'chairs by form' or
> 'aircraft by function') should not be modelled as a label for a concept in
> its own right, but rather as a label for a collection of concepts.
>
> The matter is complicated further because in some arrays, the ordering of
> concepts is meaningful. However, in other arrays the ordering of concepts is
> not meaningful. The RDF description of an 'array' must therefore provide a
> way to distinguish between these two cases, primarily so that applications
> handling the data can know whether they should preserve the original
> ordering, or whether they are free to reorder the contents of an array by
> some criterion, for example alphabetically.
>
> SKOS-Core requires some framework for supporting arrays of concepts as
> described here.
>
> ---
> Alistair Miles
> Research Associate
> CCLRC - Rutherford Appleton Laboratory
> Building R1 Room 1.60
> Fermi Avenue
> Chilton
> Didcot
> Oxfordshire OX11 0QX
> United Kingdom
> Email: a.j.miles@rl.ac.uk
> Tel: +44 (0)1235 445440

Received on Monday, 9 August 2004 15:31:33 UTC