RE: [Requirement][SKOS-Core] Arrays of concepts from Miles, AJ (Alistair) on 2004-08-09 (public-esw-thes@w3.org from August 2004)

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Mon, 9 Aug 2004 16:31:00 +0100
To: "'public-esw-thes@w3.org'" <public-esw-thes@w3.org>
Message-ID: <350DC7048372D31197F200902773DF4C05E50B97@exchange11.rl.ac.uk>
Because this is a thorny issue, I've started trying to write up a discussion
of the issues and options on the wiki at
<http://esw.w3.org/topic/SkosDev/SkosCore/CollectionsAndArrays>.

Here is a copy of the discussion section so far ...
---------------------------------------------------

2. Discussion

To represent the essential features of an 'array' in RDF there are two main
options: 'Collections' and 'Containers'.

The examples below reference the following example concepts ...

<rdf:RDF xml:base="http://example.org/"> 
 
  <skos:Concept rdf:about="A"> 
    <skos:prefLabel>armchairs</skos:prefLabel> 
  </skos:Concept>   
 
  <skos:Concept rdf:about="B"> 
    <skos:prefLabel>ax chairs</skos:prefLabel> 
  </skos:Concept>   
 
  <skos:Concept rdf:about="C"> 
    <skos:prefLabel>back stools</skos:prefLabel> 
  </skos:Concept>   
 
</rdf:RDF> 

2.1 Option A: Collections

A possible representation of an 'array' using RDF collections is below
(assuming standard namespace prefixes) ...

<rdf:RDF xml:base="http://example.org/"> 
 
  <skos:Collection> 
    <rdfs:label>chairs by form</rdfs:label> 
    <skos:members rdf:parseType="Collection"> 
      <skos:Concept rdf:about="A"/> 
      <skos:Concept rdf:about="B"/> 
      <skos:Concept rdf:about="C"/> 
   </skos:members> 
  </skos:Collection> 
 
</rdf:RDF> 
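
It may help to see what rdf:parseType="Collection" stands for in triples. The
following is a minimal Python sketch, not a parser: triples are plain tuples,
the blank node labels ("_:array", "_:l1", ...) are made up for illustration,
and skos:members is the hypothetical property from the example above.

```python
# Sketch only: triples are (subject, predicate, object) tuples.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/"

def collection_triples(owner, prop, members):
    """Return the triples linking owner to an rdf:first/rdf:rest list of members."""
    if not members:
        # An empty collection is just a link to rdf:nil.
        return [(owner, prop, RDF + "nil")]
    # One made-up blank node label per list cell.
    nodes = ["_:l%d" % (i + 1) for i in range(len(members))]
    triples = [(owner, prop, nodes[0])]
    for i, member in enumerate(members):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", member))
        triples.append((nodes[i], RDF + "rest", rest))
    return triples

triples = collection_triples("_:array", SKOS + "members",
                             [BASE + "A", BASE + "B", BASE + "C"])
for triple in triples:
    print(triple)
```

Each member costs two triples (one rdf:first, one rdf:rest), and the members
are only reachable by following the chain from the head of the list.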

2.2 Option B: Containers

A possible representation of an 'array' using RDF containers is below
(assuming standard namespace prefixes) ...

<rdf:RDF xml:base="http://example.org/"> 
 
  <rdf:Seq> 
    <rdfs:label>chairs by form</rdfs:label> 
    <rdf:li rdf:resource="A"/> 
    <rdf:li rdf:resource="B"/> 
    <rdf:li rdf:resource="C"/> 
  </rdf:Seq> 
 
</rdf:RDF> 

2.3 Pros and Cons

Collections tend to be preferred over containers for several reasons (see
e.g. this email
<http://lists.w3.org/Archives/Public/www-rdf-interest/2003Nov/0082.html> and
follow up on same thread).

(See also David Menendez's email to public-esw-thes@w3.org earlier this year
<http://lists.w3.org/Archives/Public/public-esw-thes/2004May/0081.html>)

Here follow some scenarios that might help evaluate which of these options
is the better starting point ...

AJM> RDF gurus, if I have got any of this wrong, please correct me.

2.3.1 Scenario: given an array, obtain its members using an RDF query
language (e.g. RDQL)

RDF collections are an absolute pain to query. If the length of the list is
not known, then one query has to be applied for each list member until
rdf:nil is reached. If there is network latency to factor in for each query,
the cost of reading an array grows with its length. An option to overcome
this would be to express the length of the list in an additional statement,
e.g. ...

<rdf:RDF xml:base="http://example.org/"> 
 
  <skos:Collection> 
    <rdfs:label>chairs by form</rdfs:label> 
    <skos:members rdf:parseType="Collection"> 
      <skos:Concept rdf:about="A"/> 
      <skos:Concept rdf:about="B"/> 
      <skos:Concept rdf:about="C"/> 
    </skos:members> 
    <skos:length
      rdf:datatype="http://www.w3.org/2001/XMLSchema#int">3</skos:length> 
  </skos:Collection> 
 
</rdf:RDF> 

... so with the length known, all the members of the list can be obtained in
a single RDF query. This might seem a bit silly, but it is an obvious
pragmatic solution to a tricky problem.
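
To make the difference concrete, here is a rough Python sketch under stated
assumptions: the repository is a toy in-memory list of triples, fetch_node()
stands in for one round trip to it, and the generated patterns are only
RDQL-style text (prefixes abbreviated, exact syntax may differ between
engines). The names walk, fetch_node and fixed_length_patterns are invented
for this illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
NIL = RDF + "nil"

# Toy graph for the 'chairs by form' list; "_:l1" etc. are made-up labels.
TRIPLES = [
    ("_:l1", RDF + "first", "A"), ("_:l1", RDF + "rest", "_:l2"),
    ("_:l2", RDF + "first", "B"), ("_:l2", RDF + "rest", "_:l3"),
    ("_:l3", RDF + "first", "C"), ("_:l3", RDF + "rest", NIL),
]

def fetch_node(node):
    """Stand-in for one query: return (first, rest) for a single list cell."""
    first = rest = None
    for s, p, o in TRIPLES:
        if s == node and p == RDF + "first":
            first = o
        elif s == node and p == RDF + "rest":
            rest = o
    return first, rest

def walk(head):
    """Length unknown: one round trip per list cell until rdf:nil is reached."""
    members, round_trips, node = [], 0, head
    while node != NIL:
        first, rest = fetch_node(node)
        round_trips += 1
        members.append(first)
        node = rest
    return members, round_trips

def fixed_length_patterns(length):
    """Length known: build the triple patterns for one query over the whole list."""
    patterns = ["(?array, <skos:members>, ?l1)"]
    for i in range(1, length + 1):
        patterns.append("(?l%d, <rdf:first>, ?m%d)" % (i, i))
        rest = "?l%d" % (i + 1) if i < length else "<rdf:nil>"
        patterns.append("(?l%d, <rdf:rest>, %s)" % (i, rest))
    return patterns

members, round_trips = walk("_:l1")
patterns = fixed_length_patterns(3)
```

Walking the list takes as many round trips as there are members, whereas the
fixed-length version sends all of its patterns in one query.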

RDF containers are easier to query, provided that the RDF repository has
some basic inferencing capabilities, because the container membership
super-property rdfs:member can be used. However, without any inferencing,
containers run into the same problem as collections in that the length must
be known a priori in order for the members to be obtained in a single query.
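
The inference in question is small: every container membership property
(rdf:_1, rdf:_2, ...) is a sub-property of rdfs:member. A minimal Python
sketch of that one rule over a toy triple list follows; "S1" is a made-up
name for the rdf:Seq (which is a blank node in the example above), and the
helper names are invented for this illustration.

```python
import re

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Matches rdf:_1, rdf:_2, ... (the container membership properties).
MEMBERSHIP = re.compile(re.escape(RDF) + r"_[1-9][0-9]*$")

TRIPLES = [
    ("S1", RDF + "type", RDF + "Seq"),
    ("S1", RDF + "_1", "A"),
    ("S1", RDF + "_2", "B"),
    ("S1", RDF + "_3", "C"),
]

def with_member_inference(triples):
    """Add an rdfs:member triple for every rdf:_n triple."""
    inferred = [(s, RDFS + "member", o)
                for s, p, o in triples if MEMBERSHIP.match(p)]
    return triples + inferred

def match(triples, s=None, p=None, o=None):
    """Return triples matching a single pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

graph = with_member_inference(TRIPLES)
# With inference: one pattern returns every member, whatever the length.
members = [o for _, _, o in match(graph, s="S1", p=RDFS + "member")]
# Without inference: the same pattern finds nothing.
without = match(TRIPLES, s="S1", p=RDFS + "member")
```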

2.3.2 Scenario: given a concept, obtain any arrays of which it is a member
using an RDF query language

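Where RDF collections have been used to describe arrays, this is impossible
to do with a single fixed query, because the concept can sit at any depth in
the rdf:first/rdf:rest chain. A workaround would be to add a statement about
the concept, e.g. ...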

<rdf:RDF xml:base="http://example.org/"> 
 
  <skos:Collection rdf:about="C1"> 
    <rdfs:label>chairs by form</rdfs:label> 
    <skos:members rdf:parseType="Collection"> 
      <skos:Concept rdf:about="A"/> 
      <skos:Concept rdf:about="B"/> 
      <skos:Concept rdf:about="C"/> 
    </skos:members> 
    <skos:length
      rdf:datatype="http://www.w3.org/2001/XMLSchema#int">3</skos:length> 
  </skos:Collection> 
 
  <skos:Concept rdf:about="A"> 
    <skos:inCollection rdf:resource="C1"/> 
  </skos:Concept> 
 
  <!-- ... and so on for other concepts. --> 
 
</rdf:RDF> 

The main problem with the hypothetical skos:length and skos:inCollection
properties is that they introduce logical dependencies between statements
that must be maintained by any programs modifying the structure. In other
words, conflicting statements could be accidentally introduced.
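
As an illustration of the kind of check a program modifying the data would
need, here is a small Python sketch. It assumes the list members, the
declared length and the inCollection links have already been read out of the
graph into plain Python values; check_array is an invented helper, not part
of any existing tool.

```python
def check_array(array, members, declared_length, in_collection):
    """Return a list of inconsistencies between the list and the extra statements.

    members         -- concepts found by walking the list
    declared_length -- value of the hypothetical skos:length statement
    in_collection   -- dict: concept -> set of arrays named by skos:inCollection
    """
    problems = []
    if declared_length != len(members):
        problems.append("length says %d but list has %d members"
                        % (declared_length, len(members)))
    for concept in members:
        if array not in in_collection.get(concept, set()):
            problems.append("%s is in the list but has no inCollection link"
                            % concept)
    for concept in sorted(in_collection):
        if array in in_collection[concept] and concept not in members:
            problems.append("%s has an inCollection link but is not in the list"
                            % concept)
    return problems

links = {"A": {"C1"}, "B": {"C1"}, "C": {"C1"}}

# Consistent data: nothing to report.
ok = check_array("C1", ["A", "B", "C"], 3, links)

# A member "D" was appended to the list, but the other statements were not updated.
stale = check_array("C1", ["A", "B", "C", "D"], 3, links)
```

In the second call a single edit to the list leaves two other statements out
of date, which is the dependency described above.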

Where RDF containers have been used to describe arrays, this is possible to
do via the rdfs:member property, again provided that the repository has some
inference capability. If there is no inference it is impossible, unless a
workaround such as the one suggested above is used.

---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email:        a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440



> -----Original Message-----
> From: public-esw-thes-request@w3.org
> [mailto:public-esw-thes-request@w3.org] On Behalf Of Miles, AJ (Alistair)
> Sent: 09 August 2004 15:28
> To: 'public-esw-thes@w3.org'
> Subject: [Requirement][SKOS-Core] Arrays of concepts
> 
> 
> 
> Hi all,
> 
> We made a good start on this issue earlier in the year, here is a write up
> of the specific requirement ...
> 
> [see also 
> <http://esw.w3.org/topic/SkosDev/SkosCore/CollectionsAndArrays>]
> 
> Many thesauri group small sets of concepts under what's called a 'node
> label' or 'guide term', for example this from the AAT ... 
> 
> chairs 
>    <chairs by form> 
>       armchairs 
>       ax chairs 
>       backstools 
>       Barcelona chairs 
>       barrel chairs 
>       ...  
> 
> ... or this from the English Heritage thesaurus of historic aircraft ... 
> 
> AIRCRAFT 
>      AIRCRAFT <BY FUNCTION> 
>           TEST AIRCRAFT 
>           FIGHTER 
>           BOMBER 
>           TRAINER 
>           TRANSPORTER 
>           RECONNAISSANCE 
>           TARGET 
>           ARMY COOPERATION 
>           TUG 
> 
> This type of collection of concepts is commonly called an 'array', where the
> array label identifies some 'characteristic of division' for the contents of
> that array. 
> 
> The consensus seems to be that the node label (i.e. 'chairs by form' or
> 'aircraft by function') should not be modelled as a label for a concept in
> its own right, but rather as a label for a collection of concepts. 
> 
> The matter is complicated further because in some arrays, the ordering of
> concepts is meaningful. However, in other arrays the ordering of concepts is
> not meaningful. The RDF description of an 'array' must therefore provide a
> way to distinguish between these two cases, primarily so that applications
> handling the data can know whether they should preserve the original
> ordering, or whether they are free to reorder the contents of an array by
> some criterion, for example alphabetically. 
> 
> SKOS-Core requires some framework for supporting arrays of concepts as
> described here.
> 
Received on Monday, 9 August 2004 15:31:33 UTC