FW: faceted classification

From: Stella Dextre Clarke <sdclarke@lukehouse.demon.co.uk>
Date: Wed, 24 Mar 2004 10:40:17 -0000
To: <public-esw-thes@w3.org>
Message-ID: <000001c4118c$6651b160$0402a8c0@DELL>

This is a very interesting thread of conversation, but could I just
check we are all on the same track? (Because, I confess, I am getting a
bit lost.)

We started with Alistair pointing out people use the word 'facet' in
lots of different ways, and drawing attention to 2 significant ones: as
fundamental categories and as characteristics of division, respectively.
Then Leonard expanded on the difference, giving some very helpful
examples. Aida then pointed out a very important thing: there is a
difference between the simple exercise of analysing objects according to
facets, and the more complex challenge of using facet analysis to
organize knowledge. The concepts in a thesaurus vary from very simple
(e.g. 'People', 'Disabilities', 'Education') to more complex (e.g. 'Cost
benefit analysis', 'Disabled children', 'Special needs education') and a
sophisticated faceted classification will try to find a unique way of
coding combinations of complex concepts (e.g. Cost benefit analysis of
providing special needs education for disabled children). So far so
good.

Now, where I have got lost is in how these successively more complex
concepts and combinations are to be represented in XML, XFML, RDF, SKOS
etc. If the SKOS schema is to be able to represent thesaurus data, it
must be able to treat both simple (Education) and complex (Special needs
education) concepts as single concepts. It should preferably also be
able to show that each concept belongs to a particular (ideally just
one, but occasionally more than one) facet (e.g. 'People' and 'Disabled
children' belong to the facet 'agents' or possibly the facet 'living
organisms' or even 'patients', and 'Education' and 'Special needs
education' might belong to the facet 'processes'). If it was really
clever, SKOS might also be able to show that within the hierarchy of
'Education', 'Special needs education' belongs not in the array
'Education, by age group', where 'Primary education', 'Secondary
education' and 'Adult education' are to be found, but in an array named
'Education, by needs'. A still further challenge for display purposes is
to be able to present primary, secondary and adult in systematic order,
not alphabetical order. Let's stop short of representing the
combinations of complex concepts, since we've already got enough to
consider.
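
As a rough illustration of the requirements listed above, here is a
minimal sketch in Python. It is not a SKOS, RDF or XFML encoding, and
the class names Concept, Array and Node are hypothetical, chosen only
for this example. It shows simple and complex concepts both treated as
single concepts, each carrying one or more facets, and grouped into
named arrays whose member order is the systematic display order.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Concept:
    label: str
    # Ideally one facet, occasionally more than one.
    facets: List[str] = field(default_factory=list)


@dataclass
class Array:
    # The node label names the array; it is not itself a concept.
    node_label: str
    # List order is the systematic order used for display.
    members: List[Concept] = field(default_factory=list)


@dataclass
class Node:
    concept: Concept
    arrays: List[Array] = field(default_factory=list)


primary = Concept("Primary education", ["processes"])
secondary = Concept("Secondary education", ["processes"])
adult = Concept("Adult education", ["processes"])
special = Concept("Special needs education", ["processes"])

education = Node(
    Concept("Education", ["processes"]),
    [
        Array("Education, by age group", [primary, secondary, adult]),
        Array("Education, by needs", [special]),
    ],
)


def display(node: Node) -> List[str]:
    """Render a node with its arrays, keeping the stored member order."""
    lines = [node.concept.label]
    for array in node.arrays:
        lines.append(f"  <{array.node_label}>")
        lines.extend(f"  {member.label}" for member in array.members)
    return lines


print("\n".join(display(education)))
```

Because the members are held in a list rather than sorted on output,
primary, secondary and adult stay in that order instead of falling back
to alphabetical order.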

One thing that makes me uncomfortable is designating Facet as a
Property. I think of Facets as being themselves concepts or classes,
very simple and general, chosen so that a small number of them
is enough to contain every other concept in the thesaurus. The terms
that name the facets are at the tops of just a few hierarchies which can
be used to organize the whole of the thesaurus.  It is possible to have
a hierarchical relationship between People and Agents, so how can we say
Agents is a Property? Or is it true to say that all relationships are
properties, in which case the problem may just be in my imagination?

At present the SKOS schema says, "Facets provide a means of organising
concepts along orthogonal dimensions. A facet is treated as a concept. A
facet may have member concepts. A concept may be a member of only one
facet". That strikes me as fine, except that in practice it is hard to
set up useful facets that are mutually exclusive. (This point is
illustrated in the cases above. 'Patient' and 'agent' are very commonly
selected as facets, but a disabled child can be a patient when receiving
treatment and an agent when his congenital clumsiness causes accidents.
It all depends on the context of a particular document or search query.)
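
To make the point concrete, here is a small Python sketch (purely
illustrative; the function name facet_violations and the membership
table are assumptions made for this example, not part of the SKOS
schema). It flags any concept that would break a "member of only one
facet" rule.

```python
from typing import Dict, List, Set


def facet_violations(membership: Dict[str, Set[str]]) -> Dict[str, List[str]]:
    """Return the concepts assigned to more than one facet."""
    return {
        concept: sorted(facets)
        for concept, facets in membership.items()
        if len(facets) > 1
    }


# Hypothetical facet assignments taken from the examples in this message.
membership = {
    "Education": {"processes"},
    "Special needs education": {"processes"},
    "Disabled children": {"patients", "agents"},
}

violations = facet_violations(membership)
print(violations)
```

A schema that enforces single-facet membership would have to reject
'Disabled children' here, or else force a choice that only the context
of a document or query can really settle.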

Perhaps we are really all on the same track, and where I have come
unstuck is just in the formalism of how to represent things in the
various mark-up languages and schemas. In which case I fully endorse
Alistair's suggestion, "Maybe someone should sit down and write a white
paper on how to do 'faceted classification' to support web catalogue
browsing using RDF, RDFS and OWL (including some XSLT to generate RDF
from XFML)???"

As to Alistair's representation of an array, sorry, no comment yet. I
just have not understood well enough how it is going to be used. But I
look forward to hearing from everyone else.
Thanks
Stella

*****************************************************
Stella Dextre Clarke
Information Consultant
Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
SDClarke@LukeHouse.demon.co.uk
*****************************************************



-----Original Message-----
From: public-esw-thes-request@w3.org
[mailto:public-esw-thes-request@w3.org] On Behalf Of NJ Rogers, Learning
and Research Technology
Sent: 23 March 2004 12:26
To: Leonard Will; Miles, AJ (Alistair)
Cc: Douglas Tudhope (E-mail); Stella Dextre Clarke (E-mail);
'public-esw-thes@w3.org'
Subject: Re: faceted classification



Hi All

I find Leonard's reply helpful here.

Looking at

>Detergents
>      <detergents by form>
>      liquid detergents
>      gel detergents
>      powder detergents
>
>      <detergents by scent>
>      citrus scented detergents
>         lemon scented detergents
>         orange scented detergents
>      pine scented detergents
>
>      <detergents by brand name>
>      Persil detergents
>      Daz detergents
>      Surf detergents

from an rdf-world point of view, I can imagine some rdf:resource, X, and I
might want to say:

[X] type [detergent]
[X] physical state [powder]
[X] scent [citrus scents]
....

This is not N3 or anything, but you can see what I'm saying
(X has an rdf:type of detergent, its physical state is powder, etc.). And
here 'detergent', 'powder' etc. would be encoded as skos concepts for this
particular thesaurus.

But what to do about our 'facets': physical state, scent etc.? We don't want
them to be treated like skos:concepts because basically they are
rdf:properties.  Therefore they form an extensional part of skos:core, I'm
thinking, i.e. to migrate a thesaurus that includes facets in the sense of
fundamental categories, one should give these facets as properties:

<rdf:Property rdf:ID="physicalState">
	<rdfs:label>Physical State</rdfs:label>
	<rdfs:subPropertyOf rdf:resource="skos#facet"/>
	<rdfs:range rdf:resource="#Solids"/>
	<rdfs:range rdf:resource="#Gels"/>
	....
	<rdfs:comment>
	</rdfs:comment>
</rdf:Property>


where any facet is declared as a subproperty of skosfacet, say, & which is
not to be linked into OWL or rdfs.

Putting in the range resources (which should all be skos:Concepts) would
enable an application to switch views on instance data in the way Leonard
describes below. And the hierarchy of skos concepts would enable an
application to determine that concepts under 'Solids' are also in the range
of the physicalState property, and so on.
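
A minimal sketch of that last step, in Python rather than RDF, and
only as an illustration: the dictionaries NARROWER and DECLARED_RANGE
and the function effective_range are hypothetical names, and this is
application-level logic, not RDFS entailment. The hierarchy is the
'physical states' one from Leonard's message, and only the two range
values actually shown above (Solids, Gels) are declared.

```python
from typing import Dict, List, Set

# Broader concept -> narrower concepts, from Leonard's example hierarchy.
NARROWER: Dict[str, List[str]] = {
    "physical states": ["solids", "gels", "liquids"],
    "solids": ["powders"],
}

# Range concepts declared on each facet property (assumed for this sketch).
DECLARED_RANGE: Dict[str, List[str]] = {
    "physicalState": ["solids", "gels"],
}


def effective_range(prop: str) -> Set[str]:
    """Declared range concepts plus everything narrower than them."""
    seen: Set[str] = set()
    stack = list(DECLARED_RANGE.get(prop, []))
    while stack:
        concept = stack.pop()
        if concept not in seen:
            seen.add(concept)
            stack.extend(NARROWER.get(concept, []))
    return seen


print(sorted(effective_range("physicalState")))
```

Here 'powders' is picked up because it sits under 'solids', while
'liquids' is not, since it was not among the declared range values in
this sketch.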

As for the 'backwards pointer' - the ability to express for any given skos
Concept its facet - I haven't thought through whether or not that is
necessary.

Nikki







--On Monday, March 22, 2004 14:59:17 +0000 Leonard Will 
<L.Will@willpowerinfo.co.uk> wrote:

>
> In message 
> <350DC7048372D31197F200902773DF4C04944189@exchange11.rl.ac.uk> on Mon,
> 22 Mar 2004, "Miles, AJ (Alistair)" <A.J.Miles@rl.ac.uk> wrote
>> Hi Doug, Leonard, Stella,
>>
>> I just read the article [1] that Doug forwarded.  I was wondering if 
>> you could help to me to clear something up.
>>
>> I have so far come across two meanings for 'faceted classification':
>>
>> (Sense 1)
>>
>> A set of things are 'classified' according to their properties.  For 
>> example (from [1]) a set of detergents are classified by 'brand 
>> name', 'form', 'scent', 'agent', 'effect on agent' and 'special 
>> property'.  In this sense, each of these properties represents a 
>> 'facet' through which the set of instances can be viewed.
>>
>> (Sense 2)
>>
>> A set of 'concepts' are grouped according to their most primitive 
>> type. For example, the concept 'marble' is placed in the 'materials' 
>> facet. The concept 'insects' is placed in the 'organisms' facet.  In 
>> this sense, a 'facet' is essentially a primitive class, and every 
>> member of a facet group is either an instance or a sub-class of that 
>> class.
>>
>> So my first question is: have I described these two senses accurately
>> (or am I missing something)?
>
> Alistair
>
> Congratulations on recognising at first reading an ambiguity that I've
> been banging on about for years!
>
> The terminology of faceted classification is indeed not well 
> controlled. :-(
>
> Some people use the expression "subfacets" for your sense 1 and 
> "fundamental facets" for your sense 2, but I think that this is 
> confusing because they are indeed different, and not specific types of
> some more general things called "facets".
>
> The main need, as you point out, is to distinguish these two senses 
> and to use different names for them. The choice as to which should be 
> given the word "facets" is difficult, as there is warrant for either, 
> but the interpretation I have advocated, and which is in the draft 
> British Standard currently under preparation, is to use "facets" for 
> your sense 2, i.e. "fundamental facets", sometimes called "fundamental
> categories".
>
> For sense 1, an accepted terminology is to say that concepts are 
> grouped into "arrays" according to specified "characteristics of 
> division". The "characteristic of division" is shown in a node label, 
> after the word "by". These node labels are not descriptors, but 
> explain the grouping of concepts in the array that they introduce. You
> then have a hierarchy such as the following which shows three arrays:
>
> detergents
>      <detergents by form>
>      liquid detergents
>      gel detergents
>      powder detergents
>
>      <detergents by scent>
>      citrus scented detergents
>         lemon scented detergents
>         orange scented detergents
>      pine scented detergents
>
>      <detergents by brand name>
>      Persil detergents
>      Daz detergents
>      Surf detergents
>
> Note that in this case the arrays all list kinds of detergents. If you
> wanted to separate out the properties from the detergents, you would 
> have distinct hierarchies such as
>
> scents
>      citrus scents
>         lemon scents
>         orange scents
>
> physical states
>      solids
>         powders
>      gels
>      liquids
>
> and these would belong to different [fundamental] facets.
>
> The AAT puts "scents" (i.e. "odors") into an "environmental concepts"
> facet and "powders" into a "materials" facet under the node label 
> <materials by physical form>.
>
> In this case these terms would come together with detergents only when
> both descriptors were assigned to an item when that item was being 
> indexed.
>
> I hope other people agree with this description.
>
> Leonard
>
>
>
>
>
>
>
>
>
>>
>> My second question is: are there any other senses of 'faceted 
>> classification' worth considering?
>>
>> Finally: if my analysis is correct, these two senses describe quite 
>> different systems of organisation (*).  So would it be useful in the 
>> short term to come up with unambiguous names for these two meanings? 
>> For example, we could refer to sense 1 as 'classification by 
>> description' and sense 2 as 'primitive classification'.
>>
>> Please let me know what you think.
>>
>> Yours,
>>
>> Alistair.
>>
>>
>> (*) although sense 2 could be viewed as a special case of sense 1, in
>> which concepts are classified according to the value of a 'primitive 
>> type' property - i.e. 'primitive type' represents a 'facet' in sense 
>> 1!
>>
>> [1] <http://www.miskatonic.org/library/facet-web-howto.html>
>>
>> ---
>> Alistair Miles
>> Research Associate
>> CCLRC - Rutherford Appleton Laboratory
>> Building R1 Room 1.60
>> Fermi Avenue
>> Chilton
>> Didcot
>> Oxfordshire OX11 0QX
>> United Kingdom
>> Email:        a.j.miles@rl.ac.uk
>> Tel: +44 (0)1235 445440
>>
>>
>
> --
> Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
> Information Management Consultants              Tel: +44 (0)20 8372 0092
> 27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
> L.Will@Willpowerinfo.co.uk               Sheena.Will@Willpowerinfo.co.uk
> ---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------
>
>



----------------------
NJ Rogers, Technical Researcher
(Semantic Web Applications Developer)
Institute for Learning and Research Technology (ILRT)
Email:nikki.rogers@bristol.ac.uk
Tel: +44(0)117 9287096 (Direct)
Tel: +44(0)117 9287193 (Office)
Received on Wednesday, 24 March 2004 05:53:10 GMT
