Re: how to: ordered collection of a Concept from Stella Dextre Clarke on 2013-11-16 (public-esw-thes@w3.org from November 2013)

From: Stella Dextre Clarke <stella@lukehouse.org>
Date: Sat, 16 Nov 2013 20:36:59 +0000
To: "ZENG, MARCIA" <mzeng@kent.edu>
CC: "vladimir.alexiev@ontotext.com" <vladimir.alexiev@ontotext.com>, "public-esw-thes@w3.org" <public-esw-thes@w3.org>, "L.Will@willpowerinfo.co.uk" <L.Will@willpowerinfo.co.uk>, Joan Cobb <JCobb@getty.edu>, "PHarpring@getty.edu" <PHarpring@getty.edu>, "Garcia, Gregg" <GGarcia@getty.edu>
Message-ID: <5287D76B.9070905@lukehouse.org>
On 16/11/2013 17:25, ZENG, MARCIA wrote:
> Hi, Stella, There have been two threats going on for the same
> questions.
Hopefully they are Threads rather than Threats :-)
But yes it's hard to reply to any of it without feeling a bit lost.

> I am including all of them in this thread so they could see what you
> suggested to Vladimir.
> I summarize two sorts of issues:
>
> Issue 1. Regarding ordered siblings. As I indicated before, 'ordered
> children' is the 'ordered siblings' issue. Patricia explained
> clearly: (1) In AAT, the siblings are by default alphabetical except
> if another order is strongly warranted (e.g., due to a time-based
> orientation, in cases where it would be confusing and seem wrong to
> expert end-users if the order were alphabetical). (2) The order is
> coded in the database, so the siblings being either a) alpha or b)
> forced. (3) Gregg did the scan of those ordered siblings. They are
> spread among 194 families, total about 2000 individuals.
I don't believe there is any issue or problem about ordering. It's a 
great feature, when you have the resources to apply it.
>
> I did not follow through the final decision after we indicated that
> skos:notation does not apply in AAT's case. However I think this
> still needs to be addressed and implied correctly: In principle, AAT,
> like almost all thesauri, does not employ a notation system. The
> identifiers used by Getty Vocabs do not possess semantics or
> systematic ordering meanings. Re: Vladimir's reply "I think that [AAT]
> identifiers quite match the definition of skos:notation given in the
> SKOS Primer and SKOS Reference (they don't say a notation should be
> sortable)." November 11, 2013 12:06 PM.  Now I think the meaning of
> skos:notation is broader than the best practices in structured
> vocabularies because we always think of a notation system (where
> 'system' implies the minimum characteristics). But in terms of
> definitions, neither ISO 25964's nor SKOS's definition emphasized
> the systematic part. Maybe this could be re-visited?
I don't believe there is much problem here either. The ISO 25964 
definition of notation is supported by examples that make it pretty 
clear. Maybe the SKOS definition could be improved (but I hope to be 
lazy and leave that to someone else!) If any work is to be done, it 
should be in the context of standardizing classification schemes rather 
than thesauri.
>
> Issue 2. Regarding the node labels (and guide terms): I sent some
> suggestions last weekend, similar to yours regarding node labels and
> guide terms, after the discussions in the third thread among
> skos-iso members, especially Leonard's suggestions. I also sent the
> extracted definitions/explanations from ISO 25964-1 for some of the
> concepts discussed. My suggestions were: (1) Treat true node labels
> as node labels, keep one preferred in each language, no alternative
> label for any language. (--That was one of the questions.) (2) Some
> of the guide terms are clear concepts and AAT team is already dealing
> with them. (3) Some other guide terms represent very general
> concepts that AAT does not want to use in indexing. I consider them
> to be labels for general concepts. (This is similar to your
> suggestion, Stella, right? "One workaround might be to ignore all
> those angle brackets and treat all the guide terms as true
> concepts.")
Marcia, it's plain that Getty has a project under way for dealing with 
node labels and I don't know enough about it to comment. My remarks 
about workarounds were pretty limited, since I don't know what size of 
budget/workforce is available to overhaul the whole thesaurus. Your 
categories (1) and (2) sound straightforward enough, provided someone 
has the time/resource to sort them all out. But as for category (3) - 
general concepts - I'd prefer to look at some specific examples before 
making any suggestions. (The workaround you have quoted above is not one 
I'd really recommend, for the reason explained in my original message.)

Finally, it's great to know Patricia and her team are taking this 
project so seriously; I wish you every success in sorting it out.
Regards to All,
Stella
*****************************************************
Stella Dextre Clarke
Information Consultant and Project Leader, ISO NP25964
Luke House, West Hendred, Wantage, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
stella@lukehouse.org
*****************************************************


>
> Marcia
> ________________________________________
> From: Stella Dextre Clarke [stella@lukehouse.org]
> Sent: Saturday, November 16, 2013 10:00 AM
> To: vladimir.alexiev@ontotext.com
> Cc: public-esw-thes@w3.org; L.Will@willpowerinfo.co.uk; ZENG, MARCIA
> Subject: Re: how to: ordered collection of a Concept
>
> Dear Vladimir, In earlier correspondence I think you said there is a
> commitment to apply the ISO 25964 model to the AAT? In my opinion the
> AAT is a wonderful vocabulary with many excellent features. But there
> are some challenges when applying the standard because in some
> respects the AAT does not follow ISO25964. I will not attempt to set
> out how you could/should represent the data in RDF, but I will try to
> pinpoint some of the challenges. Mostly I'll be using ISO25964
> parlance, which differs slightly from AAT-speak. I hope we can
> overcome any confusion!
>
> Addressing your points one by one:
>
> On 15/11/2013 03:58, Vladimir Alexiev wrote:
>>> I don't know how the AAT nowadays ensures the order of siblings
>>> in an array
>>
>> There's a field sortOrder. If the values are the same, that means
>> "not ordered", and AAT displays in alphabetical order of the EN
>> label.
> Ah yes, that sounds sensible.
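
A minimal sketch of the sortOrder rule described above, assuming each sibling carries an English label and a numeric sortOrder value. The `Sibling` class, the `display_order` function, and the sample values are hypothetical illustrations, not AAT's actual schema or data. How partially tied values are handled is not stated in the thread; here ties simply keep their input order.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sibling:
    label_en: str    # English label used for the alphabetical fallback
    sort_order: int  # hypothetical stand-in for the sortOrder field


def display_order(siblings: List[Sibling]) -> List[Sibling]:
    """Return siblings in the order they would be displayed."""
    # All sortOrder values equal means "not ordered": fall back to
    # alphabetical order of the English label.
    if len({s.sort_order for s in siblings}) <= 1:
        return sorted(siblings, key=lambda s: s.label_en.casefold())
    # Otherwise the order is forced by sortOrder (stable for ties).
    return sorted(siblings, key=lambda s: s.sort_order)


# Illustrative data only: one forced (time-based) array, one unordered array.
periods = [Sibling("Iron Age", 3), Sibling("Bronze Age", 2), Sibling("Stone Age", 1)]
boats = [Sibling("tugboats", 1), Sibling("fireboats", 1), Sibling("lifeboats", 1)]

forced = [s.label_en for s in display_order(periods)]
alpha = [s.label_en for s in display_order(boats)]
```

With this rule, `forced` follows the coded sequence and `alpha` falls back to alphabetical order.
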
>>
>>> Optionally, an array may have a node label. Optionally also, it
>>> may have a superordinate concept.
>>
>> Consider these two cases that actually appear in AAT:
>>
>> 1. C1 < C2,C3: C1 (a concept) is parent of C2,C3 which are ordered
>> 2. C1 < GT1 < C2,C3: C1 is parent of GT1 (a guide term), which in
>> turn is parent of C2,C3 which are ordered
>>
>> Case 2 is clear: we represent GT1 as an Array that is ordered.
>>
>> My question is how to represent case 1, so it can be distinguished
>> from case 2. In case 1 we also need to use an Array (there's
>> nothing else that can be ordered, since a skos:OrderedCollection
>> can't be put under anything). But it's an *inferior* array: it does
>> not exist separately from C1, it is the *same* as C1. I agree with
>> Leonard's suggestion to use an Array without node label (which I
>> called *anonymous*, sorry if that caused any confusion). And we'll
>> connect that inferior array to C1 using subordinateArray. Is that
>> the best practice then?
> I'm having difficulty understanding what you mean, probably because
> you and I may be using different terminology to describe the same
> situation. For example, take the expression "parent". For some people
> "parent" means the broader concept in a BT/NT relationship; for
> others it just means up one level somehow in a visual hierarchical
> display.
>
> I'm also struggling to understand what is meant by an "inferior"
> array. Most of the thesauri I encounter do not have any node labels
> (or guide terms). When node labels are present they can help to
> articulate a hierarchical display, but do not cause the associated
> arrays to be superior or inferior. Maybe "inferior array" is another
> way of saying "subordinate array"? In that case, no problem. Whenever
> a thesaurus concept has more than one narrower concept at one level
> down, those narrower concepts form a subordinate array. (But I would
> not judge the subordinate array to be "the same as" its broader
> concept.)
>
> Would it all be clearer if we use some specific examples? I've
> concocted some in the attachment herewith, hoping they illustrate
> your Case 1 and Case 2. (And I've made it an attachment to avoid the
> indentation getting messed up by our email clients.)
>
> Please note that in my parlance, a node label is not part of an
> array, nor is it a parent of an array. It is simply a label
> associated with an array, and is conventionally shown in the line
> preceding the first term/concept in the array.
>
> Do these examples illustrate what you mean? If not, could you point
> to some real examples in the online AAT? We might need another
> example in any case, to illustrate the different situation with AAT
> guide terms that are not really node labels (because they are
> intended to show intermediate concepts in the hierarchy that are not
> recommended for use in indexing, e.g. "<emergency vessels>" ID
> 300232863).
>
> Clause 11 of ISO 25964 has more examples and explanations about node
> labels, which are useful if facet analysis is to be applied in a
> more elaborate way.
>>
>>> Implementation would proceed more comfortably, I suggest, if the
>>> treatment of arrays does not depend on existence of some kind of
>>> parent.
>>
>> I'm not sure what that means. For a thesaurus consumer (e.g.
>> implementer of a TMS or thesaurus visualization) it's important to
>> understand when to display a level. In case 1 above, he should
>> *not* display an extra level between the concepts. Which will
>> happen if we institute a practice "If an Array has no label, then
>> don't display it".
> Case 1 in the attachment shows an array with no node label. What's
> the problem?
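
The display rule under discussion (an array without a node label adds no extra level) can be sketched as follows. The `Concept` and `ThesaurusArray` classes and the indentation layout are assumptions made for illustration; they are loosely named after the ISO 25964 model but are not taken from it or from the attachment, which is not reproduced in this archive.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Concept:
    label: str
    arrays: List["ThesaurusArray"] = field(default_factory=list)


@dataclass
class ThesaurusArray:
    members: List[Concept]
    node_label: Optional[str] = None  # None means the array has no node label


def render(concept: Concept, depth: int = 0) -> List[str]:
    """Return the indented display lines for a concept and its arrays."""
    lines = ["  " * depth + concept.label]
    for array in concept.arrays:
        member_depth = depth + 1
        if array.node_label is not None:
            # A labelled array shows its node label on the line before its
            # first member; this sketch indents the members below the label.
            lines.append("  " * member_depth + "<" + array.node_label + ">")
            member_depth += 1
        # An unlabelled array contributes no line of its own.
        for member in array.members:
            lines.extend(render(member, member_depth))
    return lines


# Case 1: C1 with an unlabelled array of C2, C3.
case1 = Concept("C1", [ThesaurusArray([Concept("C2"), Concept("C3")])])
# Case 2: the same array, but carrying the guide term GT1 as node label.
case2 = Concept("C1", [ThesaurusArray([Concept("C2"), Concept("C3")], node_label="GT1")])

case1_lines = render(case1)
case2_lines = render(case2)
```

In this sketch Case 1 renders as three lines with no intermediate level, while Case 2 gains one line for the node label.
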
>> This will work fine for AAT, but if someone makes a whole tree of
>> Arrays without labels, what would that mean? Oh well, that's for
>> thesaurus consumers to worry about :-)
> Take a look at the MeSH Browser and you will find very extensive
> trees of concepts without node labels:
> <http://www.nlm.nih.gov/cgi/mesh/2013/MB_cgi>
>>
>>> Array must have at least one member concept
>>
> This is what we can see in the ISO 25964 model (see
> <http://www.niso.org/schemas/iso25964/Model_2011-06-02.jpg>)
>> Conceivably, it may have only member arrays, and the concepts may
>> come some levels further down?
> With the AAT, which displays guide terms almost as though they were
> concepts, it is possible to find arrays of guide terms only (NB a
> guide term alone is not an array). But this could be avoided if (a)
> in cases like the one of "emergency vessels" cited above, the
> concepts were recognised as such, and (b) the ISO 25964 definition of
> "hierarchical relationship" were adopted (relationship between a pair
> of concepts of which one has a scope falling completely within the
> scope of the other).
>
> As I see it part of your challenge arises from wanting to display
> guide terms as though they were concepts, and thus eligible for
> participating in hierarchical relationships. One workaround might be
> to ignore all those angle brackets and treat all the guide terms as
> true concepts. For the human reader, there is no problem interpreting
> the resultant display. (For example, in the hierarchical display for
> emergency vessels, it is easy to work out what is happening between
> watercraft and, say, fireboats.) But if a hierarchy like that is used
> for automatic inferencing, as in the Semantic Web, it would generate
> some peculiar inferences, such as: ' "watercraft by specific type" is
> a type of watercraft'.
>
> A more logical workaround would not mix up guide terms with
> concepts, but would find a way of ensuring that hierarchical
> relationships are established *only* between concepts (not between
> terms, nor between a concept and a term, nor between guide terms, nor
> between a guide term and a concept). It should still be possible to
> display the guide terms "outdented" from their associated arrays (see
> the alternative presentation of Case 2 in my attachment), but a bit
> more programming would be needed to achieve this.
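
One way to picture the constraint in the paragraph above is a check that flags any hierarchical link whose two ends are not both concepts. This is a rough sketch only: the `Node` class, the `kind` values, and the sample links are hypothetical, and the classification of each sample node is chosen for illustration rather than taken from AAT data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Node:
    label: str
    kind: str  # "concept" or "guide_term" (illustrative values)


def invalid_broader_links(links: List[Tuple[Node, Node]]) -> List[Tuple[Node, Node]]:
    """Return the (narrower, broader) pairs that involve a non-concept."""
    return [
        (narrower, broader)
        for narrower, broader in links
        if narrower.kind != "concept" or broader.kind != "concept"
    ]


watercraft = Node("watercraft", "concept")
by_type = Node("watercraft by specific type", "guide_term")
fireboats = Node("fireboats", "concept")

links = [
    (by_type, watercraft),    # would yield the peculiar "is a type of" inference
    (fireboats, watercraft),  # concept-to-concept link, acceptable
]

problems = invalid_broader_links(links)
```

Here only the link from the guide term is reported, which is the kind of pair that produces the peculiar inference mentioned earlier.
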
>>
>> ------
>>
>>> identifier "300106739" for "Iron Age" is not designed for use as
>>> a notation... the form taken by the notation system of a
>>> particular thesaurus can be highly idiosyncratic. ISO 25964
>>> ...does not make any assumptions about the way that notation will
>>> be used, either for ordering or anything else.
>>
>> If ISO does not pose constraints on notations, how did you judge
>> that "300106739" is not a notation?
> The first clue is that it looks typical of the sort of string
> commonly used for thesaurus identifiers. Confirmation comes from the
> label "ID" shown on the AAT online. For more detailed discussion,
> look at the ISO25964 definitions of notation and identifier. Even if
> you don't have a copy, you can find all the definitions freely at
> <https://www.iso.org/obp/ui/>.
>
>> I've mapped it to skos:notation
>> because it satisfies the description for notation given in the
>> SKOS Primer and SKOS Reference. Anyway: when Marcia raised this
>> issue, I recorded it as an AAT Question, and we'll resolve it a
>> bit later. If so decided, I'll turn that to dc:identifier.
> A bit of confusion is understandable, since in some systems,
> especially older ones, there is no ID separate from the notation. But
> better practice is to keep the ID separate from the notation (and the
> problem is completely removed if the thesaurus does not have any
> notation).
>
> Sorry my attempts at explanation seem rather long, but I hope the
> examples will help. Stella Dextre Clarke
>
>
> --
> *****************************************************
> Stella Dextre Clarke
> Information Consultant and Project Leader, ISO NP 25964
> Luke House, West Hendred, Wantage, OX12 8RR, UK
> Tel: 01235-833-298
> Fax: 01235-863-298
> stella@lukehouse.org
> *****************************************************
>


--
Received on Saturday, 16 November 2013 20:37:29 UTC