- From: Johan De Smedt <johan.de-smedt@tenforce.com>
- Date: Mon, 18 Nov 2013 07:25:18 +0100
- To: "'Stella Dextre Clarke'" <stella@lukehouse.org>, "'ZENG, MARCIA'" <mzeng@kent.edu>
- Cc: <vladimir.alexiev@ontotext.com>, <public-esw-thes@w3.org>, <L.Will@willpowerinfo.co.uk>, "'Joan Cobb'" <JCobb@getty.edu>, <PHarpring@getty.edu>, "'Garcia, Gregg'" <GGarcia@getty.edu>
Hi Marcia, Stella, Vladimir,

My understanding is that the current schema provides all that is needed concerning the discussed problems.

1) ordering.
- For ordering, "skos" provides OrderedCollections. "iso-thes" provides that a ThesaurusArray is a "skos" Collection, which can also be a "skos" OrderedCollection. This is an explicit ordering, so it has priority.
- Representation of identifiers and notation is documented in "iso-thes": they are represented by "dct" identifier and "skos" notation. Any instruction on ordering, or lack of order, is specific to the "notation" system used; an identifier, on the other hand, typically has no implied ordering. If the thesaurus system implies (by documentation) an ordering on "notation", this could be used in the absence of ("iso-thes" ThesaurusArrays that are) "skos" OrderedCollections. Such documentation cannot be in scope of either SKOS or ISO 25964. It is in scope of a particular thesaurus application. This would be the first fallback scenario for ordering, in case there are no ordered collections or these cannot be handled by a display system.
- Concepts have preferred terms per language. Typically one preferred term is required per language. In general, for multi-lingual thesauri with a high number of supported languages, a limited number of those languages is seen as "assured for having a preferred term on all concepts - whatever the state". Sorting by preferred term is the ultimate fallback.

2) node label
- The iso-thes model documents that a "skos-xl" prefLabel (with range "skos-xl" Label) can be used to represent the node label of an "iso-thes" ThesaurusArray (and ConceptGroup).
- As per ISO 25964, the prefLabel is optional and there is at most one per language.
- It is possible for any specific application to use additional labels (e.g. "skos-xl" altLabel) if needed. Documentation and semantics of such additional label usage is outside of ISO 25964 and outside of SKOS.
- By "skos-xl" rule S55, existence of a skos-xl:prefLabel implies a skos:prefLabel will exist for the same subject (occasionally a ThesaurusArray). By "skos" rule S11, existence of a skos:prefLabel also implies an rdfs:label will exist for the same subject (occasionally a ThesaurusArray).

Is my understanding that this provides sufficient tools for AAT correct?

Kind Regards,
Johan De Smedt

> -----Original Message-----
> From: Stella Dextre Clarke [mailto:stella@lukehouse.org]
> Sent: Saturday, 16 November, 2013 21:37
> To: ZENG, MARCIA
> Cc: vladimir.alexiev@ontotext.com; public-esw-thes@w3.org; L.Will@willpowerinfo.co.uk; Joan Cobb;
> PHarpring@getty.edu; Garcia, Gregg
> Subject: Re: how to: ordered collection of a Concept
>
> On 16/11/2013 17:25, ZENG, MARCIA wrote:
> > Hi, Stella, There have been two threats going on for the same
> > questions.
> Hopefully they are Threads rather than Threats :-)
> But yes it's hard to reply to any of it without feeling a bit lost.
>
> > I am including all of them in this thread so they could see what you
> > suggested to Vladimir.
> > I summarize two sorts of issues:
> >
> > Issue 1. Regarding ordered siblings. As I indicated before, 'ordered
> > children' is the 'ordered siblings' issue. Patricia explained
> > clearly: (1) In AAT, the siblings are by default alphabetical except
> > if another order is strongly warranted (e.g., due to a time-based
> > orientation, in cases where it would be confusing and seem wrong to
> > expert end-users if the order were alphabetical). (2) The order is
> > coded in the database, so the siblings being either a) alpha or b)
> > forced. (3) Gregg did the scan of those ordered siblings. They are
> > spread among 194 families, total about 2000 individuals.
> I don't believe there is any issue or problem about ordering. It's a
> great feature, when you have the resources to apply it.
> >
> > I did not follow through the final decision after we indicated that
> > skos:notation does not apply in AAT's case. However I think this
> > still needs to be addressed and implied correctly: In principle, AAT
> > does not employ a notation system, like almost all thesauri. The
> > identifiers used by Getty Vocabs do not possess semantics or
> > systematic ordering meanings. Re: Vladimir's reply "I think that [AAT]
> > identifiers quite match the definition of skos:notation given in the
> > SKOS Primer and SKOS Reference (they don't say a notation should be
> > sortable)." November 11, 2013 12:06 PM. Now I think the meaning of
> > skos:notation is broader than the best practices in structured
> > vocabularies because we always think of a notation system (where
> > 'system' implies the minimum characteristics). But in terms of
> > definitions, both ISO 25964's and SKOS definitions did not emphasize
> > on the systematic part. Maybe this could be re-visited?
> I don't believe there is much problem here either. The ISO 25964
> definition of notation is supported by examples that make it pretty
> clear. Maybe the SKOS definition could be improved (but I hope to be
> lazy and leave that to someone else!) If any work is to be done, it
> should be in the context of standardizing classification schemes rather
> than thesauri.
> >
> > Issue 2. Regarding the node labels (and guide terms) I sent some
> > suggestions last weekend, similar to yours regarding node labels and
> > guide terms, after the discussions in the third threads among
> > skos-iso members, especially Leonard's suggestions. I also sent the
> > extracted definitions/explanations from ISO 25964-1 for some of the
> > concepts discussed. My suggestions were: (1) Treat true node labels
> > as node labels, keep one preferred in each language, no alternative
> > label for any language. (--That was one of the questions.)
> > (2) Some of the guide terms are clear concepts and AAT team is already
> > dealing with them. (3) Some other guide terms are representing very
> > general concepts but AAT does not want to use in indexing. I consider
> > they are the labels for general concepts. (This is similar to your
> > suggestion, Stella, right? "One workaround might be to ignore all
> > those angle brackets and treat all the guide terms as true
> > concepts.")
> Marcia, it's plain that Getty has a project under way for dealing with
> node labels and I don't know enough about it to comment. My remarks
> about workarounds were pretty limited, since I don't know what size of
> budget/workforce is available to overhaul the whole thesaurus. Your
> categories (1) and (2) sound straightforward enough, provided someone
> has the time/resource to sort them all out. But as for category (3) -
> general concepts - I'd prefer to look at some specific examples before
> making any suggestions. (The workaround you have quoted above is not one
> I'd really recommend, for the reason explained in my original message.)
>
> Finally, it's great to know Patricia and her team are taking this
> project so seriously; I wish you every success in sorting it out.
> Regards to All,
> Stella
> *****************************************************
> Stella Dextre Clarke
> Information Consultant and Project Leader, ISO NP25964
> Luke House, West Hendred, Wantage, OX12 8RR, UK
> Tel: 01235-833-298
> Fax: 01235-863-298
> stella@lukehouse.org
> *****************************************************
> >
> >
> > Marcia
> > ________________________________________
> > From: Stella Dextre Clarke [stella@lukehouse.org]
> > Sent: Saturday, November 16, 2013 10:00 AM
> > To: vladimir.alexiev@ontotext.com
> > Cc: public-esw-thes@w3.org; L.Will@willpowerinfo.co.uk; ZENG, MARCIA
> > Subject: Re: how to: ordered collection of a Concept
> >
> > Dear Vladimir, In earlier correspondence I think you said there is a
> > commitment to apply the ISO 25964 model to the AAT? In my opinion the
> > AAT is a wonderful vocabulary with many excellent features. But there
> > are some challenges when applying the standard because in some
> > respects the AAT does not follow ISO25964. I will not attempt to set
> > out how you could/should represent the data in RDF, but I will try to
> > pinpoint some of the challenges. Mostly I'll be using ISO25964
> > parlance, which differs slightly from AAT-speak. I hope we can
> > overcome any confusion!
> >
> > Addressing your points one by one:
> >
> > On 15/11/2013 03:58, Vladimir Alexiev wrote:
> >>> I don't know how the AAT nowadays ensures the order of siblings
> >>> in an array
> >>
> >> There's a field sortOrder. If the values are the same, that means
> >> "not ordered", and AAT displays in alphabetical order of the EN
> >> label.
> > Ah yes, that sounds sensible.
> >>
> >>> Optionally, an array may have a node label. Optionally also, it
> >>> may have a superordinate concept.
> >>
> >> Consider these two cases that actually appear in AAT:
> >>
> >> 1. C1 < C2,C3: C1 (a concept) is parent of C2,C3 which are ordered
> >> 2. C1 < GT1 < C2,C3: C1 is parent of GT1 (a guide term), which in
> >> turn is parent of C2,C3 which are ordered
> >>
> >> Case 2 is clear: we represent GT1 as an Array that is ordered.
> >>
> >> My question is how to represent case 1, so it can be distinguished
> >> from case 2. In case 1 we also need to use an Array (there's
> >> nothing else that can be ordered, since a skos:OrderedCollection
> >> can't be put under anything). But it's an *inferior* array: it does
> >> not exist separately from C1, it is the *same* as C1. I agree with
> >> Leonard's suggestion to use an Array without node label (which I
> >> called *anonymous*, sorry if that caused any confusion). And we'll
> >> connect that inferior array to C1 using subordinateArray. Is that
> >> the best practice then?
> > I'm having difficulty understanding what you mean, probably because
> > you and I may be using different terminology to describe the same
> > situation. For example, take the expression "parent". For some people
> > "parent" means the broader concept in a BT/NT relationship; for
> > others it just means up one level somehow in a visual hierarchical
> > display.
> >
> > I'm also struggling to understand what is meant by an "inferior"
> > array. Most of the thesauri I encounter do not have any node labels
> > (or guide terms). When node labels are present they can help to
> > articulate a hierarchical display, but do not cause the associated
> > arrays to be superior or inferior. Maybe "inferior array" is another
> > way of saying "subordinate array"? In that case, no problem. Whenever
> > a thesaurus concept has more than one narrower concept at one level
> > down, those narrower concepts form a subordinate array. (But I would
> > not judge the subordinate array to be "the same as" its broader
> > concept.)
> >
> > Would it all be clearer if we use some specific examples? I've
> > concocted some in the attachment herewith, hoping they illustrate
> > your Case 1 and Case 2.
> > (And I've made it an attachment to avoid the
> > indentation getting messed up by our email clients.)
> >
> > Please note that in my parlance, a node label is not part of an
> > array, nor is it a parent of an array. It is simply a label
> > associated with an array, and is conventionally shown in the line
> > preceding the first term/concept in the array.
> >
> > Do these examples illustrate what you mean? If not, you could point
> > to some real examples in the online AAT? We might need another
> > example in any case, to illustrate the different situation with AAT
> > guide terms that are not really node labels (because they are
> > intended to show intermediate concepts in the hierarchy that are not
> > recommended for use in indexing. e.g. "<emergency vessels>" ID
> > 300232863)
> >
> > Clause 11 of ISO 25964 has more examples and explanations about node
> > labels, which are useful if facet analysis is to be applied in a
> > more elaborate way.
> >>
> >>> Implementation would proceed more comfortably, I suggest, if the
> >>> treatment of arrays does not depend on existence of some kind of
> >>> parent.
> >>
> >> I'm not sure what that means. For a thesaurus consumer (e.g.
> >> implementer of a TMS or thesaurus visualization) it's important to
> >> understand when to display a level. In case 1 above, he should
> >> *not* display an extra level between the concepts. Which will
> >> happen if we institute a practice "If an Array has no label, then
> >> don't display it".
> > Case 1 in the attachment shows an array with no node label. What's
> > the problem?
> >> This will work fine for AAT, but if someone makes a whole tree of
> >> Arrays without labels, what would that mean?
> >> Oh well, that's for thesaurus consumers to worry about :-)
> > Take a look at the MeSH Browser and you will find very extensive
> > trees of concepts without node
> > labels. <http://www.nlm.nih.gov/cgi/mesh/2013/MB_cgi>
> >>
> >>> Array must have at least one member concept
> >>
> > This is what we can see in the ISO 25964 model (see
> > <http://www.niso.org/schemas/iso25964/Model_2011-06-02.jpg>)
> >> Conceivably, it may have only member arrays, and the concepts may
> >> come some levels further down?
> > With the AAT, which displays guide terms almost as though they were
> > concepts, it is possible to find arrays of guide terms only (NB a
> > guide term alone is not an array). But this could be avoided if (a)
> > in cases like the one of "emergency vessels" cited above, the
> > concepts were recognised as such, and (b) the ISO 25964 definition of
> > "hierarchical relationship" were adopted (relationship between a pair
> > of concepts of which one has a scope falling completely within the
> > scope of the other).
> >
> > As I see it part of your challenge arises from wanting to display
> > guide terms as though they were concepts, and thus eligible for
> > participating in hierarchical relationships. One workaround might be
> > to ignore all those angle brackets and treat all the guide terms as
> > true concepts. For the human reader, there is no problem interpreting
> > the resultant display. (For example, in the hierarchical display for
> > emergency vessels, it is easy to work out what is happening between
> > watercraft and, say, fireboats.
> > But if a hierarchy like that is used for automatic inferencing, as
> > in the Semantic Web, it would generate some peculiar inferences,
> > such as: ' "watercraft by specific type" is a type of watercraft')
> >
> > A more logical workaround would not mix up guide terms with
> > concepts, but would find a way of ensuring that hierarchical
> > relationships are established *only* between concepts (not between
> > terms, nor between a concept and a term, nor between guide terms, nor
> > between a guide term and a concept). It should still be possible to
> > display the guide terms "outdented" from their associated arrays (see
> > the alternative presentation of Case 2 in my attachment), but a bit
> > more programming would be needed to achieve this.
> >>
> >> ------
> >>
> >>> identifier "300106739" for "Iron Age" is not designed for use as
> >>> a notation... the form taken by the notation system of a
> >>> particular thesaurus can be highly idiosyncratic. ISO 25964
> >>> ...does not make any assumptions about the way that notation will
> >>> be used, either for ordering or anything else.
> >>
> >> If ISO does not pose constraints on notations, how did you judge
> >> that "300106739" is not a notation?
> > The first clue is that it looks typical of the sort of string
> > commonly used for thesaurus identifiers. Confirmation comes from the
> > label "ID" shown on the AAT online. For more detailed discussion,
> > look at the ISO25964 definitions of notation and identifier. Even if
> > you don't have a copy, you can find all the definitions freely at
> > <https://www.iso.org/obp/ui/>.
> >
> >> I've mapped it to skos:notation
> >> because it satisfies the description for notation given in the
> >> SKOS Primer and SKOS Reference. Anyway: when Marcia raised this
> >> issue, I've recorded it as an AAT Question, and we'll resolve it a
> >> bit later. If so decided, I'll turn that to dc:identifier.
> > A bit of confusion is understandable, since in some systems,
> > especially older ones, there is no ID separate from the notation. But
> > better practice is to keep the ID separate from the notation (and the
> > problem is completely removed if the thesaurus does not have any
> > notation).
> >
> > Sorry my attempts at explanation seem rather long, but I hope the
> > examples will help. Stella Dextre Clarke
> >
> >
> > --
> > *****************************************************
> > Stella Dextre Clarke
> > Information Consultant and Project Leader, ISO NP 25964
> > Luke House, West Hendred, Wantage, OX12 8RR, UK
> > Tel: 01235-833-298
> > Fax: 01235-863-298
> > stella@lukehouse.org
> > *****************************************************
> >
> >
> --
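
Editorial illustration (not part of the original message): point 1 of Johan's reply at the top describes a three-step fallback for ordering siblings: an explicit ordered collection first, then notation if the thesaurus documents an ordering on it, and finally the preferred term in a chosen language. A minimal Python sketch of that logic follows. The `Concept` class, the `order_siblings` function, the `ex:` URIs and the notation values are all hypothetical and exist only for this example; they are not drawn from SKOS, ISO 25964 or AAT data, and the thread itself notes that AAT has no notation system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Sequence


@dataclass
class Concept:
    """Hypothetical in-memory stand-in for a skos:Concept."""
    uri: str
    pref_labels: Dict[str, str] = field(default_factory=dict)  # language -> preferred label
    notation: Optional[str] = None  # skos:notation, if the thesaurus uses one


def order_siblings(
    concepts: Sequence[Concept],
    ordered_members: Optional[Sequence[str]] = None,
    notation_defines_order: bool = False,
    lang: str = "en",
) -> List[Concept]:
    """Return sibling concepts in display order using the three-step fallback."""

    def by_label(c: Concept) -> str:
        return c.pref_labels.get(lang, "").casefold()

    # 1. An explicit ordered collection (e.g. an ordered ThesaurusArray) has priority.
    #    Concepts missing from the list are appended in label order (an assumption).
    if ordered_members:
        by_uri = {c.uri: c for c in concepts}
        listed = [by_uri[u] for u in ordered_members if u in by_uri]
        listed_uris = {c.uri for c in listed}
        rest = sorted((c for c in concepts if c.uri not in listed_uris), key=by_label)
        return listed + rest

    # 2. First fallback: notation, but only when the thesaurus documentation says
    #    that notation implies an order and every sibling actually has one.
    if notation_defines_order and all(c.notation is not None for c in concepts):
        return sorted(concepts, key=lambda c: c.notation)

    # 3. Ultimate fallback: sort by preferred term in the chosen language.
    return sorted(concepts, key=by_label)


# Illustrative siblings where a time-based order differs from the alphabetical one.
periods = [
    Concept("ex:iron", {"en": "Iron Age"}, notation="3"),
    Concept("ex:stone", {"en": "Stone Age"}, notation="1"),
    Concept("ex:bronze", {"en": "Bronze Age"}, notation="2"),
]
chronological = order_siblings(periods, ordered_members=["ex:stone", "ex:bronze", "ex:iron"])
alphabetical = order_siblings(periods)
```

Keeping the notation step behind an explicit flag reflects the point made in the reply that whether notation carries an order is a matter for the particular thesaurus application, not for SKOS or ISO 25964.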
Received on Monday, 18 November 2013 06:26:36 UTC