Re: Modeling acronym and abbreviation Labels scenario

On 24/02/2013 21:23, Antoine Isaac wrote:
> Dear Stella,
>
> I am not sure I see the small correction you wanted to make...
Your version had "</skos:prefLabel>" where you intended 
"</skos:altLabel>" - just a slip of the keyboard.
>
> You are right about what happens if "Triple DES" and "3DES" are both 
> represented as altLabel. And I agree it may well not matter for 
> the scenario at hand!
>
> As for whether 3DES is a notation or not, I have (and SKOS makes) no 
> strong opinion about it. In fact I was considering the pattern used in 
> the example I sent later
> http://id.loc.gov/vocabulary/cryptographicHashFunctions/sha-256
> where "SHA-256" is a notation, and it seems to me quite similar to 3DES.
I don't know enough about either of these use cases to advise. And my 
doubts come not from requirements in SKOS, but from expectations in the 
world of KOS users about the typical function of a notation.

In ISO 25964 [1], notation is defined as a "set of symbols representing 
a concept or class in a structured vocabulary, especially a 
classification scheme." There are also some examples and a Note: 
"Notation is sometimes used to sort and/or locate concepts in a 
pre-determined systematic order and, optionally, to display how the 
components of complex concepts have been structured and grouped. A 
notation can provide the link between alphabetical and systematic lists 
in a thesaurus."

If all or most of the concepts in their respective vocabularies have a 
notation following the same pattern, so that the notations can be used 
to present the concepts in a systematic order, it would seem appropriate 
to represent codes such as "3DES" and "SHA-256" as notations. Otherwise, 
it's unorthodox, perhaps even misleading.
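
For the text-mining scenario, the point that the choice "may well not matter" can be illustrated with a minimal sketch. It assumes the RDF/XML fragment from this thread is wrapped in an rdf:RDF element with the usual namespace declarations (the fragments quoted below omit them), and it uses only Python's standard library. The names SAMPLE, LABEL_PROPERTIES and term_lines are hypothetical, chosen for illustration only; a real export would more likely go through an RDF toolkit or the XSLT step Bradley describes.

```python
import xml.etree.ElementTree as ET

SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Hypothetical sample: the altLabel-only modelling discussed in this thread,
# wrapped in rdf:RDF with namespaces declared so that it parses.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
    <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
    <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
    <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
    <skos:altLabel xml:lang="en">3DES</skos:altLabel>
  </skos:Concept>
</rdf:RDF>"""

# Properties whose literal values are collected, in output order.
LABEL_PROPERTIES = ("prefLabel", "altLabel", "hiddenLabel", "notation")


def term_lines(rdf_xml):
    """Return one tab-separated line of label strings per skos:Concept."""
    root = ET.fromstring(rdf_xml)
    lines = []
    for concept in root.iter(f"{{{SKOS_NS}}}Concept"):
        terms = []
        for prop in LABEL_PROPERTIES:
            for element in concept.findall(f"{{{SKOS_NS}}}{prop}"):
                # Collapse whitespace introduced by line wrapping.
                text = " ".join((element.text or "").split())
                if text and text not in terms:
                    terms.append(text)
        if terms:
            lines.append("\t".join(terms))
    return lines


if __name__ == "__main__":
    for line in term_lines(SAMPLE):
        print(line)
```

Under these assumptions the flat term list contains the same four strings whether "3DES" is recorded as skos:altLabel or as skos:notation; only their order in the line changes. The distinction matters for how KOS users read the vocabulary, not for this kind of export.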
All the best
Stella
[1] more info at http://www.niso.org/schemas/iso25964/
>
> Cheers,
>
> Antoine
>
>
>> Dear Bradley/Antoine,
>> I suspect that Antoine intended to write:
>> <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
>> <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
>> <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
>> <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
>> <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">3DES</skos:notation>
>> </skos:Concept>
>>
>> As well as the small correction in the above, I wonder why "3DES" 
>> would be handled as a notation. Bradley's message does not mention 
>> needing a notation. A notation has a different function from either 
>> abbreviations or acronyms.
>> Why not treat all the respectable alternatives as altLabel, thus:
>>
>> <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
>> <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
>> <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
>> <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
>> <skos:altLabel xml:lang="en">3DES</skos:altLabel>
>> </skos:Concept>
>>
>> If you do this, you fail to declare whether 3DES is an abbreviation, 
>> an acronym, a synonym, a near-synonym, a common name, a scientific 
>> name, etc., but in most applications it is unnecessary to specify 
>> what kind of non-preferred term it is. Just ignore this suggestion 
>> if you DO need to pick out those non-preferred terms that are 
>> acronyms or abbreviations (or if you cannot accept more than one 
>> ordinary non-preferred term).
>> Stella
>>
>> *****************************************************
>> Stella Dextre Clarke
>> Information Consultant and Chair, ISKO UK
>> Luke House, West Hendred, Wantage, OX12 8RR, UK
>> Tel: 01235-833-298
>> Fax: 01235-863-298
>> stella@lukehouse.org
>> *****************************************************
>>
>>
>>
>>
>> On 24/02/2013 16:27, Antoine Isaac wrote:
>>> Dear Bradley,
>>>
>>> First sorry for the time it took...
>>>
>>> I'm actually not sure I understand the question. Are you looking 
>>> for something more complex than
>>>
>>> <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
>>> <skos:prefLabel xml:lang="en">Triple Data Encryption 
>>> Algorithm</skos:prefLabel>
>>> <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
>>> <skos:altLabel xml:lang="en">Triple DES</skos:prefLabel>
>>> <skos:notation 
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">3DES</skos:notation>
>>>
>>> ?
>>> You can indeed create sub-properties of skos:prefLabel, 
>>> skos:altLabel, skos:hiddenLabel and skos:notation to represent 
>>> the exact "flavor" of your acronyms and abbreviations, but I'm not 
>>> sure this is what you really need for simple text mining of the 
>>> occurrences of concepts in documents.
>>>
>>> MADS/RDF offers finer grain. But similarly, I'm not sure you need it...
>>>
>>> Best,
>>>
>>> Antoine
>>>
>>>
>>>> Dear mailing list,
>>>>
>>>> I am trying to build a controlled vocabulary schema to be able to 
>>>> model something like RFC 4949 http://tools.ietf.org/html/rfc4949
>>>>
>>>> This controlled vocabulary has “separate” entries for the acronym, 
>>>> abbreviation, each slang/synonym, and canonical term. There is 
>>>> also a deprecatedLabel.
>>>>
>>>> I do not want separate entries for each acronym/abbreviation, as the 
>>>> MADS/RDF object properties hasAcronymVariant and 
>>>> hasAbbreviationVariant suggest. Instead I want everything in one 
>>>> canonical entry (reasons outlined in the Use Case Scenario below).
>>>>
>>>> For example in the RFC 4949, page 9 :
>>>>
>>>> prefLabel: Triple Data Encryption Algorithm
>>>>
>>>> hiddenLabel: Triple DEZ [I made up this slang]
>>>>
>>>> How would you model these 2 alternatives to the canonical label in 
>>>> MADS/RDF?
>>>>
>>>> acronym:3DES
>>>>
>>>> abbreviation: Triple DES
>>>>
>>>> Use Case Scenario
>>>>
>>>> We want to build a master controlled vocabulary by text mining many 
>>>> glossaries such as RFC 4949. So we have to be able to process these 
>>>> varying labels and cross references.
>>>>
>>>> One approach is to model RFC 4949 using MADS/RDF as the 
>>>> specification suggests, and then use some sort of 
>>>> inferencing/query to get the acronyms/abbreviations to “appear” as 
>>>> part of the canonical term using object properties. This leads to 
>>>> more term entries but makes it easy to text mine. However, it 
>>>> complicates the XSLT transformation to .txt for further text mining.
>>>>
>>>> An alternative approach is to make one canonical entry for all label 
>>>> types, for the text mining reason listed next, which would simplify 
>>>> the XSLT transformation from OWL to .txt
>>>>
>>>> We curate the multiple glossary inputs to ensure there is only one 
>>>> canonical idea presented ontologically/conceptually by a SME 
>>>> (either manually curate to ensure syntactically different labels 
>>>> for the same term are matched or SPARQL query to isolate duplicates 
>>>> or both techniques).
>>>>
>>>> Then we export the master term list as a .txt with preferred label, 
>>>> acronyms, symbols (QUDT ontology), abbreviations, and synonyms 
>>>> (altLabel). This acts as an input again for GATE so that we can 
>>>> text mine the true corpus that describes a product to build the 
>>>> knowledge base for that product.
>>>>
>>>> Right now our glossary has over 20,000 telecommunications terms 
>>>> (many complex and simple labels). So the design is important so we 
>>>> do not have a big job correcting populated design errors.
>>>>
>>>> Of course I can just model owl:acronym and owl:abbreviation under 
>>>> the appropriate imported SKOS, SKOS-XL, and MADS/RDF data 
>>>> properties, but I would like to remain as close as possible to 
>>>> customary modeling.
>>>>
>>>> Any thoughts?
>>>>
>>>> Bradley Shoebottom
>>>> Senior Information Architect – Research and Product Development
>>>> Phone: (506) 674-5439 | Toll-Free: (800) 363-3358
>>>> Skype: bradley.shoebottom
>>>>
>>>> Email: bradley.shoebottom@innovatia.net
>>>>
>>>
>>>
>>
>>
>
>


-- 
*****************************************************
Stella Dextre Clarke
Information Consultant and Chair, ISKO UK
Luke House, West Hendred, Wantage, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
stella@lukehouse.org
*****************************************************

Received on Monday, 25 February 2013 09:24:06 UTC