
Re: Modeling acronym and abbreviation Labels scenario

From: Antoine Isaac <aisaac@few.vu.nl>
Date: Sun, 24 Feb 2013 22:23:53 +0100
Message-ID: <512A84E9.5090405@few.vu.nl>
To: <public-esw-thes@w3.org>

Dear Stella,

I am not sure I see the small correction you wanted to make...

You are right about what happens if "Triple DES" and "3DES" are both represented as altLabel. And I agree it may well not matter for the scenario at hand!

As for whether 3DES is a notation or not, I have (and SKOS makes) no strong opinion about it. In fact I was considering the pattern used in the example I sent later,
http://id.loc.gov/vocabulary/cryptographicHashFunctions/sha-256
where "SHA-256" is a notation, and it seems to me quite similar to 3DES.

Cheers,

Antoine
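
For the flat term-list export described in the use case quoted below, a minimal sketch in Python (standard library only) could look like the following. It assumes the concept is wrapped in an rdf:RDF element with namespace declarations, which the snippets in this thread omit. The names term_lines and LEXICAL_PROPERTIES and the tab-separated layout are hypothetical illustrations, not part of SKOS, MADS/RDF or GATE.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumed wrapper: the thread's snippets leave out rdf:RDF and the namespaces.
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
    <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
    <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
    <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
    <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">3DES</skos:notation>
  </skos:Concept>
</rdf:RDF>
"""

# Hypothetical choice: every SKOS property whose value should match in text.
LEXICAL_PROPERTIES = ("prefLabel", "altLabel", "hiddenLabel", "notation")


def term_lines(rdf_xml):
    """Return one 'surface form<TAB>property<TAB>preferred label' line per value."""
    root = ET.fromstring(rdf_xml)
    lines = []
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        pref = concept.findtext(f"{{{SKOS}}}prefLabel", default="")
        for prop in LEXICAL_PROPERTIES:
            for element in concept.findall(f"{{{SKOS}}}{prop}"):
                lines.append(f"{element.text}\t{prop}\t{pref}")
    return lines


if __name__ == "__main__":
    print("\n".join(term_lines(RDF_XML)))
```

Under these assumptions, "3DES" ends up in the exported list whether it is recorded as skos:notation or skos:altLabel; only the value in the second column changes.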


> Dear Bradley/Antoine,
> I suspect that Antoine intended to write:
> <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
> <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
> <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
> <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
> <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">3DES</skos:notation>
> </skos:Concept>
>
> As well as the small correction in the above, I wonder why "3DES" would be handled as a notation. Bradley's message does not mention needing a notation. A notation has a different function from either abbreviations or acronyms.
> Why not treat all the respectable alternatives as altLabel, thus:
>
> <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
> <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
> <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
> <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
> <skos:altLabel xml:lang="en">3DES</skos:altLabel>
> </skos:Concept>
>
> If you do this, you fail to declare whether 3DES is an abbreviation, an acronym, a synonym, a near-synonym, a common name, a scientific name, etc., but in most applications it is unnecessary to specify what kind of non-preferred term it is. Just ignore this suggestion if you DO need to pick out those non-preferred terms that are acronyms or abbreviations (or if you cannot accept more than one ordinary non-preferred term).
> Stella
>
> *****************************************************
> Stella Dextre Clarke
> Information Consultant and Chair, ISKO UK
> Luke House, West Hendred, Wantage, OX12 8RR, UK
> Tel: 01235-833-298
> Fax: 01235-863-298
> stella@lukehouse.org
> *****************************************************
>
>
>
>
> On 24/02/2013 16:27, Antoine Isaac wrote:
>> Dear Bradley,
>>
>> First, sorry for the time it took...
>>
>> I'm actually not sure I understand the question. Are you looking for something more complex than
>>
>> <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
>> <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
>> <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
>> <skos:altLabel xml:lang="en">Triple DES</skos:prefLabel>
>> <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">3DES</skos:notation>
>>
>> ?
>> You can indeed create sub-properties of skos:prefLabel, skos:altLabel, skos:hiddenLabel and skos:notation to represent the exact "flavor" of your acronyms and abbreviations, but I'm not sure this is what you really need for simply text mining occurrences of concepts in documents.
>>
>> MADS/RDF offers finer granularity. But similarly, I'm not sure you need it...
>>
>> Best,
>>
>> Antoine
>>
>>
>>> Dear mailing list,
>>>
>>> I am trying to build a controlled vocabulary schema to be able to model something like RFC 4949 http://tools.ietf.org/html/rfc4949
>>>
>>> This controlled vocabulary has “separate” entries for the acronym, the abbreviation, each slang term/synonym, and the canonical term. There are also deprecatedLabel entries.
>>>
>>> I do not want separate entries for each acronym/abbreviation, as the MADS/RDF object properties hasAcronymVariant and hasAbbreviationVariant suggest. Instead I want everything in one canonical entry (reasons outlined in the Use Case Scenario below).
>>>
>>> For example in the RFC 4949, page 9 :
>>>
>>> prefLabel: Triple Data Encryption Algorithm
>>>
>>> hiddenLabel: Triple DEZ [I made up this slang]
>>>
>>> How would you model these 2 alternatives to the canonical label in MADS/RDF?
>>>
>>> acronym:3DES
>>>
>>> abbreviation: Triple DES
>>>
>>> Use Case Scenario
>>>
>>> We want to build a master controlled vocabulary by text mining many glossaries such as RFC 4949. So we have to be able to process these varying labels and cross references.
>>>
>>> One approach is to model RFC 4949 using MADS/RDF as the specification suggests, and then use some sort of inferencing/query to get the acronyms/abbreviations to “appear” as part of the canonical term using object properties. This leads to more term entries but makes it easy to text mine. It complicates the XSLT transformation to .txt for further text mining.
>>>
>>> An alternate approach is to make one canonical entry for all label types, for the text mining reason listed next, which would simplify the XSLT transformation from OWL to .txt.
>>>
>>> We curate the multiple glossary inputs to ensure there is only one canonical idea presented ontologically/conceptually by an SME (either curating manually to ensure syntactically different labels for the same term are matched, or using a SPARQL query to isolate duplicates, or both techniques).
>>>
>>> Then we export the master term list as a .txt with preferred label, acronyms, symbols (QUDT ontology), abbreviations, and synonyms (altLabel). This acts as an input again for GATE so that we can text mine the true corpus that describes a product to build the knowledge base for that product.
>>>
>>> Right now our glossary has over 20,000 telecommunications terms (many complex and simple labels). So the design is important so we do not have a big job correcting populated design errors.
>>>
>>> Of course I can just model owl:acronym and owl:abbreviation under the appropriate imported SKOS, SKOS-XL, and MADS/RDF data properties, but I would like to remain as close as possible to customary modeling.
>>>
>>> Any thoughts?
>>>
>>> Bradley Shoebottom
>>> Senior Information Architect – Research and Product Development
>>> Phone: (506) 674-5439 | Toll-Free: (800) 363-3358
>>> Skype: bradley.shoebottom
>>>
>>> Email: bradley.shoebottom@innovatia.net
>>>
>>
>>
>
>
Received on Sunday, 24 February 2013 21:24:25 UTC
