- From: Stella Dextre Clarke <stella@lukehouse.org>
- Date: Sun, 24 Feb 2013 18:30:28 +0000
- To: Antoine Isaac <aisaac@few.vu.nl>
- CC: public-esw-thes@w3.org
Dear Bradley/Antoine,

I suspect that Antoine intended to write:

  <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
    <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
    <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
    <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
    <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">3DES</skos:notation>
  </skos:Concept>

As well as the small correction in the above, I wonder why "3DES" would
be handled as a notation. Bradley's message does not mention needing a
notation. A notation has a different function from either abbreviations
or acronyms. Why not treat all the respectable alternatives as altLabel,
thus:

  <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
    <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
    <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
    <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
    <skos:altLabel xml:lang="en">3DES</skos:altLabel>
  </skos:Concept>

If you do this, you fail to declare whether 3DES is an abbreviation or
an acronym or a synonym or a near-synonym or a common name or a
scientific name, etc., but in most applications it is unnecessary to
specify what kind of non-preferred term it is.

Just ignore this suggestion if you DO need to pick out those
non-preferred terms that are acronyms or abbreviations (or if you cannot
accept more than one ordinary non-preferred term).

Stella

*****************************************************
Stella Dextre Clarke
Information Consultant and Chair, ISKO UK
Luke House, West Hendred, Wantage, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
stella@lukehouse.org
*****************************************************

On 24/02/2013 16:27, Antoine Isaac wrote:
> Dear Bradley,
>
> First, sorry for the time it took...
>
> I'm actually not sure I understand the question.
> Are you searching for something more complex than
>
> <skos:Concept rdf:about"#:tripleDataEncryptionAlgorithm">
> <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
> <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
> <skos:altLabel xml:lang="en">Triple DES</skos:prefLabel>
> <skos:notation rdf:datatype="http://www.w3.org/2001/XMLSchema#string">3DES</skos:notation>
>
> ?
> You can indeed create sub-properties of skos:prefLabel, skos:altLabel,
> skos:hiddenLabel and skos:notation for representing the exact "flavor"
> of your acronyms and abbreviations, but I'm not sure this is what you
> really need for simple text mining of the occurrence of concepts in
> documents.
>
> MADS/RDF offers finer grain. But similarly, I'm not sure you need it...
>
> Best,
>
> Antoine
>
>
>> Dear mailing list,
>>
>> I am trying to build a controlled vocabulary schema to be able to
>> model something like RFC 4949 http://tools.ietf.org/html/rfc4949
>>
>> This controlled vocabulary has “separate” entries for the acronym,
>> abbreviation, each slang/synonym, and canonical term. There are also
>> deprecatedLabel entries.
>>
>> I do not want separate entries for each acronym/abbreviation as the
>> MADS/RDF object properties hasAcronymVariant and
>> hasAbbreviationVariant suggest. Instead I want everything in one
>> canonical entry (reasons outlined in Use Case Scenario below).
>>
>> For example, in RFC 4949, page 9:
>>
>> prefLabel: Triple Data Encryption Algorithm
>>
>> hiddenLabel: Triple DEZ [I made up this slang]
>>
>> How would you model these 2 alternatives to the canonical label in
>> MADS/RDF?
>>
>> acronym: 3DES
>>
>> abbreviation: Triple DES
>>
>> Use Case Scenario
>>
>> We want to build a master controlled vocabulary by text mining many
>> glossaries such as RFC 4949. So we have to be able to process these
>> varying labels and cross references.
>>
>> One approach is to model RFC 4949 using MADS/RDF as the specification
>> suggests, and then use some sort of inferencing/query to get the
>> acronyms/abbreviations to “appear” as part of the canonical term
>> using object properties. This leads to more term entries but makes it
>> easy to text mine. This complicates XSLT transformation to .txt for
>> further text mining.
>>
>> An alternate approach is to make one canonical entry for all label
>> types, for the text mining reason listed next, which would simplify
>> the XSLT transformation from OWL to .txt.
>>
>> We curate the multiple glossary inputs to ensure there is only one
>> canonical idea presented ontologically/conceptually by an SME (either
>> manually curate to ensure syntactically different labels for the same
>> term are matched, or SPARQL query to isolate duplicates, or both
>> techniques).
>>
>> Then we export the master term list as a .txt with preferred label,
>> acronyms, symbols (QUDT ontology), abbreviations, and synonyms
>> (altLabel). This acts as an input again for GATE so that we can text
>> mine the true corpus that describes a product to build the knowledge
>> base for that product.
>>
>> Right now our glossary has over 20,000 telecommunications terms (many
>> complex and simple labels). So the design is important so that we do
>> not have a big job correcting populated design errors.
>>
>> Of course I can just model owl:acronym and owl:abbreviation under the
>> appropriate imported SKOS, SKOS-XL, and MADS/RDF data properties, but
>> I would like to remain as close as possible to customary modeling.
>>
>> Any thoughts?
>>
>> Bradley Shoebottom
>> Senior Information Architect – Research and Product Development
>> Phone: (506) 674-5439 | Toll-Free: (800) 363-3358
>> Skype: bradley.shoebottom
>>
>> Email: bradley.shoebottom@innovatia.net
>>
>
> --
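
As a rough illustration of the single-entry approach discussed in this thread, the Python sketch below parses the altLabel-only markup suggested above and flattens each concept's labels into one line per label, roughly the kind of term list the use case describes exporting. Several things here are assumptions made only for illustration: the rdf:RDF wrapper with namespace declarations (the thread shows only the skos:Concept fragment), the function names concept_labels and to_term_lines, and the tab-separated layout. The thread does not say what .txt format GATE should receive.

```python
import xml.etree.ElementTree as ET

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The concept from the thread, wrapped in an rdf:RDF root (assumed wrapper).
RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="#:tripleDataEncryptionAlgorithm">
    <skos:prefLabel xml:lang="en">Triple Data Encryption Algorithm</skos:prefLabel>
    <skos:hiddenLabel xml:lang="en">Triple DEZ</skos:hiddenLabel>
    <skos:altLabel xml:lang="en">Triple DES</skos:altLabel>
    <skos:altLabel xml:lang="en">3DES</skos:altLabel>
  </skos:Concept>
</rdf:RDF>
"""


def concept_labels(rdf_xml, lang="en"):
    """Map each concept URI to its pref/alt/hidden labels in one language."""
    root = ET.fromstring(rdf_xml)
    result = {}
    for concept in root.iter(f"{{{SKOS}}}Concept"):
        uri = concept.get(f"{{{RDF}}}about")
        labels = {"prefLabel": [], "altLabel": [], "hiddenLabel": []}
        for kind in labels:
            for element in concept.findall(f"{{{SKOS}}}{kind}"):
                if element.get(XML_LANG) == lang and element.text:
                    labels[kind].append(element.text.strip())
        result[uri] = labels
    return result


def to_term_lines(concepts):
    """Flatten to 'label<TAB>kind<TAB>concept' lines (hypothetical layout)."""
    lines = []
    for uri, labels in concepts.items():
        for kind, values in labels.items():
            for value in values:
                lines.append(f"{value}\t{kind}\t{uri}")
    return lines


if __name__ == "__main__":
    for line in to_term_lines(concept_labels(RDF_XML)):
        print(line)
```

Because every label hangs off one concept, a text-mining step can match any of the four strings and resolve it to the same entry. What this flattening cannot recover is whether "3DES" is an acronym or an abbreviation, which is exactly the trade-off noted above.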
Received on Sunday, 24 February 2013 18:31:02 UTC