RE: Ontolex/Lime: minutes of last meetings and some updates

Dear John,

 

thanks for your feedback. I will reply in detail later (I’m in a meeting now), but while reading your note about ambiguity I noticed that I forgot one important point:

 

I also mentioned the ambiguity to be resolved between Lexicalization (as a dataset) and lexicalizations (as binding triples between lexicon and ontology). Philipp made a very simple but effective suggestion: use LexicalizationSet for the dataset. I like it.

 

…and yes, your note about ambiguity was about another aspect, but it made me notice this thing I had forgotten, so I thought it was important to add it to the minutes :)

 

Be back on your email soon!

 

Armando

 

 

 

From: johnmccrae@gmail.com [mailto:johnmccrae@gmail.com] On Behalf Of John P. McCrae
Sent: Thursday, July 10, 2014 11:38 AM
To: Armando Stellato
Cc: public-ontolex
Subject: Re: Ontolex/Lime: minutes of last meetings and some updates

 

 

 

On Fri, Jul 4, 2014 at 4:42 PM, Armando Stellato <stellato@info.uniroma2.it> wrote:

Dear all,

 

since we’ll be working via email in these weeks, here are quick minutes of what was said at the last call (in particular, what is pending decision), followed by one update and…a request for opinions on open aspects:

 

As a general observation, we are at a good point. In the last call we agreed on the overall structure, and also agreed on which parts of the terminology can be improved.

 

As I did last time, instead of presenting the model, I present a small example of its use, as it is shorter to show and more intuitive to follow:

 

# inside the VoID file of the Lexicalization

myItLex:myItalianLexicalizationOfDat
  a lime:Lexicalization ;
  lime:lang "it" ;  # important to be here: this is the focus of search by agents, not the lexicon!
  lime:lexicalizedDataset :dat ;
  lime:lexicalModel ontolex: ;
  lime:lexicon :italianWordnet ;
  lime:resourceCoverage [   # see discussion later in section 5
    lime:class owl:Class ;
    lime:percentage … ;
    lime:avgNumOfEntries …
  ] .

 

We already agreed in previous calls to leave the discussion of percentages/averages vs. counts until last, so obviously these two properties:

    lime:percentage …;
    lime:avgNumOfEntries …

may also change, depending on which values they will host.

 

lime:lang: we have already agreed that it can be replaced with some ontolex:lang. Actually, the general trend is for each vocabulary to reinvent its own lang property (changing only the namespace), so as to identify its specific use. For instance, dcat has its own, with a dedicated domain and range, and we could do the same by setting the domain of lime:lang to lime:Lexicalization. Apart from that, I have no strong objection to reusing another one.

One of the specific issues here is that it would be good to have an "ontolex-all" ontology, and thus we should avoid any inter-module name clashes. Perhaps, though, the solution is to use the Dublin Core property and add appropriate axioms to the definition of Lexicalization/Lexicon (Lexicalization ⊑ ∃ dc:language.String).
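
As a rough sketch only, the axiom above could be written in Turtle as an OWL existential restriction. The lime: namespace URI below is a placeholder assumed for illustration, "String" is read here as xsd:string, and typing dc:language as a datatype property is likewise an assumption rather than anything agreed in the group:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
# ASSUMPTION: placeholder namespace, not an agreed URI
@prefix lime: <http://example.org/lime#> .

# ASSUMPTION: dc:language is treated as a datatype property here
dc:language a owl:DatatypeProperty .

# Lexicalization ⊑ ∃ dc:language.String
lime:Lexicalization rdfs:subClassOf [
    a owl:Restriction ;
    owl:onProperty dc:language ;
    owl:someValuesFrom xsd:string
] .
```

The same pattern would apply to Lexicon, which would keep a single language property across modules instead of one per namespace.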

 

lime:lexicalizedDataset: we more or less agreed on its name, provided that the term Dataset could be shown to include ontology vocabularies. In the meantime I checked on some mailing lists, and the reply from Richard Cyganiak (one of the authors of VoID) is affirmative: Dataset does include ontology vocabularies. This is his reply on the LOD ml: http://lists.w3.org/Archives/Public/public-lod/2014Jul/0012.html

Note: I think there was also a proposal (maybe from Philipp) to use targetDataset. I am not sure which one won; however, targetDataset is fine for me as well. Moreover, if we have a Lexicalization, it *almost* immediately follows that its target is the dataset to be lexicalized, so it may be even nicer to use targetDataset. The only formal objection would be that a Lexicon is a dataset too, and a lexicalization binds exactly a Lexicon and a Dataset to be lexicalized, so targetDataset would be slightly ambiguous.

My principal concern is the ambiguity, as lexicons are also datasets... The name of the group is OntoLex, so why not just use the term ontology to refer to what we are lexicalizing (even if some of the targets may not be true ontologies)?

 

lime:resourceCoverage: we agreed on its structure: it allows us to factor out all the elements of a lexicalization in a single point (the Lexicalization object) and then have multiple partitions identified by it. However, we also agreed that we may try to look for a better name :-) Suggestions?

Actually, this may depend on the final decision on percentages/averages vs. counts. The name resourceCoverage comes to my mind (though it may be changed as well) if, as in this example, we have percentages/averages. With counts, I would be even more tempted to look for something else.

Shall we not follow VoID here and call the object a "partition"? 
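
For reference, this is roughly what the VoID pattern looks like; void:classPartition, void:class and void:entities are VoID terms, while the dataset name, the default namespace and the count are made up for illustration. How a lime partition would mirror this (property names, percentages vs. counts) is still open:

```turtle
@prefix void: <http://rdfs.org/ns/void#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
# ASSUMPTION: placeholder namespace for the example dataset
@prefix :     <http://example.org/> .

:dat a void:Dataset ;
    # one partition per class; the count is an illustrative value only
    void:classPartition [
        void:class owl:Class ;
        void:entities 120
    ] .
```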

 

Oh, one last thing, which was left over from discussion: LexicalLinkSets.

I go back to an example from a previous email: suppose that I’m (implicitly) lexicalizing an ontology by writing links between LexicalConcepts of WordNet (synsets) and the resources of the ontology. We thus have links between semantic entities on both sides (Lexicon and Dataset), so this cannot be expressed through a Lexicalization object (unless we want to count the non-OWL-inferable lexical derivations of this semantic linking). We have the properties in ontolex core for that, so I assume this is relevant for our model, and then it would probably be important to state it somehow in the metadata. That’s where I suggested this lime:LexicalLinkSet as a subclass of void:Linkset.
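
A minimal sketch of how such metadata might look, reusing the standard VoID linkset properties. Note that VoID spells the class void:Linkset. The lime: namespace URI is a placeholder, the ontolex: namespace URI and the choice of ontolex:isConceptOf as link predicate are assumptions, and :wordnetToDatLinks is a hypothetical name:

```turtle
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix void:    <http://rdfs.org/ns/void#> .
# ASSUMPTION: placeholder namespace, not an agreed URI
@prefix lime:    <http://example.org/lime#> .
# ASSUMPTION: namespace URI and property name used for illustration
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix :        <http://example.org/> .

# the proposed class, as a specialization of VoID linksets
lime:LexicalLinkSet rdfs:subClassOf void:Linkset .

# hypothetical linkset between WordNet synsets and the ontology
:wordnetToDatLinks a lime:LexicalLinkSet ;
    void:subjectsTarget :italianWordnet ;   # the lexicon side (LexicalConcepts)
    void:objectsTarget  :dat ;              # the dataset being lexicalized
    void:linkPredicate  ontolex:isConceptOf .
```

An agent looking for lexical information about :dat could then find this linkset the same way it finds a Lexicalization, even though no lexicalization triples are stated directly.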

 

Think that’s all,

 

Armando

 

 

 

Received on Thursday, 10 July 2014 09:50:38 UTC