Re: R: Ontolex/Lime: minutes of last meetings and some updates from Philipp Cimiano on 2014-07-18 (public-ontolex@w3.org from July 2014)

From: Philipp Cimiano <cimiano@cit-ec.uni-bielefeld.de>
Date: Fri, 18 Jul 2014 14:16:30 +0200
To: public-ontolex@w3.org
Message-ID: <53C9101E.1040408@cit-ec.uni-bielefeld.de>
Hi John, Armando, all,

  so far I have nothing to contribute to the partition discussion.

As far as I understand, my proposal for "LexicalizationSet", containing a 
set of lex triples for one language, ontology/dataset and different 
lexica, is accepted? ;-)

I think you are mainly discussing the partition property in 
ResourceCoverage, right? I have no strong opinion on what to call this 
partition property. I leave it to John and Armando to agree on something.

Concerning the discussion on the property linking a lexicon to the 
dataset/ontology/vocabulary, after all the discussion I propose to use the 
more neutral:

lexicalizationTarget, with domain LexicalizationSet (as a conceptual 
subset of the lexicon) and range void:Dataset.

As Armando rightly argues, void:Dataset subsumes ontologies, 
vocabularies, and anything else that might come to mind ;-)
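
In Turtle, the proposal could be sketched roughly as follows. This is only an illustration: the lime: namespace IRI is a placeholder, and typing the property as an owl:ObjectProperty is my assumption, not something we have agreed on.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix void: <http://rdfs.org/ns/void#> .
# Placeholder namespace (assumption): not the agreed lime IRI.
@prefix lime: <http://example.org/lime#> .

lime:LexicalizationSet a owl:Class .

# Links a lexicalization set to whatever it lexicalizes.
# The range is deliberately the general void:Dataset, which covers
# ontologies, vocabularies and plain data alike.
lime:lexicalizationTarget
    a owl:ObjectProperty ;            # assumption: object property
    rdfs:domain lime:LexicalizationSet ;
    rdfs:range  void:Dataset .
```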

To maximize reuse of our model by LOD folks, we should not insist on 
"ontology" being part of the label of that property, as it sounds more 
restrictive than it really is.

So what about lexicalizationTarget?

Talk to you later,

Philipp.

On 15.07.14 12:52, Armando Stellato wrote:
>
> Dear John,
>
>     So, summing up, in all this chaos, at least for historical
>     reasons, I would say the name ontolex is sacred :-) but when it
>     comes to a specific property, I would rather stick with using a
>     somewhat approved and shared terminology. I’m not dogmatic about
>     this either… just more convinced.
>
> Just for the elegance of it, I would like to see some kind of symmetry 
> between the properties currently called "lexicon" and 
> "targetDataset"... perhaps something like 
> "lexicalizedDataset"/"ontologicalDataset" or "semanticDataset". I 
> think targetDataset sounds a bit too bland.
>
> [Armando Stellato]
>
> Agreed, that was the original intention, to have lexicalizedDataset 
> (original name) and lexicon. It is not totally symmetric but, on the 
> other hand, it conveys the direction: you use a lexicon to lexicalize 
> something. However, it seemed too heavy to Philipp (if I remember 
> correctly). targetDataset was bland, though maybe that was what we were looking for.
>
> In principle, I like your suggestions (the same ones occurred to me), but I will 
> try to proceed by exclusion:
>
> -ontologicalDataset: that would sound even more like an ontology, because a 
> dataset can be an ontology, a vocabulary or data, and writing 
> ontologicalDataset would seem to disambiguate it by telling: “hey 
> guys, this is a vocabulary, not data!”.
>
> -semanticDataset: I thought about it too; it seemed a good compromise 
> between being sufficiently general and not being bland. However… if the 
> Lexicon is WordNet, would you say it is a dataset without 
> semantics? Is semanticDataset thus a real discriminant?
>
> -lexicalizedDataset: our original proposal, ok to come back to it if 
> it pleases the others (again, don’t recall Philipp’s preference on this)
>
> -I’m open to others, and will think about other possibilities
>
> So, for the moment, having no better name, I thought that expressing the 
> target of a lexicalization as targetDataset was… exactly, though 
> implicitly, what we want to describe.
>
> Also, we should attempt to avoid names that are the same (up to 
> capitalization) between models, and as there is already an 
> ontolex:Lexicon class, we should avoid a lime:lexicon property (to 
> avoid confusion).
>
> [Armando Stellato]
>
> Hmm… now you raise another aspect: I was against (I said that from 
> the very start) this late introduction of Lexicon in the core. 
> To me, it should be something in the metadata only (as we initially 
> depicted), intended as the same kind of proxy available in VoID for 
> datasets etc. (a subclass of Dataset, actually). There are no big 
> precedents for this in other vocabularies, except for the core 
> modeling vocabulary OWL (and only partially). OWL ontologies are declared 
> as owl:Ontology, and yes, in SKOS you have concept schemes (but mostly 
> because users wanted to have multiple schemes, and, btw, schemes are 
> the most flawed and controversial part of SKOS), but for the rest, 
> vocabularies usually do not allow datasets to declare themselves as 
> containers of some specific kind for their content (e.g. Lexicon), 
> they just… allow one to model it.
>
> This is maybe because, as you yourself said, things may easily get mixed up in 
> data; thus metadata can logically refer to partitions of the available 
> data, but there is little point in declaring it in the data.
>
>     VoID here and call the object a "partition"?
>
>     [Armando Stellato]
>
>     I wouldn’t, because in VoID you may address a partition as a whole
>     new dataset description, just addressing the fact that its content
>     is a partition of another one (in general, the main dataset being
>     described in the file). In lime, the focus is not on the partition
>     itself (we don’t add any more descriptions about it), but on how
>     that certain partition has been lexicalized.
>
> But it is still a partition of the lexicalization that we are 
> describing, right? It is the part that only refers to 
> classes/properties/etc.
>
> [Armando Stellato]
>
> Yes, it is, but again, the focus is not on generating and describing a 
> partition. Also, your last comment (“It is the part that only refers 
> to classes/properties/etc.”) makes me guess where the issue is: it does 
> not **only** refer to them. In fact, I expect that the most used 
> partition would be the whole set of entities (thus: class = 
> rdfs:Resource), as we did not foresee a different way to represent 
> “the whole”. “The whole” is simply a partition like all the others, 
> except that it includes everything.
>
> Another solution would be to address it as a dataset (a partition is 
> a dataset itself); in that case one would declare partitions of the 
> lexicalizedDataset, and then point to them through this property. The 
> only odd thing here is that the property (in this 
> coverage construct) would point back to the dataset (in the whole case), 
> whereas the dataset has already been mentioned in the main 
> LexicalizationSet construct.
>
> Here’s an example (I use here the updated LexicalizationSet name, 
> after Philipp’s suggestion):
>
> myItLex:myItalianLexicalizationOfDat
>
>   a lime:LexicalizationSet;
>
>   dc/lime:lang "it";
>
>   lime:lexicalizedDataset :dat ;
>
>   lime:lexicalModel ontolex: ;
>   lime:lexicon :italianWordnet;
>   lime:resourceCoverage [
>     lime:class owl:Class;
>     # Option 1: what should we do here to address the whole? Repeat
>     # the :dat dataset again in the case of the whole?
>     lime:partition :dat ;
>     # Option 2: or, at this point, even reuse a dataset property?
>     lime:dataset :dat ;
>     lime:percentage …;
>     lime:avgNumOfEntries …
>   ].
>
> Also, in the case of a real partition (I mean, a strictly contained subset 
> of the dataset), if you want to be really aligned with VoID, then you 
> should create a partition entity, which would complicate things. I 
> think it is not necessary, as here it is implicit that the dataset is 
> the one addressed with lime:lexicalizedDataset, outside of the 
> resourceCoverage structure.
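
For comparison, the VoID-aligned alternative mentioned here (an explicit partition entity) might look roughly like the sketch below. void:classPartition and void:class are existing VoID terms; the lime: and default namespaces are placeholders, and :datClasses is a hypothetical name introduced only for illustration.

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix void: <http://rdfs.org/ns/void#> .
# Placeholder namespaces (assumptions), for illustration only.
@prefix lime: <http://example.org/lime#> .
@prefix :     <http://example.org/data#> .

# The lexicalized dataset declares its class-based partition, VoID style.
:dat a void:Dataset ;
    void:classPartition :datClasses .

# The partition is itself a void:Dataset: the instances of owl:Class.
:datClasses a void:Dataset ;
    void:class owl:Class .

# The coverage block then points to the partition entity
# (the lime:partition variant of the example above).
:myItalianLexicalizationOfDat
    a lime:LexicalizationSet ;
    lime:lexicalizedDataset :dat ;
    lime:resourceCoverage [
        lime:partition :datClasses
    ] .
```

The extra :datClasses resource is the added complexity being weighed against the simpler lime:class shortcut.
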
>
> Cheers,
>
> Armando
>

-- 
Prof. Dr. Philipp Cimiano
AG Semantic Computing
Exzellenzcluster für Cognitive Interaction Technology (CITEC)
Universität Bielefeld

Tel: +49 521 106 12249
Fax: +49 521 106 6560
Mail: cimiano@cit-ec.uni-bielefeld.de

Office CITEC-2.307
Universitätsstr. 21-25
33615 Bielefeld, NRW
Germany
Received on Friday, 18 July 2014 12:16:59 UTC