Re: LIME Final Model

OK, one more thing that I think I have not made clear yet. The motivation
for this is that it makes it easier to understand that all properties that
can be stated about a Lexicalization can also be stated about a
LexicalizationCoverage. If one is a subset of the other this is more
obvious and uses one axiom to express what otherwise requires many axioms.
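
A minimal sketch of one possible reading of that single axiom, assuming the
lime:coverage ⊑ void:subset formulation quoted further down. The lime
namespace IRI and the names :myLexicalizationSet and :myCoverage are
assumptions made for illustration only.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .   # assumed namespace IRI
@prefix :     <http://example.org/> .                # hypothetical data

# The one axiom: every coverage is a VoID subset of its lexicalization.
lime:coverage rdfs:subPropertyOf void:subset .

# Hypothetical instance data.
:myLexicalizationSet a lime:LexicalizationSet ;
    lime:coverage :myCoverage .

# Under RDFS entailment this would also give:
#   :myLexicalizationSet void:subset :myCoverage .
#   :myCoverage a void:Dataset .    (void:subset has range void:Dataset)
```

With this in place, anything that can be said of a void:Dataset can be said
of the coverage without further axioms.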

For the language question, we agreed on dcterms:language:
http://www.w3.org/2014/10/17-ontolex-minutes.html
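
For illustration only, a language expressed as a resource with
dcterms:language might look like the following. The Lexvo IRI, the lime
namespace IRI and :myLexicalizationSet are assumptions, not taken from the
minutes.

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix lime:    <http://www.w3.org/ns/lemon/lime#> .   # assumed namespace IRI
@prefix :        <http://example.org/> .                # hypothetical data

# The language is given as a resource (an IRI), not as a literal tag.
:myLexicalizationSet a lime:LexicalizationSet ;
    dcterms:language <http://lexvo.org/id/iso639-3/eng> .
```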

Regards,
John

On Fri, Jan 23, 2015 at 4:50 PM, Manuel Fiorelli <manuel.fiorelli@gmail.com>
wrote:

> Dear John, All
>
> see my answers below.
>
> 2015-01-23 15:48 GMT+01:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de
> >:
>
>>
>>
>> On Fri, Jan 23, 2015 at 3:17 PM, Manuel Fiorelli <
>> manuel.fiorelli@gmail.com> wrote:
>>
>>> Dear John, All
>>>
>>> see my answer below.
>>>
>>> 2015-01-23 14:59 GMT+01:00 John P. McCrae <
>>> jmccrae@cit-ec.uni-bielefeld.de>:
>>>
>>>>
>>>> On Fri, Jan 23, 2015 at 2:50 PM, Manuel Fiorelli <
>>>> manuel.fiorelli@gmail.com> wrote:
>>>>
>>>> *7. Properties avgNumOfLexicalization, percentage, lexicalizations no
>>>> longer on Lexicalization*
>>>>>
>>>>> This is something that (if I remember correctly) was still under
>>>>> discussion. However, in the attached document I was open to the possibility
>>>>> of including these properties in the LexicalizationSet.
>>>>>
>>>>> The change you propose would dramatically change the semantics of the
>>>>> model. Currently, a coverage is only a container of statistics. With your
>>>>> change in place, a coverage would be a dataset, which contains (I presume)
>>>>> the lexicalization triples.
>>>>>
>>>> OK, I think the important thing is that properties such as
>>>> lexicalizations can be added to the Lexicalization; it didn't look like
>>>> that from the diagram.
>>>>
>>>> As for changing the semantics, I disagree. The lexicalization is not
>>>> truly a 'dataset' in most cases, as it may instead be published as part
>>>> of a lexicon (or even part of an ontology). Instead, it is a dataset in the
>>>> sense that it is some set of triples, in this case the triples linking an
>>>> ontology to a lexicon. Thus, for me, a resource coverage is also a dataset,
>>>> namely the set of triples linking a lexicon to a selection of the
>>>> ontology's entities by type.
>>>>
>>>
>>> In the model, we have the following axiom
>>>
>>> lime:LexicalizationSet rdfs:subClassOf void:Dataset
>>>
>>> therefore, each lexicalizationSet is a dataset, in the sense of being a
>>> set of triples, i.e. representing the association between ontology entities
>>> and lexical entries.
>>>
>>> As you argue, it may be a subset of another dataset. On this last point,
>>> maybe we were a bit ambiguous in previous telcos/emails. Suppose that I
>>> want to distribute an ontolex:Lexicon together with a
>>> lime:LexicalizationSet, what is the appropriate structure of the data?
>>>
>>> a)
>>>
>>>
>>> *The lexicon also contains the triples related to the lexicalizationSet*
>>> :myLexicon a ontolex:Lexicon .
>>> :myLexicon void:subset :myLexicalizationSet .
>>>
>>> :myLexicalizationSet a lime:LexicalizationSet.
>>>
>>> b)
>>>
>>> *The lexicon does not contain the triples related to the lexicalization;
>>> instead, both the lexicon and the lexicalizationSet are part of a larger
>>> dataset.*
>>>
>>> :myDataset a void:Dataset .
>>> :myDataset void:subset :myLexicon .
>>> :myDataset void:subset :myLexicalizationSet .
>>>
>>> :myLexicon a ontolex:Lexicon .
>>> :myLexicalizationSet a lime:LexicalizationSet.
>>>
>>>
>>> I thought that we agreed on the solution b), in order to completely
>>> remove "semantic" information from the lexicon. What is your position?
>>>
>> I think both solutions are in principle fine but would also prefer (b)...
>> I'm not quite sure about the relevance here. By 'true dataset' I mean a
>> collection of triples grouped together and made available as a single
>> download. The semantics of VoID are much weaker, making parts of a single
>> download a dataset as well (although the definition
>> <http://vocab.deri.ie/void#Dataset> of void:Dataset seems to be a 'true
>> dataset').
>>
>
> I asked because you wrote "The lexicalization is not truly a 'dataset' in
> most cases, as it may instead be published as part of a lexicon", thus
> making me think you were assuming solution a).
>
> The following example from the spec clearly allows defining a (sub)set
> only for the purpose of providing metadata:
>
> :DBpedia a void:Dataset;
>     void:classPartition [
>         void:class foaf:Person;
>         void:entities 312000;
>     ];
>     void:propertyPartition [
>         void:property foaf:name;
>         void:triples 312000;
>     ];
>     .
>
>
>
>>
>> For example, VoID's classPartition property, which for me is closely
>> related to lime:coverage, is a subproperty of void:subset, and hence any
>> class partition is a void:Dataset. By the same principle I would say
>> that the range of lime:coverage is also void:Dataset, as it is also a
>> partition of the lexicalization. We could even go further and claim
>> lime:coverage ⊑ void:subset!
>>
>> See:
>> http://www.w3.org/TR/void/#class-property-partitions
>> http://vocab.deri.ie/void#classPartition
>>
>>
> I see your point. You are suggesting that:
>
> *LexicalizationSet* is the dataset containing all the triples related to
> lexicalization;
> then, by means of *coverage*, you introduce a subset that concerns only
> a specific resource type. The object could be something like
> *ResourceConstrainedLexicalizationSet*.
>
> I am sure that this option was already considered and collectively
> discarded during a telco. Unfortunately, I am not sure about the
> motivations.
>
> Since your proposal seems reasonable, Armando and I will discuss it
> on Monday, in order to decide whether to accept or reject it.
>
> In the meantime, I want to highlight another aspect of the model I am not
> sure about. Did we agree on the use of ontolex:languageURI or
> dcterms:language for languages expressed as resources?
>
> --
> Manuel Fiorelli
>

Received on Friday, 23 January 2015 15:57:13 UTC