- From: John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>
- Date: Fri, 23 Jan 2015 16:56:45 +0100
- To: Manuel Fiorelli <manuel.fiorelli@gmail.com>
- Cc: public-ontolex <public-ontolex@w3.org>, Armando Stellato <stellato@info.uniroma2.it>
- Message-ID: <CAC5njqphb4AOKzKxucd6jfFQD_3Y7A7qzpgdC86OVwWxC4_CZg@mail.gmail.com>
OK, one more thing that I think I have not made clear yet. The motivation for this is that it makes it easier to understand that all properties that can be stated about a Lexicalization can also be stated about a LexicalizationCoverage. If one is a subset of the other, this is more obvious and uses one axiom to express what otherwise requires many axioms.

For the language question, we agreed on dcterms:language: http://www.w3.org/2014/10/17-ontolex-minutes.html

Regards,
John

On Fri, Jan 23, 2015 at 4:50 PM, Manuel Fiorelli <manuel.fiorelli@gmail.com> wrote:

> Dear John, All
>
> see my answers below.
>
> 2015-01-23 15:48 GMT+01:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:
>
>> On Fri, Jan 23, 2015 at 3:17 PM, Manuel Fiorelli <manuel.fiorelli@gmail.com> wrote:
>>
>>> Dear John, All
>>>
>>> see my answer below.
>>>
>>> 2015-01-23 14:59 GMT+01:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:
>>>
>>>> On Fri, Jan 23, 2015 at 2:50 PM, Manuel Fiorelli <manuel.fiorelli@gmail.com> wrote:
>>>>
>>>> *7. Properties avgNumOfLexicalization, percentage, lexicalizations no longer on Lexicalization*
>>>>>
>>>>> This is something that (if I remember correctly) was still under discussion. However, in the attached document I was open to the possibility of including these properties in the LexicalizationSet.
>>>>>
>>>>> The change you propose would dramatically change the semantics of the model. Currently, a coverage is only a container of statistics. With your change in place, a coverage would be a dataset, which contains (I presume) the lexicalization triples.
>>>>>
>>>> OK, I think the important thing is that properties such as lexicalizations can be added to the Lexicalization; it didn't look like that from the diagram.
>>>>
>>>> As for changing the semantics, I disagree.
>>>> The lexicalization is not truly a 'dataset' in most cases, as it may instead be published as part of a lexicon (or even part of an ontology). Instead it is a dataset in the sense that it is some set of triples, in this case the triples linking an ontology to a lexicon; thus for me a resource coverage is also a dataset, that is, the set of triples linking a lexicon to a selection of the ontology's entities by type.
>>>>
>>>
>>> In the model, we have the following axiom
>>>
>>> lime:LexicalizationSet rdfs:subClassOf void:Dataset
>>>
>>> therefore, each lexicalizationSet is a dataset, in the sense of being a set of triples, i.e. representing the association between ontology entities and lexical entries.
>>>
>>> As you argue, it may be a subset of another dataset. On this last point, maybe we were a bit ambiguous in previous telcos/emails. Suppose that I want to distribute an ontolex:Lexicon together with a lime:LexicalizationSet, what is the appropriate structure of the data?
>>>
>>> a)
>>>
>>> *The lexicon also contains the triples related to the lexicalizationSet*
>>>
>>> :myLexicon a ontolex:Lexicon .
>>> :myLexicon void:subset :myLexicalizationSet .
>>>
>>> :myLexicalizationSet a lime:LexicalizationSet .
>>>
>>> b)
>>>
>>> *The lexicon does not contain the triples related to the lexicalization; instead, both the lexicon and the lexicalizationSet are part of a larger dataset.*
>>>
>>> :myDataset a void:Dataset .
>>> :myDataset void:subset :myLexicon .
>>> :myDataset void:subset :myLexicalizationSet .
>>>
>>> :myLexicon a ontolex:Lexicon .
>>> :myLexicalizationSet a lime:LexicalizationSet .
>>>
>>> I thought that we agreed on the solution b), in order to completely remove "semantic" information from the lexicon. What is your position?
>>>
>> I think both solutions are in principle fine but would also prefer (b)... I'm not quite sure about the relevance here.
>> By 'true dataset' I mean a collection of triples grouped together and made available as a single download; the semantics of VoID are much weaker, making parts of a single download a dataset as well (although the definition <http://vocab.deri.ie/void#Dataset> of void:Dataset seems to be a 'true dataset')
>>
>
> I asked because you wrote "The lexicalization is not truly a 'dataset' in most cases, as it may instead be published as part of a lexicon", thus making me think you were assuming solution a)
>
> The following example from the spec clearly allows defining a (sub)set only for the purpose of providing metadata:
>
> :DBpedia a void:Dataset;
>     void:classPartition [
>         void:class foaf:Person;
>         void:entities 312000;
>     ];
>     void:propertyPartition [
>         void:property foaf:name;
>         void:triples 312000;
>     ];
>     .
>
>> For example VoID's classPartition property, which for me is closely related to lime:coverage, is a subproperty of void:subset, and hence any class partition is a void:Dataset. By the same principle I would say that the range of lime:coverage is also a void:Dataset as it is also a partition of the lexicalization. We could even go further and claim lime:coverage ⊑ void:subset!
>>
>> See:
>> http://www.w3.org/TR/void/#class-property-partitions
>> http://vocab.deri.ie/void#classPartition
>>
> I see your point. You are suggesting that:
>
> *LexicalizationSet* is the dataset containing all the triples related to lexicalization; then, by means of *coverage*, you introduce a subset that only concerns a specific resource type. The object could be something like *ResourceConstrainedLexicalizationSet*.
>
> I am sure that this option was already considered and collectively discarded during a telco. Unfortunately, I am not sure about the motivations.
>
> Since your proposal seems reasonable, Armando and I will discuss it on Monday, in order to accept or reject your proposal.
>
> In the meantime, I want to highlight another aspect of the model I am not sure about. Did we agree on the use of ontolex:languageURI or dcterms:language for languages expressed as resources?
>
> --
> Manuel Fiorelli
>
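
For reference, the two modelling ideas discussed in this thread (option (b) for distributing a lexicon together with its lexicalization set, and treating lime:coverage as a specialisation of void:subset) can be sketched in Turtle as below. This is only an illustrative sketch: the lime: and ontolex: namespace URIs, the : example namespace, the :personCoverage resource and the language URI are assumptions not taken from the thread, and the subproperty axiom is the suggestion under discussion rather than an agreed part of the model.

```turtle
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .   # assumed namespace URI
@prefix lime:    <http://www.w3.org/ns/lemon/lime#> .      # assumed namespace URI
@prefix :        <http://example.org/> .                   # hypothetical example namespace

# Axiom quoted in the thread: every lexicalization set is a VoID dataset.
lime:LexicalizationSet rdfs:subClassOf void:Dataset .

# Suggestion under discussion (not an agreed axiom):
# lime:coverage as a subproperty of void:subset, mirroring void:classPartition.
lime:coverage rdfs:subPropertyOf void:subset .

# Option (b): the lexicon and the lexicalization set are sibling subsets
# of a larger dataset, so the lexicon carries no "semantic" triples.
:myDataset a void:Dataset ;
    void:subset :myLexicon , :myLexicalizationSet .

:myLexicon a ontolex:Lexicon .

:myLexicalizationSet a lime:LexicalizationSet ;
    # Language given as a resource via dcterms:language; the URI is illustrative.
    dcterms:language <http://lexvo.org/id/iso639-3/eng> ;
    # Hypothetical coverage node restricted to one resource type.
    lime:coverage :personCoverage .
```

With the subproperty axiom in place, and given that VoID declares void:Dataset as the range of void:subset, an RDFS reasoner would infer that :personCoverage is itself a void:Dataset without any further typing statement.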
Received on Friday, 23 January 2015 15:57:13 UTC