- From: Manuel Fiorelli <manuel.fiorelli@gmail.com>
- Date: Fri, 23 Jan 2015 16:50:25 +0100
- To: "John P. McCrae" <jmccrae@cit-ec.uni-bielefeld.de>
- Cc: public-ontolex <public-ontolex@w3.org>, Armando Stellato <stellato@info.uniroma2.it>
- Message-ID: <CAGDmdGiT-KPTg2D1ZuDB7Torz-gc5zQVJHtQKqBEaTVOifs3xg@mail.gmail.com>
Dear John, All

see my answers below.

2015-01-23 15:48 GMT+01:00 John P. McCrae <jmccrae@cit-ec.uni-bielefeld.de>:

> On Fri, Jan 23, 2015 at 3:17 PM, Manuel Fiorelli
> <manuel.fiorelli@gmail.com> wrote:
>
>> Dear John, All
>>
>> see my answer below.
>>
>> 2015-01-23 14:59 GMT+01:00 John P. McCrae
>> <jmccrae@cit-ec.uni-bielefeld.de>:
>>
>>> On Fri, Jan 23, 2015 at 2:50 PM, Manuel Fiorelli
>>> <manuel.fiorelli@gmail.com> wrote:
>>>
>>> *7. Properties avgNumOfLexicalization, percentage, lexicalizations no
>>> longer on Lexicalization*
>>>>
>>>> This is something that (if I remember correctly) was still under
>>>> discussion. However, in the attached document I was open to the
>>>> possibility of including these properties in the LexicalizationSet.
>>>>
>>>> The change you propose would dramatically change the semantics of the
>>>> model. Currently, a coverage is only a container of statistics. With your
>>>> change in place, a coverage would be a dataset, which contains (I presume)
>>>> the lexicalization triples.
>>>>
>>> OK, I think the important thing is that properties such as
>>> lexicalizations can be added to the Lexicalization; it didn't look like
>>> that from the diagram.
>>>
>>> As for changing the semantics, I disagree. The lexicalization is not
>>> truly a 'dataset' in most cases, as it may instead be published as part
>>> of a lexicon (or even part of an ontology). Instead it is a dataset in the
>>> sense that it is some set of triples, in this case the triples linking an
>>> ontology to a lexicon. Thus for me a resource coverage is also a dataset,
>>> that is, the set of triples linking a lexicon to a selection of the
>>> ontology's entities by type.
>>>
>>
>> In the model, we have the following axiom
>>
>> lime:LexicalizationSet rdfs:subClassOf void:Dataset
>>
>> therefore, each lexicalizationSet is a dataset, in the sense of being a
>> set of triples, i.e. representing the association between ontology entities
>> and lexical entries.
>>
>> As you argue, it may be a subset of another dataset. On this last point,
>> maybe we were a bit ambiguous in previous telcos/emails. Suppose that I
>> want to distribute an ontolex:Lexicon together with a
>> lime:LexicalizationSet. What is the appropriate structure of the data?
>>
>> a)
>>
>> *The lexicon also contains the triples related to the lexicalizationSet*
>>
>> :myLexicon a ontolex:Lexicon .
>> :myLexicon void:subset :myLexicalizationSet .
>>
>> :myLexicalizationSet a lime:LexicalizationSet .
>>
>> b)
>>
>> *The lexicon does not contain the triples related to the lexicalization;
>> instead, both the lexicon and the lexicalizationSet are part of a larger
>> dataset.*
>>
>> :myDataset a void:Dataset .
>> :myDataset void:subset :myLexicon .
>> :myDataset void:subset :myLexicalizationSet .
>>
>> :myLexicon a ontolex:Lexicon .
>> :myLexicalizationSet a lime:LexicalizationSet .
>>
>> I thought that we agreed on solution b), in order to completely
>> remove "semantic" information from the lexicon. What is your position?
>>
> I think both solutions are in principle fine but would also prefer (b)...
> I'm not quite sure about the relevance here. By 'true dataset' I mean a
> collection of triples grouped together and made available as a single
> download. The semantics of VoID are much weaker, making parts of a single
> download a dataset as well (although the definition
> <http://vocab.deri.ie/void#Dataset> of void:Dataset seems to be a 'true
> dataset').
>

I asked because you wrote "The lexicalization is not truly a 'dataset' in
most cases, as it may instead be published as part of a lexicon", thus
making me think you were assuming solution a).

The following example from the spec clearly allows defining a (sub)set only
for the purpose of providing metadata:

:DBpedia a void:Dataset;
    void:classPartition [
        void:class foaf:Person;
        void:entities 312000;
    ];
    void:propertyPartition [
        void:property foaf:name;
        void:triples 312000;
    ];
    .

>
> For example VoID's classPartition property, which for me is closely
> related to lime:coverage, is a subproperty of void:subset, and hence any
> class partition is a void:Dataset. By the same principle I would say
> that the range of lime:coverage is also a void:Dataset, as it is also a
> partition of the lexicalization. We could even go further and claim
> lime:coverage ⊑ void:subset!
>
> See:
> http://www.w3.org/TR/void/#class-property-partitions
> http://vocab.deri.ie/void#classPartition
>

I see your point. You are suggesting that:

*LexicalizationSet* is the dataset containing all the triples related to
lexicalization;

then, by means of *coverage*, you introduce a subset that only concerns a
specific resource type. The object could be something like
*ResourceConstrainedLexicalizationSet*.

I am sure that this option was already considered and collectively
discarded during a telco. Unfortunately, I am not sure about the
motivations. Since your proposal seems reasonable, Armando and I will
discuss it on Monday, in order to accept or reject your proposal.

In the meantime, I want to highlight another aspect of the model I am not
sure about. Did we agree on the use of ontolex:languageURI or
dcterms:language for languages expressed as resources?

--
Manuel Fiorelli
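
Editorial sketch of the proposal discussed in this message, in the same Turtle notation used in the thread. It is only an illustration under stated assumptions, not an agreed part of the model: the namespace URIs for lime: and the example resources under : are assumed, lime:resourceType is a hypothetical property standing in for "a specific resource type", and the numeric values are invented. The entailments in the comments assume RDFS reasoning and that void:subset has range void:Dataset, as in the VoID vocabulary.

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix lime: <http://www.w3.org/ns/lemon/lime#> .   # assumed namespace URI
@prefix :     <http://example.org/> .                # hypothetical example namespace

# Axiom stated in the thread: every lexicalization set is a dataset.
lime:LexicalizationSet rdfs:subClassOf void:Dataset .

# Proposed (not agreed) axiom: coverage is a kind of subset,
# mirroring void:classPartition, which is a subproperty of void:subset.
lime:coverage rdfs:subPropertyOf void:subset .

:myLexicalizationSet a lime:LexicalizationSet ;
    lime:coverage :conceptCoverage .

# Coverage restricted to one resource type; all values are invented.
:conceptCoverage
    lime:resourceType skos:Concept ;      # hypothetical property
    lime:percentage 0.75 ;
    lime:avgNumOfLexicalization 1.4 .

# Under the assumptions above, an RDFS reasoner would entail:
#   :myLexicalizationSet a void:Dataset .
#   :myLexicalizationSet void:subset :conceptCoverage .
#   :conceptCoverage a void:Dataset .
```

Read this way, a coverage node stops being a plain container of statistics and becomes a dataset in its own right, which is the change in semantics debated in the thread.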
Received on Friday, 23 January 2015 15:50:53 UTC