RE: metadata (and not only): a few discussion points from Armando Stellato on 2015-01-08 (public-ontolex@w3.org from January 2015)

From: Armando Stellato <stellato@info.uniroma2.it>
Date: Thu, 8 Jan 2015 19:49:28 +0100
To: "'John P. McCrae'" <jmccrae@cit-ec.uni-bielefeld.de>
CC: "'public-ontolex'" <public-ontolex@w3.org>
Message-ID: <DUB408-EAS1927D72A6336761B2A2713DA0470@phx.gbl>

Hi John,

Thanks for the answer. We try to make a few statements here, as we still feel there is a need for neat axioms/constraints (or just… to have things clear in our minds :D).

To avoid any confusion in reading the following cases, we state a fundamental assumption of our view, in two points:

1. The set of synsets in WordNet *is not* considered an ontology in the ontology-lexicon dualism. It is rather the semantic backbone of a Lexicon.

2. We thus suggested calling WordNet (the whole, not just its synsets) a ConceptualizedLexicon.

1. Model and Terminology Consistency

We report here a few statements/idiosyncrasies that have been made/noticed in the context of our calls. We suggest verifying them all together and then reporting them explicitly somewhere, to make things clear from the start.

a. a Lexicon contains only lexical information (no conceptual information, such as synsets)

Are we fine with this? In some cases, WordNet (as a whole), which is mostly known as a “lexical database” (correct, though maybe too general), has also been called a Lexicon (a computational Lexicon). We know the literature is full of inconsistent terminology, so it’s OK if we decide to keep the above statement in a strict way. We are just asking for confirmation (this influences other choices). It’s also important to take into consideration all the modules and where their information belongs (semantic part or lexicon part).

This follows from the separation of the semantic and lexical layers that we take as the basis of the group... that is, we have the ontology describing the semantics and the lexicon describing the expression of the idea in the words of some language. Of course, the Lexicon is not actually without semantics due to the LexicalSense object, although there is no definition that says that the LexicalSense belongs to the ontology and lexicon. Instead, the lexicon is an organization of the ontology-lexicon by entries.

[Armando Stellato]

OK, let us get back to the scenarios we all agreed on:

There can be lexicons that were not designed in advance for a specific ontology. They exist just because…they exist :-)

If then someone publishes a lexicalization for a given ontology which exploits their lexical content, is this part of the lexicon? We think we agreed this is called Lexicalization and it is disjoint from the Lexicon. Is this right? Because it clashes with the definition you gave above.

Follow-up question (in case what we wrote is correct): if I want to publish a lexicon for an ontology, is the whole to be considered a dataset which contains a lexicon and a lexicalization? That is, under the axiom that:

ontolex:Lexicon rdfs:subClassOf void:Dataset .

Is it:

:myDS a void:Dataset .

:myDS void:subset :myLexicon .

:myDS void:subset :myLexicalization .

Or:

:myLexicon a ontolex:Lexicon .

:myLexicon void:subset :myLexicalization .

We vote for the first one, and everything we have all said so far seems to go in that direction. In any case, these formal examples may help us converge.
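For reference, the first option can be spelled out as a complete Turtle fragment. This is only a sketch: the base namespace http://example.org/myDS# is hypothetical, ontolex:Lexicon is used as in this thread (the class may end up named or placed differently in the final vocabulary), and typing :myLexicalization as a plain void:Dataset is an assumption, since no class for lexicalizations is fixed in this message.

```turtle
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix void:    <http://rdfs.org/ns/void#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix :        <http://example.org/myDS#> .   # hypothetical namespace

# Axiom assumed in the discussion above.
ontolex:Lexicon rdfs:subClassOf void:Dataset .

# Option 1: an umbrella dataset with two disjoint subsets.
:myDS a void:Dataset ;
    void:subset :myLexicon , :myLexicalization .

# Purely lexical content.
:myLexicon a ontolex:Lexicon .

# Links between the lexicon and the ontology; typed only as a
# void:Dataset here (assumption, no dedicated class is used).
:myLexicalization a void:Dataset .
```

With this layout the lexicon remains reusable on its own, while the lexicalization can be published, versioned, or replaced independently of it.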

b. Lexical/Lexicalized (and then Conceptual/Conceptualized), not only terminology…
During the first year, I (Armando) suggested introducing a superclass for synset-like things, using the name LexicalConcept (used by Miller himself in describing synsets) to represent a common semantic entity for synonymous lexical entries. It is important that we recall a cause/effect distinction. A LexicalConcept is not a domain concept which is being lexicalized (that would be a “lexicalizED concept”), but an entity which exists as a complementary semantic element in the description of a lexicon (whether it is technically part of it or not, see point (a) above). So it is lexical in that it “has to do” with lexical descriptions. A few consequences:

i. ConceptualLexicon

This was the name reported in the minutes to represent Lexicons which have a conceptual backbone (like synsets in WordNet); actually, we suggested ConceptualizedLexicon. This does not sound like an oxymoron (we agree with John that ConceptualLexicon does), and it actually tells more about something which is still (purely) a Lexicon. To be confirmed after the vote on (a): whether the conceptual backbone is part of the Lexicon or not (and so, technically, to which dataset the “evokes” triples belong).

This could actually be worth including, but I believe when this was most recently discussed it was noted that the skos:ConceptScheme is functionally the same as a ConceptLexicon and I would rather not duplicate this mechanism, but we should include an example in the spec using ConceptScheme.

[Armando Stellato]

See point 2 of our assumption at the start of the email. We were addressing WordNet (as a whole) as being a ConceptualizedLexicon (thus clarifying that there is a semantic backbone), and this whole cannot be replaced with skos:ConceptScheme (though yes, a skos:ConceptScheme could be used to group the synsets).

To be fully symmetric with the Lexicon-LexicalizationSet-Ontology triad, we should also devise a:

ConceptualizedLexicon-Conceptualization-LexicalConceptSet

However, a LexicalConceptSet (the set of all synsets in WordNet) makes no sense per se (unless it has to be shared among different resources, which is unlikely for synsets precisely because their purpose is to conceptualize a specific Lexicon), so we could just drop it and consider it part of a conceptualization.

Finally, if we have no strict need to represent them separately, we could actually consider the whole triad as part of a single dataset, and thus consider everything a ConceptualizedLexicon and report in it all the related properties (e.g. number of LexicalConcepts, average polysemy, etc.), thus simplifying things a lot.

However, we still need at least one element: skos:ConceptScheme is not a metadata element, so while it can be OK to group all synsets under a skos:ConceptScheme, there would still be no metadata class to address them.

ii. Use of properties evokes/denotes
During the last calls, we got the impression that ontolex:evokes is intended to be used whenever a skos:Concept is being described.

Yes, evokes is used for conceptual interpretations of words rather than formal interpretations

[Armando Stellato]

OK, but a skos:Concept is not an ontolex:LexicalConcept…what is the difference then? (see expansion in the answer to the following point)

Actually, it is important that domain skos:Concepts in KOSs which are lexicalized through an ontolex:Lexicon fall in the same category as owl:Classes or properties, so that they are linked through the ontolex:denotes property.

The compromise that was reached (I don't like this BTW) is that denotes can also refer to LexicalConcepts

[Armando Stellato]

Mmm…that is even worse than expected :-) We really don’t need to have denotes for LexicalConcepts; on the contrary, we are trying to separate them from the rest. Sorry we missed this compromise, otherwise we would have said something in advance :-)

OK, let’s forget for a moment about the terminology (that is, about which objects the names denotes/evokes are more appropriate for). If (and only if) we think that there is some important distinction to make between LexicalConcepts (e.g. synsets which, once more, exist only to qualify the meaning of entries in a wordnet and are not ontologies per se) and skos:Concepts in general (e.g. in a lightweight domain model), then there could be a distinction in the adopted properties, such as (again, whatever the names…):

- denotes for skos:Concepts (when used in domain conceptualizations), owl:Classes, owl properties and all the funny company…

- evokes *only* for ontolex:LexicalConcepts

If that is the case (and the names of the properties could be changed), then OK. Otherwise, we should discuss what the advantages are of keeping two properties for separating skos:Concepts (including LexicalConcepts) from the rest.

Sorry, there were other points; we will address them in a further email.
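To make the proposed split concrete, here is a small Turtle sketch of how a single entry would be linked under it. All ex: resources are hypothetical, and the fragment only illustrates the distinction proposed in this message (denotes for domain entities, evokes only for LexicalConcepts); it does not claim to reflect the domains and ranges the group will finally agree on.

```turtle
@prefix owl:     <http://www.w3.org/2002/07/owl#> .
@prefix skos:    <http://www.w3.org/2004/02/skos/core#> .
@prefix ontolex: <http://www.w3.org/ns/lemon/ontolex#> .
@prefix ex:      <http://example.org/> .   # hypothetical namespace

# Domain-side entities: an ontology class and a concept
# from a lightweight domain model (e.g. a thesaurus).
ex:Cat        a owl:Class .
ex:catConcept a skos:Concept .

# Lexicon-side semantic entity: a synset-like LexicalConcept.
ex:synset_cat a ontolex:LexicalConcept .

# One lexical entry, linked differently to the two kinds of target.
ex:lex_cat a ontolex:LexicalEntry ;
    ontolex:denotes ex:Cat , ex:catConcept ;   # domain entities
    ontolex:evokes  ex:synset_cat .            # LexicalConcepts only
```

Under the compromise mentioned above, ex:lex_cat could instead also be linked to ex:synset_cat through ontolex:denotes, which is exactly the overlap we would like to avoid.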

Cheers,

Armando and Manuel

Received on Thursday, 8 January 2015 18:50:05 UTC