RE: about Lexical Linked Data and notable lexical resources from Armando Stellato on 2013-09-14 (public-ontolex@w3.org from September 2013)

From: Armando Stellato <stellato@info.uniroma2.it>
Date: Sat, 14 Sep 2013 13:48:36 +0200
To: "'Aldo Gangemi'" <aldo.gangemi@cnr.it>
Cc: <public-ontolex@w3.org>
Message-ID: <004101ceb140$59d52fd0$0d7f8f70$@info.uniroma2.it>
Hi Aldo,

 

Boundary conditions always affect decisions, and may sometimes even change
their outcome radically.

Linked Data practices are well known to everybody, and are mostly inherited
from the Software Engineering concepts of reuse (and related ones). To give
an example from software development: while imported software projects are
usually (and obviously) "read-only", no SE best practice will prevent you
from changing a project imported by the one you are working on, if that
project is under your control too and the change positively affects a whole
range of things being published.

 

So, leaving behind the theory we all agree about, and coming down to our case:
obviously I wouldn't contact anybody to ask that FOAF be changed to become
Lemon-compliant, nor to publish a separate Lemon enrichment of FOAF :-)

But here we are dealing with Lexical Resources, which do not need to be
enriched *with* Lemon descriptions but can be described *through* Lemon.
These lexical resources have a history: the current WordNet comes after at
least one other attempt, no? I think we can say the same for FrameNet. So,
while we don't want to delete anything already said about them, I was just
wondering whether (also considering your presence in Lemon while being one
of the authors of that RDF WordNet) we could think of a new version of them
(keeping their history, and maybe adding more descriptors).

 

You may remember I said that the current WordNet was perfect, though, in the
context of shared vocabularies, I felt it was an ontology like any other
domain ontology plus data, with its own specific model. Yes, it was in OWL 1,
but I really felt the lack of a modeling umbrella provided by W3C. That is, I
would like tools to be able to say: "oh yes! This is a lexical resource!
Let's show it this way!". That is also the motivation behind SKOS for concept
schemes: anybody could create generic OWL instances for their "concepts" and
create their own "myont:narrower"/broader, but SKOS provided the shared
modeling vocabulary for that, so that SKOS-enabled tools could provide
dedicated views.
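
A minimal sketch of the SKOS point above, assuming triples are kept as plain
Python tuples (no RDF library). The "ex:" resources and the "myont" namespace
URI are hypothetical; only the skos:broader URI is the real one. It shows that
a tool written against the shared vocabulary recognizes the hierarchy link
expressed with SKOS, while the home-grown property stays invisible to it.

```python
# Triples are plain (subject, predicate, object) tuples; no RDF library is used.
SKOS_BROADER = "http://www.w3.org/2004/02/skos/core#broader"
MYONT_BROADER = "http://example.org/myont#broader"  # hypothetical home-grown property

triples = [
    ("ex:dog", SKOS_BROADER, "ex:animal"),   # expressed with the shared vocabulary
    ("ex:cat", MYONT_BROADER, "ex:animal"),  # expressed with a private vocabulary
]


def hierarchy_edges(triples):
    """Return the (narrower, broader) pairs a SKOS-aware tool can recognize."""
    return [(s, o) for s, p, o in triples if p == SKOS_BROADER]


edges = hierarchy_edges(triples)
print(edges)
```

The same reasoning is what motivates a shared vocabulary for lexical
resources: a tool can only offer a dedicated view for what it can recognize.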

 

So, am I saying WordNet was released too early?

No, considering that a notable resource like that couldn't stay out of RDF
for too long, so it was good to have a shared OWL-based representation for
it. So, absolutely OK for that time!

Yes, if we consider that it could have been put under a more explicit
category of Lexical Resources (provided that this category had been made
explicit through some W3C language).

 

So, my question was meant precisely in the above perspective. Since this
discussion started by mentioning theory and best practices, let me bring
another example: the kind of top-down linking (laying a model over an
existing instance that was represented according to another model) that we
would do by leaving such resources as they are is what software engineers
call the "Adapter" pattern. And adapters are just an "escamotage" (a
workaround) for when you need to include objects which you really can't
touch. If you have any control over those objects, it is much better (when
you write the new version of the software) to make them directly compatible
with the new model (maybe leaving those adapters for outdated versions).
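
A small sketch of the contrast drawn above, under stated assumptions: all
class names, method names and the "urn:example:" identifiers are hypothetical
and only illustrate the pattern, not any actual WordNet or Lemon API. The
adapter wraps an untouched legacy object, while the native class is what a
new version of the resource could look like.

```python
class LegacySynset:
    """Stands in for an object built on an older, resource-specific model."""

    def __init__(self, synset_id, words):
        self.synset_id = synset_id
        self.words = words


class LexicalConceptAdapter:
    """Adapter: wraps a LegacySynset without modifying it."""

    def __init__(self, legacy):
        self._legacy = legacy

    def concept_uri(self):
        return "urn:example:" + self._legacy.synset_id

    def lexicalizations(self):
        return list(self._legacy.words)


class NativeLexicalConcept:
    """Alternative: a new version that implements the interface directly."""

    def __init__(self, uri, lexicalizations):
        self._uri = uri
        self._lexicalizations = list(lexicalizations)

    def concept_uri(self):
        return self._uri

    def lexicalizations(self):
        return list(self._lexicalizations)


def describe(concept):
    """Client code that only knows the new interface."""
    return concept.concept_uri() + ": " + ", ".join(concept.lexicalizations())


adapted = LexicalConceptAdapter(LegacySynset("synset-dog", ["dog", "domestic dog"]))
native = NativeLexicalConcept("urn:example:synset-dog", ["dog", "domestic dog"])
print(describe(adapted))
print(describe(native))
```

Both objects serve the client equally well; the difference is that the
adapter is an extra layer to maintain, which is only justified when the
wrapped object cannot be changed.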

 

So, I think the effort would pay off enormously over the years, through
cleaner and clearer descriptions of these resources.

 

This is my position, but my original question was not rhetorical: even if
we agree on changing the resources, there are whens and wheres to discuss.
For instance, to change a resource already under the W3C umbrella, I would
expect to see Lemon under the W3C umbrella as well, and in approved status.

 

Cheers,

 

Armando

 

1.       Maybe SKOS with some OWL axiomatizations would have been used if
SKOS had been more mature at the time? I see there is an open section here:
http://www.w3.org/TR/wordnet-rdf/#skos

From: Aldo Gangemi [mailto:aldo.gangemi@cnr.it] 
Sent: Friday, September 13, 2013 5:04 PM
To: Armando Stellato
Cc: Aldo Gangemi; public-ontolex@w3.org
Subject: Re: about Lexical Linked Data and notable lexical resources

 

Just stick to Linked Data practices. When creating new data or vocabularies,
link them to existing ones, do not try to substitute them!

The pro-change arguments are bad practice: the bare fact of having a new
vocabulary for interoperability does not mean that everything made in the
past needs to be refactorised ;)

Re: exposition of OntoLex, sure it's important, but it can be exposed much
better (and with less effort) by cross-linking resources, which will show
its intended capabilities.

Concerning WordNet versions, a 3.0 port already exists from VU Amsterdam,
and it uses the same vocabulary as the 2.0 one.

For the name, there are good reasons in both groups, and it's hard to take a
final decision. An alternative solution (that I like) is to extract a real
core vocabulary with only basic semiotic distinctions, and call it OntoLex,
while the Lemon legacy (mostly about more specific properties and classes)
may still be called Lemon.

Aldo

 

On Sep 13, 2013, at 4:39:16 PM, "Armando Stellato"
<stellato@info.uniroma2.it> wrote:





Hi all,

 

Today, following the finalization of the core module, the agenda was quite
dynamic, with some core points (e.g. the name of the vocabulary, assessing
the list of modules to be developed, etc.) and time for raising new issues
or revisiting open ones.

 

I remember that one thing left pending (or at least, I recall hearing
different opinions on the matter) was:

 

"notable resources, like WordNet and FrameNet": should we only map them, or
should we make a new version, possibly with direct references to the
OntoLex [1] vocabulary?

 

There are pros and cons for both approaches.

-          PRO-Keep: Solidity of the existing versions: for instance, as
Aldo says about WordNet: "it's already out there under the W3C umbrella", so
better not to change anything carved in stone (though it's still a
working draft, am I right?)

-          PRO-Change: Promotion of OntoLex: it is one thing to have a
mapping module telling how WordNet/FrameNet is seen from an OntoLex
perspective, and another to express WordNet/FrameNet in terms of OntoLex
directly inside the WordNet/FrameNet resource. People accessing these
resources will get to know about OntoLex.

-          PRO-Keep: even from a purely terminological point of view, it is
important to keep the original descriptors from these resources and
transpose them into RDF. E.g. a WordNet synset is a kind of
ontolex:LexicalConcept; ontolex:LexicalConcept guarantees interoperability
and tells how to "attach" WordNet descriptors to an ontology by using the
OntoLex vocabulary; nonetheless, it is important to provide a "Synset" class
specific to WordNet, to make it clear to WordNet users what those objects are
by means of the WordNet terminology.

-          PRO-Change: although the above is true, whenever possible a
direct use of OntoLex constructs would make resources more explicitly
"connected" and their reuse more "evident" in terms of OntoLex principles

-          ...and so on.
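
The second PRO-Keep point (a WordNet-specific "Synset" class that is a kind
of ontolex:LexicalConcept) can be sketched as follows. This is only an
illustration under assumptions: triples are plain Python tuples, the "wn:"
prefix and the "wn:synset-dog" identifier are hypothetical, and the prefixed
names are used as opaque strings rather than resolved URIs. The resource
keeps its own class, and a generic tool reaches the shared class through
rdfs:subClassOf.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    # The mapping statement: the WordNet-specific class specializes the shared one.
    ("wn:Synset", SUBCLASS_OF, "ontolex:LexicalConcept"),
    # The resource itself keeps using its own terminology.
    ("wn:synset-dog", RDF_TYPE, "wn:Synset"),
}


def types_of(resource, triples):
    """Return asserted types plus those inherited through rdfs:subClassOf."""
    found = {o for s, p, o in triples if s == resource and p == RDF_TYPE}
    frontier = list(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in triples:
            if s == cls and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


dog_types = types_of("wn:synset-dog", triples)
print(sorted(dog_types))
```

Under the PRO-Change view, the second triple would instead type the synset
directly as ontolex:LexicalConcept, removing the need for the inference step
at the cost of the WordNet-specific class name.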

 

Now, we had interesting feedback from Piek Vossen (EuroWordNet) and other
people involved in the development of these resources, and also comments and
wishes about changes with respect to existing versions (I recall one about
URI naming for WordNet synsets, for instance).

 

I imagine also that there are viable "grey" hypotheses, lying in the
middle, like providing new versions of these resources (e.g. a WordNet 3.0
RDF port, whereas the current one is 2.0) which, while not "breaking" any
already-defined construct, could still embed these mappings to OntoLex,
introduce some strongly needed changes, etc., while not "betraying" their
original nature (so, just an update and not a rewrite).
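
One possible reading of "not breaking any already-defined construct" is that
a new version only adds statements and never removes existing ones. The
sketch below checks that property; it assumes triples as plain Python tuples,
and every identifier in it is hypothetical.

```python
def is_additive_update(old_triples, new_triples):
    """True if every statement of the old version survives in the new one."""
    return set(old_triples) <= set(new_triples)


v_old = {("wn:synset-dog", "rdf:type", "wn:Synset")}

# An update: keeps the old statement and embeds a mapping to the shared model.
v_update = v_old | {("wn:Synset", "rdfs:subClassOf", "ontolex:LexicalConcept")}

# A rewrite: replaces the resource-specific typing with the shared one.
v_rewrite = {("wn:synset-dog", "rdf:type", "ontolex:LexicalConcept")}

print(is_additive_update(v_old, v_update))
print(is_additive_update(v_old, v_rewrite))
```

The check is deliberately strict: any changed URI counts as a removal, so a
renaming of synset URIs would need an explicit old-to-new mapping to stay
within the "update" case.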

 

So, the question is:

 

Leave everything as it is and provide mappings in our OntoLex world, or try
to maximize the awareness of OntoLex directly "at the source of the
resources", possibly (re)working them as much as necessary?

...or anything in the middle, case by case.

 

Waiting for feedback :)

 

Armando

 

P.S.: I made all the examples about WordNet, as I'm not proficient enough to
provide detailed examples about FrameNet, VerbNet, or other pertinent
resources.

 

1.       I used the name OntoLex, but it's just the most neutral name I
could think of to use here while waiting for the final one; a final decision
on the naming of our model is still to be drawn from our votes: be it
OntoLex, Lemon2 or whatever else.

 

 

--------------------------------------------------

 

Ing. Armando Stellato, PhD

AI Research Group,

Dept. of Enterprise Engineering

University of Roma, Tor Vergata

Via del Politecnico 1 00133 ROMA (ITALY)

tel: +39 06 7259 7330 (office, room A1-14);

     +39 06 7259 7332 (lab)

fax: +39 06 7259 7460

e_mail:  stellato@info.uniroma2.it

 

--------------------------------------------------
Received on Saturday, 14 September 2013 11:49:12 UTC