[WNET] RDF/OWL versions of WN

Hi,

In light of comments of Brian and Guus I had another look at how to
reconcile RDF and OWL for WordNet.

I worked out a solution with one RDF data file and one RDF/OWL schema 
that is, as far as possible, understandable to both RDFS and OWL 
tools. I created the new schema in [3]; I have not created the RDF 
data itself yet, but will do so if this solution receives positive 
feedback.

Another solution is to have two separate versions:
- RDF data + RDF Schema
- RDF data + OWL schema

I myself am leaning towards two separate versions, because things get 
a bit messy with a single combined schema. Please comment, whether you 
agree or think otherwise.

An orthogonal issue is that I think a few of the classes in Aldo's 
current proposal [2] do not really belong to the core WN datamodel. 
I therefore moved them to a separate "extensions" file [4] instead of 
keeping them in [3].


Cheers,
Mark.

--------------------------------------------------------------------------------

There are three problems in the context of WN's OWL model:

1) RDFS tools don't understand owl:Class, owl:ObjectProperty and
    owl:DatatypeProperty (these are crucial, other statements e.g.
    owl:disjointWith and owl:TransitiveProperty can be ignored
    without missing something that the tools can understand);

Problem 1 can be solved by these additions to Aldo's schema (older 
version in [1], newer version in [2]):

- each class gets an rdf:type link not only to owl:Class but also to
   rdfs:Class

- each property gets an rdf:type link not only to either
   owl:DatatypeProperty or owl:ObjectProperty but also to rdf:Property.

I added these statements to [2], resulting in the file [3].
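
The two additions for problem 1 can be sketched as follows. This is 
only an illustration, not how [3] was produced: triples are plain 
Python tuples with prefixed names as strings, no RDF library is 
assumed, and wn:Synset and wn:hyponymOf are used purely as sample 
schema entries.

```python
RDF_TYPE = "rdf:type"

# For each OWL type, the RDFS-level type that is asserted alongside it.
RDFS_COUNTERPART = {
    "owl:Class": "rdfs:Class",
    "owl:ObjectProperty": "rdf:Property",
    "owl:DatatypeProperty": "rdf:Property",
}

def add_rdfs_types(triples):
    """Return the input triples plus the extra rdf:type triples for RDFS tools."""
    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE and obj in RDFS_COUNTERPART:
            result.add((subject, RDF_TYPE, RDFS_COUNTERPART[obj]))
    return result

# Sample schema fragment (illustrative only).
schema = {
    ("wn:Synset", "rdf:type", "owl:Class"),
    ("wn:hyponymOf", "rdf:type", "owl:ObjectProperty"),
}
dual_typed = add_rdfs_types(schema)
```

The OWL typings are kept, so OWL tools see the same schema as before 
and RDFS tools additionally find rdfs:Class and rdf:Property.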

--------------------------------------------------------------------------------

2) RDFS requires explicit rdf:type links for all instances of a class
    and explicit rdfs:subClassOf links between subclasses.

    OWL can use completely defined classes (owl:equivalentClass) to
    implicitly define rdf:type links (instance classification). Some
    logical superclasses of a complete class can also be inferred
    (rdfs:subClassOf links).

    Partially defined OWL classes with restrictions can also be
    automatically classified below logical superclasses
    (rdfs:subClassOf links). Instance classification, however, does
    not occur for them.

Problem 2 can be solved by having two separate versions:
- RDF1 + RDF Schema
- RDF2 + OWL Schema

In the RDF(S) version, all links that are inferred in the OWL version 
are added explicitly. I.e. the data in file RDF1 contains the explicit
links while the file RDF2 does not.
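
As a rough sketch of what "adding inferred links explicitly" could 
look like, the snippet below derives an RDF1-style triple set from an 
RDF2-style one for a single kind of OWL inference, owl:inverseOf. 
Other inferences (e.g. classification under complete classes) would 
need their own rules. The property name wn:hypernymOf and the ex: 
instances are assumptions made for the example.

```python
def materialize_inverses(data, inverse_of):
    """Add (o, q, s) for every (s, p, o) where p and q are declared inverses."""
    # owl:inverseOf is symmetric, so look each pair up in both directions.
    lookup = dict(inverse_of)
    lookup.update({q: p for p, q in inverse_of.items()})
    result = set(data)
    for s, p, o in data:
        if p in lookup:
            result.add((o, lookup[p], s))
    return result

# RDF2: only the asserted direction is present.
rdf2 = {("ex:synsetA", "wn:hyponymOf", "ex:synsetB")}

# RDF1: the inferred inverse link is stated explicitly as well.
rdf1 = materialize_inverses(rdf2, {"wn:hyponymOf": "wn:hypernymOf"})
```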

A second solution is to combine the two schemas in such a way that 
RDFS tools can understand them as much as possible, and have an RDF 
file that has all the links explicitly. For my attempt at such a 
combined schema, see [3]. Fortunately, Aldo already added explicit 
subclass links, so that's not a problem.


--------------------------------------------------------------------------------

3) RDFS tools miss out on domain and range info that is specified in
    local OWL class restrictions. The same for inverse properties. They
    also cannot handle domain/range defs with owl:unionOf.


The domain and range info that is defined in local restrictions is 
not available to RDFS tools. However, Aldo provided global domains 
and ranges for all props, so RDFS tools are not clueless: they get 
the info that they are able to understand.

Then there still is a small set of WN properties that creates problems 
with domains/ranges, namely:

- props that are owl:inverseOf another prop
- wn:attributeOf (none defined)
- wn:seeAlso (unions in dom/range)
- wn:adjectivePertainsTo (union in range)

Because RDFS tools do not understand owl:inverseOf, they will lack the 
right domain/range defs for inverse properties: these defs are 
generally not stated in an OWL file but inferred. In general this can 
be solved by explicitly adding the domains and ranges for inverse 
props. Aldo already did so for WN's inverse properties (but forgot 
attributeOf, so I added that), so the inverse props are taken care of.
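
Stating the domain and range of an inverse property explicitly amounts 
to swapping those of the original property. A minimal sketch, assuming 
for the sake of the example that wn:containsWordSense is the inverse 
of wn:inSynset and that the latter runs from wn:WordSense to 
wn:Synset:

```python
def inverse_domain_range(schema, prop, inverse):
    """Derive domain/range triples for `inverse` by swapping those of `prop`."""
    derived = set()
    for s, p, o in schema:
        if s != prop:
            continue
        if p == "rdfs:domain":
            # The domain of prop becomes the range of its inverse.
            derived.add((inverse, "rdfs:range", o))
        elif p == "rdfs:range":
            # The range of prop becomes the domain of its inverse.
            derived.add((inverse, "rdfs:domain", o))
    return derived

# Assumed domain/range for the example; check against the actual schema.
schema = {
    ("wn:inSynset", "rdfs:domain", "wn:WordSense"),
    ("wn:inSynset", "rdfs:range", "wn:Synset"),
}
derived = inverse_domain_range(schema, "wn:inSynset", "wn:containsWordSense")
```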

Unions in domains/ranges are trickier. A partial solution would be 
to add triples like these (example range def for "propX"):

- AorB rdf:type rdfs:Class
- propX rdfs:range AorB
- AorB owl:unionOf (A, B)
- A rdfs:subClassOf AorB
- B rdfs:subClassOf AorB

An RDFS tool can thus use AorB for inferring the type of an instance 
in the range of propX and can also use AorB for querying.

I did not implement this solution in [3]; I would like some feedback 
on it first.
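
The AorB idea can be sketched for wn:adjectivePertainsTo as below. The 
union class name wn:NounOrAdjectiveWordSense is made up for the 
example, as are the ex: instances. The owl:unionOf list is left out 
because only the RDFS-visible triples matter here.

```python
def union_range_triples(prop, union_class, members):
    """Build the RDFS-visible triples for a named union class used as a range."""
    triples = {
        (union_class, "rdf:type", "rdfs:Class"),
        (prop, "rdfs:range", union_class),
    }
    for member in members:
        triples.add((member, "rdfs:subClassOf", union_class))
    return triples

def infer_range_types(schema, data):
    """RDFS range rule: (s p o) and (p rdfs:range C) entail (o rdf:type C)."""
    ranges = {(s, o) for s, p, o in schema if p == "rdfs:range"}
    return {(o, "rdf:type", c)
            for s, p, o in data
            for prop, c in ranges
            if p == prop}

schema = union_range_triples(
    "wn:adjectivePertainsTo",
    "wn:NounOrAdjectiveWordSense",  # hypothetical name for the AorB class
    ["wn:NounWordSense", "wn:AdjectiveWordSense"],
)
data = {("ex:sense1", "wn:adjectivePertainsTo", "ex:sense2")}
inferred = infer_range_types(schema, data)
```

An RDFS tool would only learn that ex:sense2 is in the union class, 
not which of the two member classes it belongs to, which is why the 
solution is partial.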

An alternative would be to just leave it as it is, i.e. not having
dom/ranges available for these inverse props in RDFS tools. Are there 
important use cases, like those in [6], that require the 
domains/ranges of these props?

A remaining problem is that instances of inverse properties are not in 
the RDF data, i.e. there is nothing for RDFS tools to access.
As they cannot be queried anyway (well, queries just don't deliver 
results), we could also opt to make the inverse properties "invisible" 
to RDFS tools by NOT adding rdf:type rdf:Property, as is done for 
other properties in the solution to problem 1.


--------------------------------------------------------------------------------

I also moved some of the classes to a separate "extensions" file with 
a separate namespace, because I think one can argue that they are not 
part of the original WN conceptual model:

   SynsetUsedAsClassifier
   MonosemousWord
   PolysemousWord
   UniqueBeginner

Incidentally, these classes are all "complete" classes, so moving them 
removes some of the complexity for RDFS tools. Also, if an RDFS tool 
requires the set of instances denoted by these classes, it will not
be very difficult to write a query to select them.
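
As an illustration of such a selection, here is a sketch for 
UniqueBeginner, following the definition given later in this message 
(a noun or verb synset whose hyponymOf values are all owl:Nothing). 
It treats "no hyponymOf triple in the data" as "no value", which is a 
closed-world reading that an OWL reasoner would not make by itself. 
The wn: prefix and the ex: instances are assumptions.

```python
def unique_beginners(data):
    """Select noun/verb synsets that have no wn:hyponymOf value in the data."""
    synsets = {s for s, p, o in data
               if p == "rdf:type" and o in ("wn:NounSynset", "wn:VerbSynset")}
    has_hypernym = {s for s, p, o in data if p == "wn:hyponymOf"}
    return synsets - has_hypernym

# Toy data: ex:entity has no hypernym, ex:dog does.
data = {
    ("ex:entity", "rdf:type", "wn:NounSynset"),
    ("ex:dog", "rdf:type", "wn:NounSynset"),
    ("ex:dog", "wn:hyponymOf", "ex:entity"),
}
beginners = unique_beginners(data)
```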


I implemented the "basic model" in [3] and "extensions" in [4].

Advantage of this solution: one basic RDF data and one basic RDF/OWL 
schema which everyone can use, extensions still accessible for those 
who want them.

--------------------------------------------------------------------------------

Small additions/error corrections:

errors: wn:antonymOf is not inverse of itself, same for
wn:sameVerbGroupAs. Range of wn:adjectivePertainsTo is owl:unionOf 
wn:NounWordSense and wn:AdjectiveWordSense (instead of two separate 
range defs for each class), same for wn:seeAlso's domain and range.

UniqueBeginner was not yet defined with a restriction, so I changed 
its definition to:

Class(a:UniqueBeginner complete
   intersectionOf(unionOf(b:VerbSynset b:NounSynset)
                  restriction(b:hyponymOf allValuesFrom(owl:Nothing))))
--------------------------------------------------------------------------------

Question:

There are two anonymous classes that are defined complete. What are 
they for? They are still in [3], but I guess something has to be done 
about them:

  EquivalentClasses(
    intersectionOf(a:Word
      restriction(a:wordInSynset someValuesFrom(a:Synset)))
    intersectionOf(a:Word
      restriction(a:sense someValuesFrom(
        intersectionOf(restriction(a:inSynset someValuesFrom(a:Synset))
                       a:WordSense)))))

  EquivalentClasses(
    intersectionOf(
      restriction(a:synsetContainsWord someValuesFrom(a:Word))
      a:Synset)
    intersectionOf(
      restriction(a:containsWordSense someValuesFrom(
        intersectionOf(restriction(a:word someValuesFrom(a:Word))
                       a:WordSense)))
      a:Synset))




--------------------------------------------------------------------------------


[1] http://www.w3.org/2001/sw/BestPractices/WNET/wordnet_datamodel.owl
[2] http://www.loa-cnr.it/www.loa-cnr.it/Files/wordnet_datamodel_v4.owl
[3] http://www.cs.vu.nl/~mark/wn/wordnet_datamodel_v5.owl
[4] http://www.cs.vu.nl/~mark/wn/wordnet_datamodel_v5ext.owl
[5] http://lists.w3.org/Archives/Public/public-swbp-wg/2005Sep/0035.html
[6] http://lists.w3.org/Archives/Public/public-swbp-wg/2005Sep/0033

-- 
  Mark F.J. van Assem - Vrije Universiteit Amsterdam
        mark@cs.vu.nl - http://www.cs.vu.nl/~mark

Received on Tuesday, 20 September 2005 13:20:57 UTC