Recent changes to FuXi from Chimezie on 2008-06-28 (public-cwm-talk@w3.org from April to June 2008)

From: Chimezie <chimezie@gmail.com>
Date: Sat, 28 Jun 2008 06:46:37 -0700 (PDT)
To: fuxi-discussion <fuxi-discussion@googlegroups.com>
Cc: public-cwm-talk@w3.org
Message-ID: <426b8cba-2530-4432-8c52-eb32f77f9965@s50g2000hsb.googlegroups.com>
I've recently made a batch of changes to the python-dlp set of
libraries including FuXi.  I've been meaning to write up a short
description of the changes, but instead thought the effort would be
best spent writing up some overall documentation on FuXi.  That is
still a work in progress, but until then below is the changelog and
some commentary on the speed improvements (Note the changes have gone
into the subversion repository [1], but not yet to cheeseshop):

## Additions / Fixes to InfixOWL ##

1. Added the ~ unary operator for constructing complement classes in
Python:

>>> from FuXi.Syntax.InfixOWL import *
>>> Individual.factoryGraph = graph
>>> human = Class(EX.Human)
>>> female= Class(EX.Female)
>>> male = human & ~ female
>>> male.identifier = EX.Male
>>> print graph.serialize(format='n3')

@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix ex: <http://example.com/>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.

 ex:Male a owl:Class;
     owl:intersectionOf (
        ex:Human
        [ a owl:Class;
          owl:complementOf ex:Female ] ).
 ex:Female a owl:Class.
 ex:Human a owl:Class.
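
For readers unfamiliar with how Python lets a library support
expressions such as human & ~ female, here is a minimal,
self-contained sketch of the operator-overloading technique.  The
ToyClass, ToyComplement and ToyBoolean names are hypothetical
stand-ins invented for this illustration; they are not InfixOWL's
classes and they do not touch an RDF graph.

```python
class ToyClass(object):
    """Hypothetical stand-in for a named class (not InfixOWL's Class)."""

    def __init__(self, name):
        self.name = name

    def __invert__(self):
        # Called for the unary ~ operator: ~a
        return ToyComplement(self)

    def __and__(self, other):
        # Called for the binary & operator: a & b
        return ToyBoolean('and', [self, other])

    def __or__(self, other):
        # Called for the binary | operator: a | b
        return ToyBoolean('or', [self, other])

    def __repr__(self):
        return self.name


class ToyComplement(ToyClass):
    """Represents the complement of a single operand."""

    def __init__(self, operand):
        self.operand = operand

    def __repr__(self):
        return '( not %r )' % (self.operand,)


class ToyBoolean(ToyClass):
    """Represents an intersection ('and') or union ('or') of members."""

    def __init__(self, operator, members):
        self.operator = operator
        self.members = list(members)

    def __repr__(self):
        separator = ' %s ' % self.operator
        return '( %s )' % separator.join(repr(m) for m in self.members)


human = ToyClass('ex:Human')
female = ToyClass('ex:Female')
male = human & ~female
```

Because ~ binds more tightly than &, the expression is evaluated as
human & (~female), so repr(male) in this toy version gives
( ex:Human and ( not ex:Female ) ).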

2. Added changeOperator to BooleanClass.  It changes, in place, the
boolean operator applied to the operands, and raises an exception if
the requested operator is already in use:

... snip ...
>>> testGraph.namespace_manager = namespace_
>>> fire  = Class(EX.Fire)
>>> water = Class(EX.Water)
>>> testClass = BooleanClass(members=[fire,water])
>>> testClass
( ex:Fire and ex:Water )
>>> testClass.changeOperator(OWL_NS.unionOf)
>>> testClass
( ex:Fire or ex:Water )
>>> try: testClass.changeOperator(OWL_NS.unionOf)
... except Exception, e: print e
The new operator is already being used!

## Complement Expansion ##

Complement expansion is a syntactic manipulation performed for the
benefit of non-monotonic inference engines (like every N3 reasoner -
cwm/euler/fuxi) that are unable to perform true classical negation
inference.  It rewrites intersection class descriptions that include
complement expressions as a union of exactly the named classes that
remain once the complemented classes are excluded.  It is able to do
this by assuming that the classes listed in the OWL ontology are
complete.  Below is an example refactored from the unittest embedded
in FuXi.Horn.__init__.py:

... snip ...
>>> man   = Class(EX.Man)
>>> boy   = Class(EX.Boy)
>>> woman = Class(EX.Woman)
>>> girl  = Class(EX.Girl)
>>> male  = Class(EX.Male)
>>> female= Class(EX.Female)
>>> human = Class(EX.Human)
>>> animal = Class(EX.Animal)
>>> cat = Class(EX.Cat)
>>> dog = Class(EX.Dog)
>>> animal = cat | dog | human
>>> human += man
>>> human += boy
>>> human += woman
>>> human += girl
>>> male   += man
>>> male   += boy
>>> female += woman
>>> female += girl
>>> testClass = human & ~ female
>>> repr(testClass)
( ex:Human and ( not ex:Female ) )
>>> newtestClass = ComplementExpansion(testClass,debug=True)
>>> repr(newtestClass)
( ex:Boy or ex:Man )
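
The idea behind the expansion can be sketched with plain Python sets.
This is only an illustration under stated assumptions, not FuXi's
ComplementExpansion implementation: it assumes that statements such as
human += man record Man as a subclass of Human, that the subclass map
is complete (the closed-world assumption described above), and the
descendants and expand_complement helpers are hypothetical names made
up for this sketch.

```python
def descendants(cls, subclasses):
    """Return every class reachable from cls through the subclass map."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for child in subclasses.get(current, ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


def expand_complement(base, excluded, subclasses):
    """Named subclasses of base that are neither excluded nor below it.

    Only valid if the subclass map lists every class (closed world).
    """
    ruled_out = descendants(excluded, subclasses) | {excluded}
    return sorted(descendants(base, subclasses) - ruled_out)


# Hypothetical subclass map mirroring the example above.
subclasses = {
    'ex:Human':  ['ex:Man', 'ex:Boy', 'ex:Woman', 'ex:Girl'],
    'ex:Male':   ['ex:Man', 'ex:Boy'],
    'ex:Female': ['ex:Woman', 'ex:Girl'],
}

# Equivalent of expanding ( ex:Human and ( not ex:Female ) )
result = expand_complement('ex:Human', 'ex:Female', subclasses)
```

Here result is ['ex:Boy', 'ex:Man'], the same two classes that appear
in the union above.  If the ontology omitted a subclass of Human, the
expansion would silently miss it, which is why the completeness
assumption matters.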

1. Added complement expansion capabilities to the FuXi command line

The --complementExpand option was added to the FuXi command-line
script.

There were various other fixes (including fixes to the DLP algorithm),
but the primary update consisted of a major performance analysis (and
improvement) of the core FuXi inference process.  This was done by
targeted refactoring of list comprehension (and general list
management) code to use itertools instead.  The result is that FuXi
can now pass the OWL tests in under 1 second!

Note this includes the time spent (for every test), parsing in the OWL
file, converting each OWL graph into an abstract N3 ruleset (using the
DLP algorithm), evaluating the incoming test facts against the
generated ruleset and comparing the results against the expected
triples.  See below:

chimezie-ogbujis-macbook:test chimezieogbuji$ pwd;python testOWL.py
/Users/chimezieogbuji/Projects/python-dlp-devel/trunk/fuxi/test
... snip ...
{'OWL/FunctionalProperty/Manifest002.rdf': '18.0270671844 milli seconds',
 'OWL/FunctionalProperty/Manifest003.rdf': '7.7018737793 milli seconds',
 'OWL/FunctionalProperty/Manifest004.rdf': '19.1791057587 milli seconds',
 'OWL/InverseFunctionalProperty/Manifest002.rdf': '18.9731121063 milli seconds',
 'OWL/InverseFunctionalProperty/Manifest003.rdf': '4.69088554382 milli seconds',
 'OWL/InverseFunctionalProperty/Manifest004.rdf': '13.4861469269 milli seconds',
 'OWL/SymmetricProperty/Manifest001.rdf': '4.44412231445 milli seconds',
 'OWL/TransitiveProperty/Manifest001.rdf': '5.77807426453 milli seconds',
 'OWL/allValuesFrom/Manifest001.rdf': '15.1278972626 milli seconds',
 'OWL/complementOf/Manifest001.rdf': '3.0300617218 milli seconds',
 'OWL/differentFrom/Manifest001.rdf': '2.68793106079 milli seconds',
 'OWL/disjointWith/Manifest001.rdf': '14.2018795013 milli seconds',
 'OWL/disjointWith/Manifest002.rdf': '14.5299434662 milli seconds',
 'OWL/intersectionOf/Manifest001.rdf': '28.697013855 milli seconds',
 'OWL/inverseOf/Manifest001.rdf': '3.91292572021 milli seconds',
 'OWL/oneOf/Manifest002.rdf': '23.8239765167 milli seconds',
 'OWL/unionOf/Manifest001.rdf': '14.0280723572 milli seconds'}
ok

----------------------------------------------------------------------
Ran 1 test in 0.937s

OK
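
As a rough illustration of the kind of refactoring described above
(replacing eagerly built lists with itertools-based iterators), here
is a small sketch.  It is not taken from FuXi: the fact tuples and the
typed_eager / typed_lazy helpers are hypothetical, and it only uses
itertools functions (chain.from_iterable, islice) that exist in both
Python 2.6+ and Python 3.

```python
from itertools import chain, islice

# Hypothetical facts, written as (subject, predicate, object) tuples.
facts_a = [('ex:tom', 'rdf:type', 'ex:Cat'),
           ('ex:tom', 'ex:chases', 'ex:jerry')]
facts_b = [('ex:rex', 'rdf:type', 'ex:Dog'),
           ('ex:rex', 'ex:chases', 'ex:tom')]


def typed_eager(*graphs):
    """Copy every fact into one list, then build a second filtered list."""
    merged = []
    for graph in graphs:
        merged = merged + list(graph)
    return [t for t in merged if t[1] == 'rdf:type']


def typed_lazy(*graphs):
    """Chain the inputs and filter on demand; no intermediate lists."""
    return (t for t in chain.from_iterable(graphs) if t[1] == 'rdf:type')


# Both approaches yield the same facts ...
all_typed = list(typed_lazy(facts_a, facts_b))

# ... but the lazy version lets a caller stop after the first match
# without scanning (or copying) the remaining facts.
first = list(islice(typed_lazy(facts_a, facts_b), 1))
```

The lazy variant avoids allocating intermediate lists and only does
as much work as the consumer asks for, which is the general reason
iterator-based code can be faster in a tight inference loop.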

Below is the changelog of the most recent commit [2]:

- Fixed the Manchester OWL rendering of intersectionOf and updated it
to use the more readable 'that' syntax
- Fixed generateTokenSet so it properly handles RDF collections
- Removed usage of BGL python for generating proof trees
- Overhauled the proof generation so it better integrates with the
fired Rete network
- Proof generation now uses PyDot instead of BGL
- Extensive usage of itertools for more performant list processing
(improves performance all around)
- Fixed namespace bindings for closure graph
- Updated RETE network so fired terminal nodes properly leave a
'tracer' for any subsequent proof generation
- Properly integrated goal termination so that forward chaining only
continues as long as the goal has not been met
- N3 serialization of RIF ruleset now uses better spacing for
readability
- Added complement expansion capability
- Fixed DHL conversion of intersectionOf
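
The goal termination item in the changelog above can be pictured with
a toy forward chainer.  This is a sketch over simple propositional
facts, not the Rete-based implementation in FuXi; the forward_chain
function and its (premises, conclusion) rule format are assumptions
made purely for illustration.

```python
def forward_chain(facts, rules, goal=None):
    """Naive forward chaining that stops as soon as goal is derived.

    rules is a list of (premises, conclusion) pairs, where premises is
    a tuple of facts that must all be known before the rule fires.
    """
    known = set(facts)
    if goal in known:
        return known
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and all(p in known for p in premises):
                known.add(conclusion)
                if conclusion == goal:
                    # Goal met: stop without firing the remaining rules.
                    return known
                changed = True
    return known


# Hypothetical rule chain: a -> b -> c -> d
rules = [(('a',), 'b'), (('b',), 'c'), (('c',), 'd')]

with_goal = forward_chain(['a'], rules, goal='b')
without_goal = forward_chain(['a'], rules)
```

With goal='b' the chainer stops after deriving b, so with_goal contains
only a and b; without a goal it runs to a fixpoint and without_goal
also contains c and d.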

I'm planning to begin working on documentation (especially
documentation about the proof generation mechanism).  If anyone has
suggestions on how to go about this, wishes to help, or has any
questions about usage, capabilities, etc., please don't hesitate to
ask.

-- Chimezie

[1] http://code.google.com/p/python-dlp/source/checkout
[2] http://code.google.com/p/python-dlp/source/detail?r=202
Received on Saturday, 28 June 2008 13:47:31 UTC