- From: Yosi Scharf <syosi@MIT.EDU>
- Date: Mon, 29 Aug 2005 15:56:02 -0400
- To: public-cwm-talk@w3.org
I was playing with cwm/pychinko integration over the weekend, getting cwm to use the rete much more directly. I then ran some tests and saw little performance change. This fascinated me, so I looked at the pychinko tests:

-----
initial size: 1200
Testing Rules that reuse the same variables on left sides
CWM COMMAND: time /home/syosi/CVS-local/WWW/2000/10/swap/cwm.py generatedtests/testfacts.1200.n3 --ntriples --think=rules/sameVarRules.n3 --base=http://www.mindswap.org/~katz/ --purge > generatedtests/testoutput.cwm.1200.n3
408 inferred fact(s)
Pychinko time: 3.7090420723
CWM time:
real 0m7.027s
user 0m6.705s
sys 0m0.091s
-------

Here we see that pychinko took under 4 seconds on this moderately large fact base, and cwm took about twice as long. I ran the numbers myself:

------
syosi@mr-burns:~/pychinko/pychinko$ time /home/syosi/CVS-local/WWW/2000/10/swap/cwm.py generatedtests/testfacts.1200.n3 --ntriples --think=rules/sameVarRules.n3 --base=http://www.mindswap.org/~katz/ --purge > generatedtests/testoutput.cwm.1200.n3

real 0m7.135s
user 0m6.624s
sys 0m0.104s
syosi@mr-burns:~/pychinko/pychinko$
syosi@mr-burns:~/pychinko/pychinko$ time /home/syosi/CVS-local/WWW/2000/10/swap/cwm.py generatedtests/testfacts.1200.n3 --no --think=rules/sameVarRules.n3 --base=http://www.mindswap.org/~katz/ --purge > generatedtests/testoutput.cwm.1200.n3

real 0m3.892s
user 0m3.596s
sys 0m0.077s
syosi@mr-burns:~/pychinko/pychinko$ time python2.4 main.py --facts=generatedtests/testfacts.1200.n3 --rules=rules/sameVarRules.n3
408 inferred fact(s)

real 0m4.667s
user 0m4.266s
sys 0m0.100s
syosi@mr-burns:~/pychinko/pychinko$ time /home/syosi/CVS-local/WWW/2000/10/swap/cwm.py generatedtests/testfacts.1200.n3 --think=rules/sameVarRules.n3 --base=http://www.mindswap.org/~katz/ --purge > generatedtests/testoutput.cwm.1200.n3

real 0m7.482s
user 0m6.984s
sys 0m0.112s
-----

Basically, what this shows is that the large slowdown associated with cwm in many of these simple rule tests has nothing to do with cwm's reasoner being slow. For these simple rule sets, it may not be. With output suppressed (--no), the same run takes 3.9 seconds, against 7.1 seconds with --ntriples and 4.7 seconds for pychinko. Cwm's outputter is very slow, even in ntriples mode. I then told cwm not to sort the output:

-------
syosi@mr-burns:~/pychinko/pychinko$ time /home/syosi/CVS-local/WWW/2000/10/swap/cwm.py generatedtests/testfacts.1200.n3 --ntriples --think=rules/sameVarRules.n3 --base=http://www.mindswap.org/~katz/ --purge > generatedtests/testoutput.cwm.1200.n3 --ugly

real 0m5.139s
user 0m4.831s
sys 0m0.087s
--------

As you can see, two seconds were saved simply by not sorting the output, and cwm now compares much better to pychinko. Note that cwm is still doing lots of useless work when outputting ntriples; things like figuring out good prefixes are really not necessary.

I am by no means saying that cwm's reasoner is perfect, fast, or a long-term answer to anything. I'm just pointing out that many of the problems actually lie elsewhere, and that pretty printing is almost certainly the worst-performing part of cwm in many simple cases.

Yosi
Received on Monday, 29 August 2005 20:11:26 UTC