Efficiency/scalability

One thing that has come up in the fragments discussion has been the
whole issue of provable properties vs. real-world scaling (sometimes
called theoretical efficiency vs. effectiveness). Questions were
raised about the theoretical properties of some of the RDFS 3.0
stuff, where I can only say we're still exploring this, but I note that

http://www.bigdata.com/projects/multiproject/bigdata-rdf/index.html

reports on the handling of "entailments ... for RDF Schema,
owl:sameAs, owl:equivalentProperty, and owl:equivalentClass" at
speeds that are pretty amazing (loading at 21,000 triples per second
and computing 8,100 entailments per second for the RDFS+ closure of
WordNet), and in mail on the billion triples mailing list they've
proposed that we up the challenge to 10B triples to make things
interesting...

I realize they are still far less expressive than the RDFS 3.0
proposal (or any of our fragments): they have no negation and use
the realized-triples trick to create a finite universe. It's just
that 10^9 is a pretty big finite universe, and it's important to
realize that RDF DBs are reaching those sizes already, including
some RDFS and OWL constructs, so the ABox stuff is really getting
impressive (and hard to ignore).
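To make the realized-triples point concrete, here is a toy sketch (my own illustration, nothing to do with how bigdata-rdf actually works) of forward-chaining materialization: run a couple of entailment rules to a fixpoint, so every inferred triple becomes an explicit triple in a finite store.

```python
# Toy forward-chaining materializer for two entailment rules:
#   rdfs9:  (x rdf:type C), (C rdfs:subClassOf D)  =>  (x rdf:type D)
#   sameAs: (x owl:sameAs y), (x p o)              =>  (y p o)
# Fixpoint iteration over an in-memory set of triples; a real store
# does this far more cleverly, but the "finite universe" idea is the same.

TYPE, SUBCLASS, SAMEAS = "rdf:type", "rdfs:subClassOf", "owl:sameAs"

def closure(triples):
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for (s, p, o) in triples:
            if p == TYPE:
                # rdfs9: push instances up the class hierarchy
                for (s2, p2, o2) in triples:
                    if p2 == SUBCLASS and s2 == o:
                        new.add((s, TYPE, o2))
            elif p == SAMEAS:
                # propagate statements across sameAs (one direction shown;
                # add the symmetric rule for full owl:sameAs semantics)
                for (s2, p2, o2) in triples:
                    if s2 == s and p2 != SAMEAS:
                        new.add((o, p2, o2))
        if not new <= triples:
            triples |= new
            changed = True
    return triples

facts = {
    ("fido", TYPE, "Dog"),
    ("Dog", SUBCLASS, "Animal"),
    ("fido", SAMEAS, "rex"),
}
inferred = closure(facts)
```

After the fixpoint, `inferred` also contains ("fido", rdf:type, "Animal") and ("rex", rdf:type, "Dog"), i.e. the entailments have been "realized" as ordinary triples that any plain triple index can then serve.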
  -JH
p.s. this is by no means meant to endorse bigdata-rdf, a project I know
nothing about beyond what is on that web site.

"If we knew what we were doing, it wouldn't be called research, would
it?" - Albert Einstein

Prof James Hendler				http://www.cs.rpi.edu/~hendler
Tetherless World Constellation Chair
Computer Science Dept
Rensselaer Polytechnic Institute, Troy NY 12180

Received on Wednesday, 19 December 2007 04:12:33 UTC