- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Sun, 16 Feb 2003 20:18:47 +0000
- To: Sean Bechhofer <seanb@cs.man.ac.uk>
- CC: WebOnt WG <www-webont-wg@w3.org>
> And if someone actually has implemented a streaming OWL parser, simply let
> me have it and then I'll shut up and go away :-)

Well - I have a streaming RDF/XML parser. The standard mode of operation keeps the rdf:nodeID's in parser memory (which is not constant size), but for very large files it is possible to have the client keep these elsewhere (e.g. on disk).

I have done some experiments with the following design pattern (a rough sketch in code is appended below):

  RDF/XML => N-Triples    (essentially O(n), not proven)
  (store the N-Triples file on disk)
  N-Triples => sort       O(n log n)
  ... further processing

It is not completely clear what your underlying requirement is - you are going to need to store the ontology somewhere - so this seems to be about how to turn RDF/XML into the abstract syntax form efficiently. Sorted triples allow you to build most of the abstract constructs quite easily.

You say you don't want to use Jena RDB, but at some point an ontology will necessarily be too big to fit in memory, and then you have to use disk. There are inevitably long-distance interactions in understanding an ontology - and there are a few extra ones in making sense of RDF/XML, or of OWL as RDF/XML.

Also, Jena is still a long way from being fully engineered and optimized - in fact, the optimization work done on Jena so far amounts to no more than days of effort - which means that you hit the point where memory runs out sooner than you might expect.

Jeremy
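A minimal sketch of the parse-then-sort pattern described above, written against the modern Apache Jena streaming API (RIOT) rather than the 2003-era parser discussed in this message; the file names and the in-memory sort are assumptions made purely for illustration, and a file too large for RAM would need an external merge sort (or the Unix sort utility) in step 2.

```java
// Illustrative sketch of the RDF/XML => N-Triples => sort pipeline.
// Uses current Apache Jena (RIOT) class names, not the parser from 2003.
import org.apache.jena.riot.Lang;
import org.apache.jena.riot.RDFParser;
import org.apache.jena.riot.system.StreamRDFLib;

import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;

public class ParseThenSort {
    public static void main(String[] args) throws Exception {
        Path ntriples = Paths.get("ontology.nt");        // hypothetical file names
        Path sorted   = Paths.get("ontology-sorted.nt");

        // Step 1: stream RDF/XML to an N-Triples file on disk, roughly O(n).
        // Triples are written out as they are parsed, so parser memory stays
        // small apart from bookkeeping such as blank-node labels.
        try (OutputStream out = Files.newOutputStream(ntriples)) {
            RDFParser.create()
                     .source("ontology.rdf")             // hypothetical input
                     .lang(Lang.RDFXML)
                     .parse(StreamRDFLib.writer(out));
        }

        // Step 2: sort the N-Triples lines, O(n log n).  This sketch sorts
        // in memory; a truly huge file would need an external merge sort.
        List<String> lines = Files.readAllLines(ntriples);
        Collections.sort(lines);
        Files.write(sorted, lines);

        // Step 3 ("further processing"): each N-Triples line begins with its
        // subject, so the sorted file groups all statements about one node
        // together, ready for assembling the abstract syntax constructs.
    }
}
```

Because every N-Triples statement starts with its subject, sorting the file brings all the triples about a node together, which is what makes building most of the abstract syntax constructs in a single sequential pass practical.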