RE: rdfapi question

So speed is the big issue here, I assume...

You may end up having to write some custom parsing code.
A generic RDF parser will assume a more flexible RDF
than you are using, so many of the things you are trying
to avoid may happen behind the scenes before the parser
even gives you your data. In other words, putting a race
car driver in a golf cart won't get you there much faster.

Look for a parser that will stream the data to you, rather
than one that processes it all and hands you a blob (SAX vs DOM).
I know the RDF API does the blob version. I believe, though,
that I saw somewhere that the newer version of the API supports
the faster method. If it does, look into that. If it
doesn't, look for a generic SAX parser. Since you're
using a restricted version of RDF, fitting the parser to grab
your stuff won't be as difficult as writing a full-blown RDF parser.
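
To give a feel for it, here's a minimal sketch of that streaming approach
with a plain JAXP SAX parser. The element and attribute names ("Class",
"subClassOf") are stand-ins for whatever restricted subset of RDFS/DAML your
ontology actually uses, not anything from a real parser:

import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Sketch: stream the restricted RDF with SAX instead of building a full
// in-memory model, acting on each element as it arrives.
public class RestrictedRdfHandler extends DefaultHandler {

    public void startElement(String uri, String localName,
                             String qName, Attributes attrs) {
        if ("Class".equals(localName)) {
            String about = attrs.getValue("rdf:about");
            // create your object for 'about' here, as soon as it streams in
        } else if ("subClassOf".equals(localName)) {
            String parent = attrs.getValue("rdf:resource");
            // link the current object to 'parent' here
        }
    }

    public static void main(String[] args) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);   // so localName is filled in
        SAXParser parser = factory.newSAXParser();
        parser.parse(args[0], new RestrictedRdfHandler());
    }
}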

Note, though, that you can't avoid searching for parents unless
your hierarchy is a simple chain (a tree with no branches).
Even in a simple case:
A
B subOf A
C subOf A
once you get to C, you still have to search for A to make the link
to it.
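
Concretely, even with the triples arriving exactly in that order, handling C
still comes down to a lookup in whatever URI-to-object map you keep. A tiny
hypothetical sketch (the Node class is made up):

import java.util.HashMap;
import java.util.Map;

// Sketch: even with in-order triples (A, B subOf A, C subOf A), handling
// "C subOf A" still means finding A in the map.
public class ParentLookupSketch {
    static class Node {
        String uri; Node parent;
        Node(String uri) { this.uri = uri; }
    }

    public static void main(String[] args) {
        Map nodes = new HashMap();              // URI -> Node
        nodes.put("A", new Node("A"));
        nodes.put("B", new Node("B"));
        // the triple "C subOf A" arrives:
        Node c = new Node("C");
        c.parent = (Node) nodes.get("A");       // the unavoidable search
        nodes.put("C", c);
    }
}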



-----Original Message-----
From: Chris Cera [mailto:cera@drexel.edu]
Sent: Tuesday, May 08, 2001 11:46 AM
To: Balon, Corey
Cc: 'Chris Cera'; www-rdf-interest@w3.org
Subject: Re: rdfapi question


* Balon, Corey <cbalon@grci.com> [010508 15:13]:
> You're never guaranteed to have the superclass
> declared before the subclass in the RDF. So, even if the
> triples come out in order, you'll still have this problem.

I would design my ontology with this in mind, so as to avoid the
search problem entirely.  The problem is that the interface for
Model doesn't require the Enumeration of triples returned by
elements() to be in any particular order.  I'm wondering if
somebody knows of something that can produce (or has already
implemented) them in exactly the order defined in the RDFS/DAML
file you're reading.

The second of your methods involves searching through some container
of class objects to retrieve objects that need to have data
members set because we have received new information about an object
we already created.  I suppose I could insert some sort of key
for the classes so that a balanced tree or better data structure
could be built, thus guaranteeing O(log n) for the search
operation.  Any suggestions on the best data structure and/or
algorithm?
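
For what it's worth, if the key is just the class URI, the JDK already covers
this: java.util.TreeMap gives guaranteed O(log n) lookups, and a HashMap gives
expected O(1) at the cost of ordering. A tiny sketch under that assumption
(the URI and class names are made up):

import java.util.Map;
import java.util.TreeMap;

// Sketch: index objects by their URI.  TreeMap gives guaranteed O(log n)
// lookups; swapping in a HashMap gives expected O(1) instead.
public class UriIndexSketch {
    public static void main(String[] args) {
        Map byUri = new TreeMap();                    // URI -> object
        byUri.put("http://example.org/onto#Shape", new Object());

        // Later, when "Circle subClassOf Shape" shows up and Shape already
        // exists, retrieving it is a single map lookup:
        Object shape = byUri.get("http://example.org/onto#Shape");
    }
}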

This is for a CSCW tool which must send strings over the network 
and be as close to real time as possible for synchronized 
graphics display.  I'm planning on requiring this extra 
restriction on the ontology to eliminate the time for the search.

> 1) do two passes. on pass one, create all the objects and keep
> 	a mapping from the object's URI to the object. In pass two
> 	do the relationships between the objects (looking up the objects
> 	in the map).

I suppose that hashing on that would be the least expensive 
method for this approach.  This would probably be the most
robust, since the RDF file wouldn't require the ordering
restriction (the way it's supposed to be).  Thanks a lot; any 
better suggestions would be greatly appreciated.
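
A rough sketch of that two-pass idea with a HashMap keyed on URI. The
Triple/Node shapes here are simplified stand-ins, not the RDF API's own
interfaces:

import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Two-pass sketch over an already-parsed list of (subject, predicate, object)
// triples.  Pass 1 creates one node per URI; pass 2 wires up subClassOf links
// by looking the parent up in the map.
public class TwoPassLinker {
    static class Triple {
        String s, p, o;
        Triple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
    }
    static class Node {
        String uri; Node parent;
        Node(String uri) { this.uri = uri; }
    }

    static Map link(List triples) {
        Map byUri = new HashMap();                                 // URI -> Node
        for (Iterator it = triples.iterator(); it.hasNext(); ) {   // pass 1
            Triple t = (Triple) it.next();
            if (!byUri.containsKey(t.s)) byUri.put(t.s, new Node(t.s));
            if (!byUri.containsKey(t.o)) byUri.put(t.o, new Node(t.o));
        }
        for (Iterator it = triples.iterator(); it.hasNext(); ) {   // pass 2
            Triple t = (Triple) it.next();
            if ("subClassOf".equals(t.p)) {
                Node child = (Node) byUri.get(t.s);
                child.parent = (Node) byUri.get(t.o);              // expected O(1) lookup
            }
        }
        return byUri;
    }
}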

Received on Tuesday, 8 May 2001 14:02:48 UTC