SiRPAC proposal (concrete)

Hello!

I would like to propose some improvements to SiRPAC (again, but this time
more concretely). To simplify the discussion, I have split my proposal into
three small, independent parts. If the proposals are (at least partially)
accepted, I could take over some of the implementation work if that helps
the SiRPAC team. My proposals are:


1. Make SiRPAC itself an RDF consumer

Currently, if I register an RDF consumer with SiRPAC, SiRPAC still fills
its internal triple storage (m_triples).

If I have very large RDF files and want to use SiRPAC to parse
them in order to put the content into a database, this behavior
can exhaust the machine's memory.

My proposal is:

- SiRPAC itself implements RDFConsumer

- if no RDFConsumer is registered, SiRPAC 
  registers itself as the default consumer

- if an RDFConsumer is registered, SiRPAC 
  does not consume the assert events 


A further advantage is:

- If I have another RDF parser, e.g. for another 
  RDF syntax, I could use the SiRPAC default RDFConsumer 
  to let SiRPAC build its triple list. 

Currently, it seems possible to achieve the desired behavior
by subclassing SiRPAC and overriding the addTriple
method. But then I would disable the other handlers...
Also, having two different mechanisms for the same purpose
does not look very consistent.
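
As a rough sketch of how this first proposal could look (this is not
actual SiRPAC code: TripleConsumer, SelfConsumingParser, setConsumer
and emit are made-up names, and the three-String addTriple signature
is a simplification for illustration only):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for RDFConsumer.
interface TripleConsumer {
    void addTriple(String subject, String predicate, String object);
}

// Hypothetical parser that acts as its own default consumer.
class SelfConsumingParser implements TripleConsumer {
    // Plays the role of SiRPAC's internal m_triples storage.
    private final List<String[]> triples = new ArrayList<>();
    // Default: the parser consumes its own events.
    private TripleConsumer consumer = this;

    // Registering an external consumer replaces the default one;
    // passing null restores the default.
    void setConsumer(TripleConsumer c) {
        consumer = (c != null) ? c : this;
    }

    // Default consumer behavior: store the triple in memory.
    @Override
    public void addTriple(String subject, String predicate, String object) {
        triples.add(new String[] { subject, predicate, object });
    }

    // Called by the parsing code for every recognized statement.
    void emit(String subject, String predicate, String object) {
        consumer.addTriple(subject, predicate, object);
    }

    int storedTripleCount() {
        return triples.size();
    }
}

class ConsumerSketchDemo {
    public static void main(String[] args) {
        // No consumer registered: triples end up in the internal list.
        SelfConsumingParser standalone = new SelfConsumingParser();
        standalone.emit("http://example.org/doc", "http://example.org/creator", "Stefan");
        System.out.println(standalone.storedTripleCount());

        // External consumer registered: the internal list stays empty.
        SelfConsumingParser streaming = new SelfConsumingParser();
        streaming.setConsumer((s, p, o) -> System.out.println(s + " " + p + " " + o));
        streaming.emit("http://example.org/doc", "http://example.org/creator", "Stefan");
        System.out.println(streaming.storedTripleCount());
    }
}
```

With this shape, a parser for another RDF syntax could simply call
addTriple on a SelfConsumingParser instance to have the triple list
built for it.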


2. Split the parser from the SiRPAC compiler and put it into
   a sub-package org.w3c.rdf.parser

Currently, it is difficult to find the relevant methods
in the SiRPAC class because it is cluttered
with all the public parser and DocumentHandler methods.
That makes using SiRPAC look much more complicated than
it is.

Interface readability could be further improved by
moving the SiRPAC examples out of the
main package.


3. Modify the RDFConsumer interface 

I would like to propose replacing

void assert (DataSource ds, Property predicate, 
             Resource subject, RDFnode object);

with

void assert (String predicate, String subject,
             int objectType, Object object);

where objectType is one of LITERAL, RESOURCE, LITERAL_XML.
object.toString () MUST return a valid String
representation of the original content. Declaring the
parameter as Object allows DOM walker parsers to return
a (DOM-)Node that can be reused.

Building Property, Resource and RDFnode objects
from the Strings is left to the processor,
since some processors may have their own implementations
of these objects.
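
To illustrate, here is a minimal sketch of the String-based interface
and of a processor that builds its own representation. Everything in
it is an assumption for illustration: the names StringRDFConsumer and
LoggingConsumer, the numeric constant values, and the output format
are made up. The method is called assertStatement because "assert" is
a reserved word in Java 1.4 and later, and the resource constant is
spelled RESOURCE.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rendering of the proposed consumer interface.
interface StringRDFConsumer {
    // Placeholder values; the proposal does not fix them.
    int LITERAL = 0;
    int RESOURCE = 1;
    int LITERAL_XML = 2;

    // Named assertStatement because `assert` is reserved in modern Java.
    void assertStatement(String predicate, String subject,
                         int objectType, Object object);
}

// Hypothetical processor that turns each event into a line of text
// instead of building Property/Resource/RDFnode objects.
class LoggingConsumer implements StringRDFConsumer {
    final List<String> lines = new ArrayList<>();

    @Override
    public void assertStatement(String predicate, String subject,
                                int objectType, Object object) {
        // The proposal requires toString() to yield the original content.
        String value = object.toString();
        String rendered;
        switch (objectType) {
            case RESOURCE:
                rendered = "<" + value + ">";
                break;
            case LITERAL_XML:
                rendered = "xml(" + value + ")";
                break;
            default: // LITERAL
                rendered = "\"" + value + "\"";
                break;
        }
        lines.add("<" + subject + "> <" + predicate + "> " + rendered);
    }
}

class InterfaceSketchDemo {
    public static void main(String[] args) {
        LoggingConsumer consumer = new LoggingConsumer();
        consumer.assertStatement("http://example.org/title",
                "http://example.org/doc", StringRDFConsumer.LITERAL, "SiRPAC");
        // Any object with a suitable toString() can be passed, for example
        // a DOM node wrapper; a StringBuilder stands in for one here.
        consumer.assertStatement("http://example.org/body",
                "http://example.org/doc", StringRDFConsumer.LITERAL_XML,
                new StringBuilder("<b>text</b>"));
        for (String line : consumer.lines) {
            System.out.println(line);
        }
    }
}
```

A processor with its own Property or Resource classes would construct
them inside assertStatement instead of formatting strings.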

I am very interested in your opinion.

Best regards

Stefan Haustein

-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml

Received on Sunday, 9 January 2000 10:25:17 UTC