- From: Sean B. Palmer <sean+wa@infomesh.net>
- Date: Tue, 19 Oct 2004 21:11:44 +0100
- To: eikeon@eikeon.com
- CC: www-archive@w3.org
Hi eikeon,

Noting that both of our N-Triples parsers are rather bug-ridden, I've written a replacement for them both:

http://inamidst.com/proj/rdf/ntriples.py - N-Triples Parser

It won't slot into either rdflib or pyrple as it is at the moment, but to make it fit into rdflib it should be a trivial matter of writing a five-or-so line wrapper. To make it slot into pyrple might take a bit of rewriting since having seen rdflib I've reconsidered some of the architecture.

Generally, I think it's a good idea to have the parsers do all the real parsing work instead of retaining any of the unparsed input, so instead of pretending that that can be independent, ntriples.unquote is now doing all of that work--and hopefully without any of the bugs of our old versions!

It's updated to the latest version of the specification, so the regular expression now doesn't allow you to parse a literal with both a language and a datatype. I've been careful to make this code as efficient as possible, even going so far as to benchmark a couple of different approaches for passing through safe literal characters (regexp won over sets).

It reads buffered input via a custom readline method, and then does a recursive descent/regexp parse on the lines, and even handles the fact that URIs in N-Triples can use two different escaping mechanisms.

It's not as flexible as my old pyrple module (which allowed universally quantified variables and literals as subjects, optionally), but all it requires to modify it are a few changes to some of the highly modularised methods.

Of course, this is just the very start--the low hanging fruit. It'd be nice if we could start thinking about some of the fundamental design issues; a few examples:

* Should there be separate stores for database/in-memory, or should that be configurable as an option to Graph/TripleStore?

* What should we call Graph/TripleStore anyway? I named it Graph after amk's sketch of an RDF API [1], where he says: "g = rdf.Graph() # Call this model/dataset/something else?"

* Do we really need to have a separate Schema/Ontology class?

Have you had a chance to go through the pyrple code yet? I'd really like to just take the union of features as much as possible from our APIs, then we can have separate stuff building on top of that. For example, rdflib's subject_predicates is a convenience function (and I feel it makes code less readable), as is my getRules method etc.

Can't wait until we compare our query approaches :-)

[1] http://www.amk.ca/conceit/rdf-interface.html

--
Sean B. Palmer, http://inamidst.com/sbp/
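
As a rough illustration of the literal handling described in the message (this is not the code from ntriples.py), the sketch below shows one way a regular expression can accept a language tag or a datatype but never both, with a separate unquote step resolving the escapes so that no unparsed input is kept around. The names LITERAL, ESCAPE, SIMPLE, unquote and parse_literal are assumptions made up for this example, and the escape set is assumed to be the usual N-Triples string escapes (\t, \n, \r, \", \\, \uXXXX, \UXXXXXXXX).

```python
import re

# Hypothetical names throughout; this is not the code from ntriples.py.
# A literal is a quoted string followed by EITHER @lang OR ^^<datatype>.
LITERAL = re.compile(
    r'"((?:[^"\\]|\\.)*)"'          # quoted body, escapes still in place
    r'(?:@([a-z]+(?:-[a-z0-9]+)*)'  # optional language tag ...
    r'|\^\^<([^>]*)>)?$'            # ... or optional datatype URI, not both
)

# A backslash followed by uXXXX, UXXXXXXXX, or a single character.
ESCAPE = re.compile(r'\\(?:u([0-9A-F]{4})|U([0-9A-F]{8})|(.))')
SIMPLE = {'t': '\t', 'n': '\n', 'r': '\r', '"': '"', '\\': '\\'}


def unquote(body):
    """Resolve every escape sequence in the body of a quoted literal."""
    def replace(match):
        u4, u8, char = match.groups()
        if u4 or u8:
            return chr(int(u4 or u8, 16))
        if char in SIMPLE:
            return SIMPLE[char]
        raise ValueError('bad escape: \\' + char)
    return ESCAPE.sub(replace, body)


def parse_literal(term):
    """Return (value, language, datatype); at most one of the last two is set."""
    match = LITERAL.match(term)
    if match is None:
        raise ValueError('not a valid literal: ' + term)
    body, lang, datatype = match.groups()
    return unquote(body), lang, datatype
```

With this sketch, parse_literal('"chat"@fr') returns ('chat', 'fr', None), while a term carrying both a language tag and a datatype fails to match and raises ValueError.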
Received on Tuesday, 19 October 2004 20:12:16 UTC