More on query and report generator

Some further news on the query/report generator software I mentioned a 
little while ago [1].

Some real results from this software are now available in the form of 
proposed email header field registry data [2].  I've been working with Mark 
Nottingham to prepare a registry of mail, HTTP and other protocol header 
fields loosely belonging to the RFC822/MIME family.

The software itself [3] has been updated since the original announcement to 
fix some N3 parser bugs, to support queries with optional and alternative 
variable bindings in query patterns, and to support corresponding optional 
and alternative sections in formatter templates.
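
To illustrate what optional bindings and optional template sections mean 
here, a minimal Python sketch follows.  It is not the code from [3]:  the 
query() and render() helpers, the tuple representation of triples, and the 
hdr:/ex: names are all invented for this example.

```python
# Illustrative sketch only: helper names and the hdr:/ex: vocabulary are
# assumptions made for this example, not taken from the software in [3].

def is_var(term):
    """Variables are written as strings starting with '?'."""
    return isinstance(term, str) and term.startswith("?")

def match_triple(pattern, triple, binding):
    """Return an extended copy of `binding` if `pattern` matches `triple`,
    otherwise None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against an earlier binding.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new

def query(graph, required, optional=()):
    """Match all `required` patterns; extend with `optional` ones where
    possible, keeping the binding unchanged where they do not match."""
    bindings = [{}]
    for pattern in required:
        bindings = [b2 for b in bindings for triple in graph
                    if (b2 := match_triple(pattern, triple, b)) is not None]
    for pattern in optional:
        extended = []
        for b in bindings:
            matches = [b2 for triple in graph
                       if (b2 := match_triple(pattern, triple, b)) is not None]
            extended.extend(matches or [b])
        bindings = extended
    return bindings

def render(binding):
    """Formatter with an optional section, emitted only if ?spec is bound."""
    line = binding["?name"]
    if "?spec" in binding:
        line += " (defined in " + binding["?spec"] + ")"
    return line

# A list keeps the iteration order, and hence the result order, predictable.
graph = [
    ("hdr:To", "rdf:type", "ex:HeaderField"),
    ("hdr:To", "ex:name", "To"),
    ("hdr:To", "ex:spec", "RFC 2822"),
    ("hdr:X-Foo", "rdf:type", "ex:HeaderField"),
    ("hdr:X-Foo", "ex:name", "X-Foo"),
]

results = query(
    graph,
    required=[("?h", "rdf:type", "ex:HeaderField"), ("?h", "ex:name", "?name")],
    optional=[("?h", "ex:spec", "?spec")],
)
lines = [render(b) for b in results]
```

With this data, the hdr:X-Foo entry has no ex:spec triple, so it still 
appears in the results but without a ?spec binding, and the formatter 
simply omits the corresponding section.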

My N3 data for mail headers was scraped by hand from HTML.  Mark is using 
cwm to generate data for HTTP headers.  The report generator was 
subsequently used to generate both HTML and RFC2629/XML data, the latter 
being used to generate documents in plain text, HTML and nroff formats.

How did using RDF/N3 help?
- reducing choices:  by providing an existing, flexible data format as a 
starting point for design
- it was easy to leverage and build upon existing schema design work 
(notably foaf:)
- not being forced to work within predefined schema constraints has made it 
very easy to develop and evolve the information content
- simplicity:  compared with a language like XPath, the basic query pattern 
language is almost ridiculously easy.
- communication:  using N3/RDF has made it easy for Mark and me to 
communicate our ideas about data structuring.
- exchanging/combining information derived from different sources:  we are 
starting to work with information that comes from different sources and has 
very different presentation structures;  the uniformity of RDF's 
underlying graph model makes it easy to combine such information.
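
The last point can be sketched in a few lines of Python.  This is only an 
illustration under simplifying assumptions:  triples are plain tuples, the 
hdr:/ex: names are made up, and there are no blank nodes (a real RDF merge 
has to keep blank nodes from different graphs apart).

```python
# Illustrative sketch only: the vocabulary and data are invented for this
# example.  Each graph is a set of (subject, predicate, object) triples.

mail_graph = {
    ("hdr:Date", "ex:name", "Date"),
    ("hdr:Date", "ex:usedIn", "ex:Mail"),
    ("hdr:To", "ex:name", "To"),
    ("hdr:To", "ex:usedIn", "ex:Mail"),
}

http_graph = {
    ("hdr:Date", "ex:name", "Date"),
    ("hdr:Date", "ex:usedIn", "ex:HTTP"),
    ("hdr:Accept", "ex:name", "Accept"),
    ("hdr:Accept", "ex:usedIn", "ex:HTTP"),
}

# Without blank nodes, combining the two graphs is just a set union;
# the triple both sources share is stored once.
merged = mail_graph | http_graph

# Statements about hdr:Date from both sources now sit side by side.
date_protocols = sorted(
    o for (s, p, o) in merged if s == "hdr:Date" and p == "ex:usedIn"
)
```

However differently the two sources were presented originally, once they 
are reduced to triples the combination step needs no knowledge of either 
layout.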

Any disadvantages?

- report generator performance is currently very poor.  I think this is 
because I'm interpreting "little languages" directly out of a simplistic 
RDF/N3 database.  I'm also using double interpretation:  my "little 
languages" are interpreted by an interpreter that is itself written in an 
interpreted language (Python).  I'm fairly confident I can make some 
dramatic improvements here, so the jury is still out on this.
- While N3 is a very easy language for representing RDF (much easier for 
human use than RDF/XML), and it has been used with some success to 
represent my query and formatter "little languages", it's not the most 
obvious format to use.  (I'm reminded of using a macro assembler many years 
ago, to code interpreted data structures.)  If I end up doing a lot of work 
with these little languages, I'd want to design a presentation syntax that 
is easily compiled to RDF.

Graham Klyne

Received on Thursday, 9 May 2002 02:51:43 UTC