Release 0.7.1 and bug summary

Cwm release 0.7.1 was made today 2004-03-04.

It was inspired by bug fixes and one thing leading to another.
Most of the changes are in the RDF/XML and N3 parsers and generators.
The following details are also in
http://www.w3.org/2000/10/swap/doc/changes.html

Bugs in public-cwm-bugs@w3.org
http://lists.w3.org/Archives/Public/public-cwm-bugs
are currently all closed. Many bugs from the previous generation of
bug reports (to www-archive+n3bugs@w3.org) are still open; see
http://www.w3.org/2000/10/swap/admin/N3-Bugs.ics
Many of those may be out of date. I have yet to do a triage on them.


There was a bug in which xml:base was not respected in the default parser.

Rather than generate a new test for xml:base, it seemed more 
appropriate to revisit the RDF Core parser tests.  The regression test 
engine can run these tests, but they had not been incorporated 
into the regression test. They now have been.

The cant.py NTriple canonicalizer, which is used for testing, had a bug 
which is now fixed. While being fixed, cant.py was enhanced to do an 
internal canonicalization and then comparison of two files  (python 
cant.py -f file1 -d file2) to avoid having to canonicalize the two files 
separately and then compare them.  This makes the tests faster.
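
The idea of canonicalizing two files in memory and comparing them in 
one step can be sketched as below. This is only an illustration and not 
the cant.py code: the function names are made up here, and it assumes 
ground N-Triples (no blank nodes), so it skips the node relabelling that 
a real canonicalizer has to do.

```python
def canonical_lines(text):
    """Return the sorted, de-duplicated triple lines of an N-Triples text.

    Assumption: the graph is ground (no blank nodes), so sorting the
    trimmed lines is enough to get a canonical form.
    """
    lines = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        lines.add(line)
    return sorted(lines)


def same_graph(text1, text2):
    """Compare two N-Triples texts after canonicalizing both in memory."""
    return canonical_lines(text1) == canonical_lines(text2)
```

Doing both canonicalizations in one process avoids writing two 
intermediate files and running a separate diff over them.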

Things which are not tested in the regression test at the moment include 
rdflib as a parser, the --closure= flags, and the API.

RDF Parser improved

The cwm regression test now incorporates the RDF Core Positive Parser 
Tests except for those which deal with reification or with XML 
literals. In the process, xml:base support was added in the parser.

A new test found in the updated core tests requires RDF to be parsed 
even when there is no enveloping <rdf:RDF> tag, even if the outermost 
element is a typed node production, and so not something in the RDF 
namespace at all. This makes RDF much less self-describing, and 
increases the risk that one might parse, say, an HTML file as RDF by 
accident. Use with care.  If you need this feature, use the --rdf=R flag.

The RDF core tests are done with --rdf=RT to make the parser parse 
naked RDF or RDF buried in foreign XML.

nodeid generated on RDF output

This has been a missing feature of the RDF generator for a while. The 
nodeid feature allows bnodes to be output in RDF/XML. I may not have 
got this right, as I don't have RDF generation tests, only RDF parse 
tests.

Ordering of output

The ordering of Terms has been changed. Automatically generated terms 
with no URIs sort after anything which has a URI.
This will change the order of N3 and RDF/XML output but does not change 
its semantics.
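
As an illustration of that ordering rule, here is a minimal sketch of a 
sort key that puts terms without a URI after everything that has one. 
The Term class and its fields are hypothetical and are not cwm's own 
data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Term:
    uri: Optional[str]  # None for an automatically generated term
    serial: int = 0     # hypothetical tie-breaker for generated terms


def sort_key(term):
    """Terms with a URI sort first (by URI); generated terms sort last."""
    if term.uri is not None:
        return (0, term.uri, 0)
    return (1, "", term.serial)


terms = [
    Term(None, serial=2),
    Term("http://example.org/b"),
    Term(None, serial=1),
    Term("http://example.org/a"),
]
ordered = sorted(terms, key=sort_key)
```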

Namespace prefix smarts on output

Cwm now does output in a two-pass process. This makes its counting of 
the number of occurrences of namespaces more accurate, which determines 
the default namespace it chooses. This does take more time, though not 
as long as the previous method of working out which was going to be 
most common. To skip this process, use the "d" flag on output (N3 or 
RDF/XML) to suppress the use of a default namespace.
Because this counting is now accurate, cwm now suppresses namespace 
prefix declarations which are not actually needed in the output.
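
The counting pass can be pictured roughly as follows. This is a sketch 
under assumptions, not the cwm implementation: the function names are 
invented, and a namespace is taken to be everything up to and including 
the last "#" of a URI.

```python
from collections import Counter


def namespace_of(uri):
    """Return the part of the URI up to and including the last '#', or None."""
    head, sep, _local = uri.rpartition("#")
    return head + sep if sep else None


def plan_namespaces(uris):
    """First pass: count namespace use over every URI to be written.

    Returns the most common namespace (the candidate default) and the
    set of namespaces that actually occur, so that unused prefix
    declarations can be left out in the second, writing pass.
    """
    counts = Counter(ns for ns in map(namespace_of, uris) if ns is not None)
    default = counts.most_common(1)[0][0] if counts else None
    return default, set(counts)
```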

Cwm will also make up prefixes when it needs them for a namespace and 
none of the input data uses one. It peeks into the namespace URI 
and looks around for a short string after the last "/", adding numbers 
if necessary to make the prefix unique.
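
A rough sketch of that kind of prefix guessing is below. The details 
are assumptions made for illustration (the six-character limit, the 
"ns" fallback, and the function name are not taken from cwm).

```python
import re


def make_prefix(namespace_uri, used):
    """Guess a short, unused prefix from the tail of a namespace URI.

    `used` is the set of prefixes already taken; the chosen prefix is
    added to it.
    """
    stem = namespace_uri.rstrip("#/")
    tail = stem.rsplit("/", 1)[-1]  # text after the last "/"
    match = re.search(r"[A-Za-z][A-Za-z0-9]*", tail)
    base = match.group(0)[:6].lower() if match else "ns"  # assumed fallback
    candidate = base
    n = 1
    while candidate in used:  # add numbers until the prefix is unique
        n += 1
        candidate = f"{base}{n}"
    used.add(candidate)
    return candidate
```

For example, with an empty set of used prefixes, 
http://www.w3.org/2000/10/swap/log# would yield "log" in this sketch.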

Namespaces without hashes

Cwm, when writing N3, does not normally use namespace names for URIs 
which do not have a "#". Including a "/" in the flags overrides this:

    cwm mydcdata.n3 --n3="/"

Namespaces which end in "/" are architecturally flawed, as the names 
of things in the namespace look like HTTP documents, whatever they 
are. The most notorious miscreants here are Dublin Core and FOAF. If 
you use these namespaces, you may want to do this, or you may want to 
encourage the authors to change to use a "#".
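
The effect of the "/" flag on abbreviation can be sketched like this. 
It is an illustration only: split_uri and its allow_slash parameter are 
hypothetical names, not part of cwm.

```python
def split_uri(uri, allow_slash=False):
    """Split a URI into (namespace, local name) for abbreviation.

    By default only a "#" is accepted as the split point; with
    allow_slash=True a URI without "#" is split after its last "/".
    Returns (None, uri) when the URI should be written out in full.
    """
    i = uri.rfind("#")
    if i < 0 and allow_slash:
        i = uri.rfind("/")
    if i < 0:
        return None, uri
    return uri[:i + 1], uri[i + 1:]
```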


Tim BL

Received on Saturday, 6 March 2004 23:24:48 UTC