CWM Bug: parseType="Literal" Consumption and Production

In trying to get cwm-1.2.0a1 to output an rdf:XMLLiteral without
escaping it (is this possible?), I found a valid RDF/XML file that it
can't parse:

$ cat test.rdf
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns="http://example.org/#">
<rdf:Description rdf:about="http://example.org/#p">
 <q rdf:parseType="Literal"><p xmlns="">...</p></q>
</rdf:Description>
</rdf:RDF>

(The file validates according to the W3C RDF Validator.)

$ cwm --rdf test.rdf
Traceback (most recent call last):
  File "/usr/local/bin/cwm", line 740, in <module>
    doCommand()
  File "/usr/local/bin/cwm", line 451, in doCommand
    why=myReason)
  File ".../python2.5/site-packages/swap/webAccess.py", line 184, in load
    p.feed(buffer)
  File ".../python2.5/site-packages/swap/sax2rdf.py", line 769, in feed
    self._p.feed(data)
  File ".../python2.5/xml/sax/expatreader.py", line 207, in feed
    self._parser.Parse(data, isFinal)
  File ".../python2.5/xml/sax/expatreader.py", line 360, in start_namespace_decl
    self._cont_handler.startPrefixMapping(prefix, uri)
  File ".../python2.5/site-packages/swap/sax2rdf.py", line 362, in startPrefixMapping
    uri = self.uriref(uri)
  File ".../python2.5/site-packages/swap/sax2rdf.py", line 209, in uriref
    return uripath.join(self._base, str)
  File ".../python2.5/site-packages/swap/uripath.py", line 103, in join
    slashl = find(there, '/')
  File ".../python2.5/string.py", line 359, in find
    return s.find(*args)
AttributeError: 'NoneType' object has no attribute 'find'
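
My reading of the traceback is that the inner xmlns="" (which undeclares
the default namespace) reaches startPrefixMapping with no URI at all, and
that uripath.join then chokes on it. Below is a minimal sketch to check
that reading, using only the standard library's xml.sax on Python 3 rather
than cwm itself. PrefixRecorder and record_prefix_mappings are hypothetical
names of mine, not part of swap.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces

# Same document as test.rdf above.
TEST_RDF = b"""<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
   xmlns="http://example.org/#">
<rdf:Description rdf:about="http://example.org/#p">
 <q rdf:parseType="Literal"><p xmlns="">...</p></q>
</rdf:Description>
</rdf:RDF>
"""


class PrefixRecorder(ContentHandler):
    """Hypothetical handler: records every namespace declaration it is sent."""

    def __init__(self):
        super().__init__()
        self.mappings = []

    def startPrefixMapping(self, prefix, uri):
        # sax2rdf.py resolves `uri` against the base at this point;
        # here we only keep what the parser hands over.
        self.mappings.append((prefix, uri))


def record_prefix_mappings(data):
    """Parse `data` with namespace processing on; return (prefix, uri) pairs."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = PrefixRecorder()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.mappings


if __name__ == "__main__":
    for prefix, uri in record_prefix_mappings(TEST_RDF):
        print(repr(prefix), repr(uri))
```

If the last pair printed has no URI, then a guard for a missing uri in
startPrefixMapping would presumably avoid this particular crash, though I
haven't checked what sax2rdf.py ought to do with the undeclared namespace
afterwards.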

It doesn't work in (my patched...) cwm-1.0.0 either, but it does give
a friendlier error message:

  File ".../cwm-1.0.0/swap/sax2rdf.py", line 456, in startElementNS
    raise RuntimeError("This version of sax2rdf.py does not support parseType=Literal.")
RuntimeError: This version of sax2rdf.py does not support parseType=Literal.

At any rate, my main goal here is to get cwm to output a
parseType="Literal" section; please consider that the main bug raised
by this email. Consumption, for my current needs, is of secondary
importance.

An example of what I've tried for production:

$ echo '@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
        :p :xml "<p>...</p>"^^rdf:XMLLiteral .' | cwm --n3 --rdf
<rdf:RDF xmlns="file:/...#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
 <rdf:Description rdf:about="#p">
  <xml rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"
       >&#60;p&#62;...&#60;/p&#62;</xml>
 </rdf:Description>
</rdf:RDF>

I'm not sure whether this escaped rdf:datatype form is really equivalent
to a parseType="Literal" section. I wouldn't have thought so, but when I
bung the test.rdf document at the top of this bug report into the RDF
Validator, the literal that it gives is:

"&lt;p&gt;...&lt;/p&gt;"^^http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral

Is the RDF Validator wrong about this too? Shouldn't it be
"<p>...</p>"^^rdf:XMLLiteral? Look what happens when I feed the
Validator's version of the literal back through cwm:

$ echo ':p :q "&lt;p&gt;...&lt;/p&gt;"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> .' | cwm --n3 --rdf
<rdf:RDF xmlns="...#"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="#p">
        <q rdf:datatype="http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral">&#38;lt;p&#38;gt;...&#38;lt;/p&#38;gt;</q>
    </rdf:Description>
</rdf:RDF>

Either the validator is escaping when it shouldn't do, or cwm isn't
unescaping when it should do, as far as I understand the situation.
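
To make the two cases concrete, here is a small check of what character
content an XML parser recovers from each of the two cwm outputs above. It
uses Python 3's xml.etree.ElementTree rather than cwm's own parser, so it
is only suggestive; character_content is a hypothetical helper of mine,
and the <q> wrapper stands in for the property element.

```python
import xml.etree.ElementTree as ET


def character_content(escaped):
    """Wrap escaped element content in a <q> element, parse it, and
    return the text an XML parser would hand to the RDF layer."""
    return ET.fromstring("<q>" + escaped + "</q>").text


# Element content from the first cwm output (N3 literal "<p>...</p>").
first = character_content("&#60;p&#62;...&#60;/p&#62;")

# Element content from the second cwm output, where the Validator's
# displayed string was fed back in as the N3 literal.
second = character_content("&#38;lt;p&#38;gt;...&#38;lt;/p&#38;gt;")

if __name__ == "__main__":
    print(first)
    print(second)
```

On this evidence the first output does carry <p>...</p> as its lexical
form, while the second carries the already-escaped &lt;p&gt;...&lt;/p&gt;
text. That fits cwm treating the N3 string as the literal's lexical form
as given, and leaves open whether the Validator's display is merely
HTML-escaped for presentation.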

-- 
Sean B. Palmer, http://inamidst.com/sbp/

Received on Saturday, 13 October 2007 13:40:38 UTC