
RE: Issue with Jena/sparql.org and XML Literals?

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 9 Jun 2008 17:23:25 +0000
To: Manu Sporny <msporny@digitalbazaar.com>
CC: RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, Dave Beckett <dave@dajobe.org>
Message-ID: <B6CF1054FDC8B845BF93A6645D19BEA33C8C4B9E62@GVW1118EXC.americas.hpqcorp.net>
The input from PyRDFa is RDF/XML with parseType="Literal". That content gets Exclusive XML Canonicalization applied, which is required of RDF/XML parsers. parseType="Literal" does not exist in N3/Turtle/N-Triples, which have only
datatype XMLLiteral, and parsers for those serializations do not canonicalize.


7.2.17 - bullet point 2.


This means two changes are made to what is sent over the wire:

1/ Unused namespaces are removed (e.g. on <strong ..>, there was an xmlns:svg)
2/ <svg:rect/> is replaced by <svg:rect></svg:rect>
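
A minimal sketch of those two changes, for illustration only. It uses Python's standard-library xml.etree.ElementTree.canonicalize (Python 3.8+), which implements C14N 2.0 rather than the Exclusive XML Canonicalization that RDF/XML parsers apply, so treat it as an approximation that happens to show the same two effects on this input. The variable names and the exact "as sent" fragments are assumptions reconstructed from the description above.

```python
from xml.etree.ElementTree import canonicalize

# Assumed "as sent" fragments: an unused xmlns:svg on <strong>,
# and a self-closing <svg:rect/>.
as_sent_strong = (
    '<strong xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:svg="http://www.w3.org/2000/svg">bold</strong>'
)
as_sent_svg = (
    '<svg:svg xmlns:svg="http://www.w3.org/2000/svg">'
    '<svg:rect svg:height="100" svg:width="200"/></svg:svg>'
)

# canonicalize() returns the canonical form as a string.
canonical_strong = canonicalize(as_sent_strong)
canonical_svg = canonicalize(as_sent_svg)

print(canonical_strong)  # the unused xmlns:svg declaration is gone
print(canonical_svg)     # the empty element is a start/end tag pair
```

The canonical strings, not the original ones, are what a query has to match against data loaded from RDF/XML.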

You can see what ends up in the store by asking:

SELECT ?o { ?s ?p ?o }

which, run on PyRDFa's output, is:


I applied the same canonicalization to the query (canonicalization of the object in the query):

<http://www.example.org> <http://example.org/rdf/example> "Some text here in <strong xmlns=\"http://www.w3.org/1999/xhtml\">bold</strong> and an svg rectangle: <svg:svg xmlns:svg=\"http://www.w3.org/2000/svg\"><svg:rect svg:height=\"100\" svg:width=\"200\"></svg:rect></svg:svg>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>

which is the query and datasource:


and I get "true"
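
As a rough sketch of how such a query can be put together: the exact query text is not reproduced in this message, so the ASK form below is an assumption inferred from the "true" result, and the helper names sparql_typed_literal and ask_query are hypothetical. The only real work is escaping the canonical lexical form so it can sit inside a double-quoted SPARQL literal.

```python
XML_LITERAL = "http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral"


def sparql_typed_literal(lexical, datatype):
    """Hypothetical helper: quote a lexical form as a typed SPARQL literal."""
    escaped = (
        lexical.replace("\\", "\\\\")  # backslashes first
        .replace('"', '\\"')           # then double quotes
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"^^<{datatype}>'


def ask_query(subject, predicate, lexical):
    """Hypothetical helper: ASK whether the triple with this XML literal exists."""
    obj = sparql_typed_literal(lexical, XML_LITERAL)
    return f"ASK {{ <{subject}> <{predicate}> {obj} }}"


# The canonical lexical form, as given in the triple above.
canonical = (
    'Some text here in <strong xmlns="http://www.w3.org/1999/xhtml">bold</strong>'
    ' and an svg rectangle: <svg:svg xmlns:svg="http://www.w3.org/2000/svg">'
    '<svg:rect svg:height="100" svg:width="200"></svg:rect></svg:svg>'
)

query = ask_query(
    "http://www.example.org", "http://example.org/rdf/example", canonical
)
print(query)
```

Because the SPARQL parser does not canonicalize the literal, the string passed in has to be in canonical form already.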

SPARQL does not require canonicalization.  It is possible to get non-canonicalized XML literals into the data by using datatype XMLLiteral even in RDF/XML.


> -----Original Message-----
> From: Manu Sporny [mailto:msporny@digitalbazaar.com]
> Sent: 9 June 2008 15:32
> To: Seaborne, Andy
> Cc: RDFa mailing list; Dave Beckett
> Subject: Re: Issue with Jena/sparql.org and XML Literals?
> Seaborne, Andy wrote:
> > You have to be careful with line endings as well - \n vs \n\r etc.
> > The SPARQL parser does not canonicalize XMLLiterals in the query.
> There are no \n or \n\r in either the input RDF or the SPARQL, so that
> shouldn't be an issue. Thanks for mentioning that, however.
> > 2008Jun/0027.html ==>
> > [[
> > If you look at librdfa's output for TC100:
> >
> > http://rdfa.digitalbazaar.com/librdfa/rdfa2rdf.py?uri=http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0100.xhtml
> >
> > and PyRDFa's output for TC100:
> >
> > http://www.w3.org/2007/08/pyRdfa/extract?uri=http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0100.xhtml
> > ]]
> >
> > If I understand these correctly, these are different.
> Yes, the XML Literals that are generated are different and the SPARQL
> tests two "valid" XML Literals. The first test in the SPARQL will match
> librdfa's output, the second should match PyRDFa's output. It is the
> second SPARQL test (the last part of the UNION) that is failing for some
> unknown reason.
> > Running CURL on the first I get data with multiple namespaces on each
> > element, and I don't on the second.
> On the second one (PyRDFa's output), you should get two namespaces, the
> standard XHTML one and the standard SVG one. This is the expected
> behavior and I believe the SPARQL is set up to test exactly that.
> > N-Triples files for each attached (rdfparse run on the CURL results of
> > each link). You will see they both have XMLLiterals but are different
> > sizes.
> Yup, that is expected. The SPARQL test has two variations that are
> valid... the second variation should be passing, but it doesn't.
> -- manu
> --
> Manu Sporny
> President/CEO - Digital Bazaar, Inc.
> blog: Dynamic Spectrum Auctions and Digital Marketplaces
> http://blog.digitalbazaar.com/2008/04/24/dynamic-spectrum-auctions/

Received on Monday, 9 June 2008 17:24:30 UTC
