- From: Ivan Herman <ivan@w3.org>
- Date: Tue, 10 Jun 2008 09:39:21 +0200
- To: "Seaborne, Andy" <andy.seaborne@hp.com>, Dave Beckett <dave@dajobe.org>
- CC: Manu Sporny <msporny@digitalbazaar.com>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
- Message-ID: <484E2FA9.9060808@w3.org>
Seaborne, Andy wrote:
> The input from PyRDFa is RDF/XML with a parseType literal. That gets XML Exclusive Canonicalization applied (required of RDF/XML parsers - parseType literal does not exist in N3/Turtle/N-Triples, only datatype XMLLiteral and parsers for those serializations do not canonicalize).
>
> http://www.w3.org/TR/rdf-syntax-grammar/#section-grammar-productions
> 7.2.17 - bullet point 2.
>
> http://www.w3.org/TR/rdf-syntax-grammar/#section-Syntax-XML-literals
>

Yes. But, I must admit, it surprises me that if I use explicit XMLLiteral datatype, the same would not apply. _I see_ in the RDF/XML parsing rules that this route is not mentioned, i.e., you are right in reading the spec, but I wonder whether this is not a bug. parseType="Literal" ought to be merely an abbreviation for the explicit datatype setting...

Dave, as editor of this document, what do you think? If there is a bug here, it would be worth recording it formally, so that it could be reopened if ever we touch RDF/XML again...

However: this should _not_ be the job of the RDFa group and should not influence RDFa...

Ivan

> This means two changes are made to what is sent over the wire:
>
> 1/ Unused namespaces are removed (e.g. on <strong ..>, there was a xmlns:svg)
> 2/ <svg:rect/> is replaced by <svg:rect></svg:rect>
>
> You can see what ends up in the store by asking:
>
> SELECT ?o { ?s ?p ?o }
>
> which is on that PyRDFa's output:
>
> http://www.sparql.org/sparql?query=SELECT+%3Fo+%7B+%3Fs+%3Fp+%3Fo%7D&default-graph-uri=http%3A%2F%2Fwww.w3.org%2F2007%2F08%2FpyRdfa%2Fextract%3Furi%3Dhttp%3A%2F%2Fwww.w3.org%2F2006%2F07%2FSWD%2FRDFa%2Ftestsuite%2Fxhtml1-testcases%2F0100.xhtml&stylesheet=%2Fxml-to-html.xsl
>
>
> I applied the same canonicalization to the query (canonicalization of the object in the query):
>
> ASK WHERE {
> <http://www.example.org> <http://example.org/rdf/example> "Some text here in <strong xmlns=\"http://www.w3.org/1999/xhtml\">bold</strong> and an svg rectangle: <svg:svg xmlns:svg=\"http://www.w3.org/2000/svg\"><svg:rect svg:height=\"100\" svg:width=\"200\"></svg:rect></svg:svg>"^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral>
> .
> }
>
> which is the query and datasource:
>
> http://www.sparql.org/sparql?query=ASK+WHERE+%7B%0D%0A%3Chttp%3A%2F%2Fwww.example.org%3E+%3Chttp%3A%2F%2Fexample.org%2Frdf%2Fexample%3E+%22Some+text+here+in+%3Cstrong+xmlns%3D%5C%22http%3A%2F%2Fwww.w3.org%2F1999%2Fxhtml%5C%22%3Ebold%3C%2Fstrong%3E+and+an+svg+rectangle%3A+%3Csvg%3Asvg+xmlns%3Asvg%3D%5C%22http%3A%2F%2Fwww.w3.org%2F2000%2Fsvg%5C%22%3E%3Csvg%3Arect+svg%3Aheight%3D%5C%22100%5C%22+svg%3Awidth%3D%5C%22200%5C%22%3E%3C%2Fsvg%3Arect%3E%3C%2Fsvg%3Asvg%3E%22%5E%5E%3Chttp%3A%2F%2Fwww.w3.org%2F1999%2F02%2F22-rdf-syntax-ns%23XMLLiteral%3E%0D%0A+.%0D%0A%7D%0D%0A&default-graph-uri=http%3A%2F%2Fwww.w3.org%2F2007%2F08%2FpyRdfa%2Fextract%3Furi%3Dhttp%3A%2F%2Fwww.w3.org%2F2006%2F07%2FSWD%2FRDFa%2Ftestsuite%2Fxhtml1-testcases%2F0100.xhtml&stylesheet=%2Fxml-to-html.xsl
>
> and I get "true"
>
>
> SPARQL does not require canonicalization. It is possible to get non-canonicalized XML literals into the data by using datatype XMLLiteral even in RDF/XML.
>
> Andy
>
>> -----Original Message-----
>> From: Manu Sporny [mailto:msporny@digitalbazaar.com]
>> Sent: 9 June 2008 15:32
>> To: Seaborne, Andy
>> Cc: RDFa mailing list; Dave Beckett
>> Subject: Re: Issue with Jena/sparql.org and XML Literals?
>>
>> Seaborne, Andy wrote:
>>> You have to be careful with line endings as well - \n vs \n\r etc.
>>> The SPARQL parser does not canonicalize XMLLiterals in the query.
>> There are no \n or \n\r in either the input RDF or the SPARQL, so that
>> shouldn't be an issue. Thanks for mentioning that, however.
>>
>>> 2008Jun/0027.html ==>
>>> [[
>>> If you look at librdfa's output for TC100:
>>>
>>> http://rdfa.digitalbazaar.com/librdfa/rdfa2rdf.py?uri=http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0100.xhtml
>>> and PyRDFa's output for TC100:
>>>
>>> http://www.w3.org/2007/08/pyRdfa/extract?uri=http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0100.xhtml
>>> ]]
>>>
>>> If I understand these correctly, these are different.
>> Yes, the XML Literals that are generated are different and the SPARQL
>> tests two "valid" XML Literals. The first test in the SPARQL will match
>> librdfa's output, the second should match PyRDFa's output. It is the
>> second SPARQL test (the last part of the UNION) that is failing for some
>> unknown reason.
>>
>>> Running CURL on the first I get data with multiple namespaces on each
>>> element, and I don't on the second.
>> On the second one (PyRDFa's output), you should get two namespaces, the
>> standard XHTML one and the standard SVG one. This is the expected
>> behavior and I believe the SPARQL is set up to test exactly that.
>>
>>> N-Triples files for each attached (rdfparse run on the CURL results of
>>> each link). You will see they both have XMLLiterals but are different
>>> sizes.
>>
>> Yup, that is expected. The SPARQL test has two variations that are
>> valid... the second variation should be passing, but it doesn't.
>>
>> -- manu
>>
>> --
>> Manu Sporny
>> President/CEO - Digital Bazaar, Inc.
>> blog: Dynamic Spectrum Auctions and Digital Marketplaces
>> http://blog.digitalbazaar.com/2008/04/24/dynamic-spectrum-auctions/

--
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Tuesday, 10 June 2008 07:39:49 UTC
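
The long sparql.org links quoted in the message are just a query string, a default graph URI, and a stylesheet path, each percent-encoded. A minimal Python sketch of how such a link can be assembled follows; the parameter names and values are taken from the URLs in the thread, while `ENDPOINT`, `GRAPH`, and `build_query_url` are hypothetical names introduced here. No request is sent, since the service may no longer be available.

```python
# Illustrative sketch: rebuild the sparql.org link for the SELECT query
# quoted in the thread. Nothing is fetched; only the URL is constructed.
from urllib.parse import urlencode

# Endpoint and graph URI as they appear in the thread's URLs.
ENDPOINT = "http://www.sparql.org/sparql"
GRAPH = (
    "http://www.w3.org/2007/08/pyRdfa/extract?uri="
    "http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0100.xhtml"
)


def build_query_url(query: str) -> str:
    """Hypothetical helper: percent-encode the query and its data source."""
    params = [
        ("query", query),
        ("default-graph-uri", GRAPH),
        ("stylesheet", "/xml-to-html.xsl"),
    ]
    # urlencode uses quote_plus by default: spaces become '+',
    # and reserved characters such as '/', '?', '{' are percent-encoded.
    return ENDPOINT + "?" + urlencode(params)


select_url = build_query_url("SELECT ?o { ?s ?p ?o}")
print(select_url)
```

The same helper could be given the ASK query text to rebuild the second link; the XML literal inside it must already be in the form the store holds, since the SPARQL parser does not canonicalize it.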