- From: Ivan Herman <ivan@w3.org>
- Date: Wed, 14 Nov 2007 02:04:09 +0100
- To: Ben Adida <ben@adida.net>
- Cc: public-rdf-in-xhtml-tf@w3.org
- Message-ID: <473A4989.8010306@w3.org>
Ben Adida wrote:
>
> Earlier, Ivan said:
>> A full canonicalization in Python is also not that easy. Getting the
>> repeated white space characters out and stripping the first and last
>> whitespace is a breeze. The rest becomes a real headache unless the
>> underlying XML library does it (eg, ordering the attributes). I wonder
>> whether we should really require that. What do we gain?
>
> A good question, I wonder what we gain...
>
> Manu said:
>> This means that we can (and should, IMHO) preserve all of the formatting
>> in the original document for XML Literals.
>>
>> Sorry for the previous post stating that we didn't have a choice, I had
>> not considered getting at the original document using XMLHTTPRequest.
>
> I think this breaks down if the page is the result of a POST. And you
> don't want to resubmit a POST, of course.
>
> Having read the full thread, let me first write down what we agree on:
> plain literals should be canonicalized according to XPath
> normalize-space(), which is Mark's proposal.
>
> Now, what to do with XMLLiterals. Here's my proposal, which is going to
> sound a lot like punting:
>
> "Where possible, an RDFa parser should preserve the exact white space
> and characters of the XML Literal. However, it is also acceptable for an
> RDFa parser to apply browser-based canonicalization."
>

I think this is a viable approach. To move on: let us put that into the
document in last call, and we can always flag this as an explicit issue
we seek further feedback from the community.

> The assumption is that we're dealing with the host language here,
> XHTML1.1, and if an XML Literal is canonicalized in a way that preserves
> how it renders in XHTML, then who cares? I understand this may limit the
> round-trippiness of RDFa->RDF->RDFa, but that may simply be a limitation
> of what browsers and the DOM does in XHTML1.1.
>
> I suppose this makes writing test cases problematic... I suspect we
> should write the tests to preserve white space and characters, and judge
> each browser canonicalization individually.
>

You mean implementation, right? +1

Ivan

> Thoughts?
>
> -Ben
>
>
>

--

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Wednesday, 14 November 2007 01:04:16 UTC
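
For reference, the plain-literal rule agreed on in the thread (canonicalize according to XPath normalize-space()) can be sketched in Python roughly as follows. This is only an illustration, not code from any RDFa implementation: the function name normalize_space and the constant _XML_WS are hypothetical, and the sketch assumes that "white space" means the four XML 1.0 whitespace characters (space, tab, carriage return, line feed) rather than everything Python's str.split() treats as whitespace.

```python
import re

# Assumption: only the XML 1.0 whitespace characters count as white space
# (space, tab, carriage return, line feed). Other Unicode spaces, such as
# the no-break space U+00A0, are deliberately left untouched.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(text: str) -> str:
    """Approximate XPath normalize-space() for a plain literal.

    Collapses every run of XML whitespace into a single space, then
    removes the leading and trailing space, if any.
    """
    collapsed = _XML_WS.sub(" ", text)
    return collapsed.strip(" ")


if __name__ == "__main__":
    print(repr(normalize_space("  Hello \n\t world  ")))
```

This covers only the "breeze" part mentioned in the thread. Canonicalizing an XML Literal (attribute ordering, namespace handling and so on) is a separate and harder problem that this sketch does not attempt.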