
Re: [Fwd: ISSUE-63: White-Space Canonicalization of XML Literals]

From: Ivan Herman <ivan@w3.org>
Date: Wed, 14 Nov 2007 02:04:09 +0100
Message-ID: <473A4989.8010306@w3.org>
To: Ben Adida <ben@adida.net>
Cc: public-rdf-in-xhtml-tf@w3.org

Ben Adida wrote:
> Earlier, Ivan said:
>> A full canonicalization in Python is also not that easy. Getting the
>> repeated white space characters out and stripping the first and last
>> whitespace is a breeze. The rest becomes a real headache unless the
>> underlying XML library does it (eg, ordering the attributes). I wonder
>> whether we should really require that. What do we gain?
> A good question, I wonder what we gain...
> Manu said:
>> This means that we can (and should, IMHO) preserve all of the formatting
>> in the original document for XML Literals.
>> Sorry for the previous post stating that we didn't have a choice, I had
>> not considered getting at the original document using XMLHTTPRequest.
> I think this breaks down if the page is the result of a POST. And you
> don't want to resubmit a POST, of course.
> Having read the full thread, let me first write down what we agree on:
> plain literals should be canonicalized according to XPath
> normalize-space(), which is Mark's proposal.
> Now, what to do with XMLLiterals. Here's my proposal, which is going to
> sound a lot like punting:
> "Where possible, an RDFa parser should preserve the exact white space
> and characters of the XML Literal. However, it is also acceptable for an
> RDFa parser to apply browser-based canonicalization."

I think this is a viable approach. To move on: let us put that into the
last call document, and we can always flag it as an explicit issue on
which we seek further feedback from the community.
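
For the plain-literal half of the agreement, which is the easy part, a
minimal Python sketch of the XPath normalize-space() behaviour might look
like the following. The function name normalize_space and the regular
expression are illustrative assumptions, not text from any RDFa draft.
Only the four XML white space characters (space, tab, CR, LF) are
treated as white space, and XML literals are deliberately left untouched.

```python
import re

# XML 1.0 white space is only space, tab, carriage return and line feed.
# (Python's str.split() would also collapse characters such as U+00A0,
# which XPath normalize-space() does not treat as white space.)
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(text: str) -> str:
    """Collapse runs of XML white space to one space and trim both ends.

    Hypothetical helper for plain literals only.
    """
    collapsed = _XML_WS.sub(" ", text)
    # After collapsing, any leading or trailing run is a single space.
    return collapsed.strip(" ")


if __name__ == "__main__":
    print(repr(normalize_space("  Ivan \n\t Herman  ")))
```

Attribute ordering and the rest of a full XML canonicalization are not
covered by this sketch; they would still have to come from the
underlying XML library.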

> The assumption is that we're dealing with the host language here,
> XHTML1.1, and if an XML Literal is canonicalized in a way that preserves
> how it renders in XHTML, then who cares? I understand this may limit the
> round-trippiness of RDFa->RDF->RDFa, but that may simply be a limitation
> of what browsers and the DOM do in XHTML1.1.
> I suppose this makes writing test cases problematic... I suspect we
> should write the tests to preserve white space and characters, and judge
> each browser canonicalization individually.

You mean implementation, right?



> Thoughts?
> -Ben


Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Wednesday, 14 November 2007 01:04:16 UTC
