Re: [Fwd: ISSUE-63: White-Space Canonicalization of XML Literals]

Ben Adida wrote:
> 
> Earlier, Ivan said:
>> A full canonicalization in Python is also not that easy. Getting the
>> repeated white space characters out and stripping the first and last
>> whitespace is a breeze. The rest becomes a real headache unless the
>> underlying XML library does it (eg, ordering the attributes). I wonder
>> whether we should really require that. What do we gain?
> 
> A good question, I wonder what we gain...
> 
> Manu said:
>> This means that we can (and should, IMHO) preserve all of the formatting
>> in the original document for XML Literals.
>>
>> Sorry for the previous post stating that we didn't have a choice, I had
>> not considered getting at the original document using XMLHttpRequest.
> 
> I think this breaks down if the page is the result of a POST. And you
> don't want to resubmit a POST, of course.
> 
> Having read the full thread, let me first write down what we agree on:
> plain literals should be canonicalized according to XPath
> normalize-space(), which is Mark's proposal.
> 
> Now, what to do with XMLLiterals. Here's my proposal, which is going to
> sound a lot like punting:
> 
> "Where possible, an RDFa parser should preserve the exact white space
> and characters of the XML Literal. However, it is also acceptable for an
> RDFa parser to apply browser-based canonicalization."
>

I think this is a viable approach. To move on: let us put that into the
Last Call document, and we can always flag it as an explicit issue on
which we seek further feedback from the community.
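
For the part that is already agreed (plain literals), the XPath
normalize-space() behaviour can be sketched in a few lines of Python.
This is only an illustration, not proposed spec text: the function name
normalize_space is hypothetical, and the sketch assumes XML 1.0 white
space (space, tab, CR, LF) rather than Python's broader Unicode notion
of white space.

```python
import re

# Assumption: "white space" means the XML 1.0 set only
# (space, tab, carriage return, line feed), as XPath uses.
_XML_WS = re.compile(r"[ \t\r\n]+")


def normalize_space(text: str) -> str:
    """Approximate XPath normalize-space() for a plain literal.

    Collapses every run of XML white space into a single space,
    then strips the leading and trailing space.
    """
    collapsed = _XML_WS.sub(" ", text)
    return collapsed.strip(" ")
```

For instance, normalize_space("  Hello \n\t world  ") gives
"Hello world", while a non-breaking space (U+00A0) is left untouched
because it is not XML white space.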

> The assumption is that we're dealing with the host language here,
> XHTML1.1, and if an XML Literal is canonicalized in a way that preserves
> how it renders in XHTML, then who cares? I understand this may limit the
> round-trippiness of RDFa->RDF->RDFa, but that may simply be a limitation
> of what browsers and the DOM do in XHTML1.1.
> 
> I suppose this makes writing test cases problematic... I suspect we
> should write the tests to preserve white space and characters, and judge
> each browser canonicalization individually.
> 

You mean implementation, right?

+1

Ivan

> Thoughts?
> 
> -Ben
> 
> 
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Wednesday, 14 November 2007 01:04:16 UTC