- From: Joseph Reagle <reagle@w3.org>
- Date: Mon, 30 Jun 2003 16:14:12 -0400
- To: Jeremy Carroll <jjc@hplb.hpl.hp.com>, Martin Duerst <duerst@w3.org>
- Cc: Graham Klyne <gk@ninebynine.org>, Dan Connolly <connolly@w3.org>, w3c-i18n-ig@w3.org, "Ralph R. Swick" <swick@w3.org>, misha.wolf@reuters.com, Tim Berners-Lee <timbl@w3.org>, w3c-rdfcore-wg@w3.org
On Monday 30 June 2003 15:46, Jeremy Carroll wrote:
> We lose xml:lang by using exc-c14n out of the box ... viz:
> [[
> attributes in the XML namespace, such as xml:lang and xml:space are not
> imported into orphan nodes of the document subset
> ]]

I'll note that while the issue of using exc-c14n arose out of my Last Call comments, I haven't insisted on the use of exc-c14n. I merely noted that the specs specified c14n in some places and exc-c14n in others, asked that they be consistent, and recommended exc-c14n [1,2]. exc-c14n seems to be the preferred choice nowadays, since it permits subsets of XML to be moved easily between contexts and it can be implemented a bit more easily. The ease comes from the fact that the ancestor nodes of the subset don't have to be crawled (or kept while parsing) to get ancestor xml attributes (e.g., xml:lang). (Ancestor namespace declarations are always in a descendant's axis, but xml attributes must be crawled.) However, I also resisted calls for c14n to be completely deprecated. If people care about context, it's a good choice for them.

> Because of this, in the LC docs we had a complicated and confusing
> work-around that involved putting the xml-literal inside an <rdf-wrapper>
> tag, whose sole purpose was to hold the xml:lang attribute. It is
> certainly less confusing to have ditched all of that.

Right, this is suggested in:

| http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/#sec-Limitations
| The XML being canonicalized may depend on the effect of XML namespace
| attributes, such as xml:lang, xml:space, and xml:base appearing in
| ancestor nodes. To avoid problems due to the non-importation of such
| attributes into an enveloped document subset, either they must be
| explicitly given in the apex nodes of the XML document subset being
| canonicalized or they must always be declared with an equivalent value in
| every context in which the XML document subset will be interpreted.
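As an illustration of the "explicitly given in the apex nodes" option quoted above, here is a minimal Python sketch of the restatement step only. It is not a canonicalizer: it uses the standard library's `xml.etree.ElementTree`, which does not implement exc-c14n, and the function name `restate_xml_lang` and the sample document are hypothetical. It shows the ancestor crawl that exc-c14n itself avoids, done once by the application before the subset is detached.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the fixed XML namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_LANG = f"{{{XML_NS}}}lang"


def restate_xml_lang(root, apex):
    """Copy the nearest ancestor xml:lang onto `apex` if it has none.

    Hypothetical helper: modifies `apex` in place and returns it.
    """
    if XML_LANG in apex.attrib:
        return apex  # already stated on the apex node, nothing to do

    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}

    node = parent_of.get(apex)
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            apex.set(XML_LANG, lang)  # restate the inherited value
            break
        node = parent_of.get(node)
    return apex


# Sample document (an assumption for illustration, not from the RDF specs).
doc = ET.fromstring('<rdf xml:lang="en"><prop><b>hello</b></prop></rdf>')
apex = doc.find("./prop/b")
restate_xml_lang(doc, apex)
fragment = ET.tostring(apex, encoding="unicode")
```

After the call, the apex element carries its own xml:lang, so the language survives when the fragment is later canonicalized with exc-c14n and moved to another context. The same crawl would apply to xml:space and xml:base if an application depends on them.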
XML is a sufficiently complex beast that I've felt there will never be a perfect "canonical" representation. I'd like it if XML were simple enough that there could be, but until then, the serialization is really an application's decision, and our goal was to provide a sufficiently expressive and small set of serializations that met the majority of applications' requirements. In this case you can use:

1. c14n
2. exc-c14n and do an "apex node" restatement
3. exc-c14n and forget the xml attributes

[1] http://lists.w3.org/Archives/Public/www-rdf-comments/2003JanMar/0171.html
[2] http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2003JanMar/0030.html
Received on Monday, 30 June 2003 16:15:08 UTC