- From: Jeremy Carroll <jjc@hplb.hpl.hp.com>
- Date: Fri, 19 Oct 2001 17:35:10 +0100
- To: <w3c-rdfcore-wg@w3.org>, <w3c-i18n-ig@w3.org>
Mark Davis:
> If the character % were itself escaped, then escaping *would be* fully reversible.

Hmmm, not if you don't know the charset of the original character sequence. I seem to remember an example of a non-UTF-8 URL in charmod.

===

My take on the erratum at http://www.w3.org/XML/xml-V10-2e-errata#E26 is that RDF needs to specify that, for RDF/XML documents, the RDF processor should escape the URI as soon as it can (i.e. just after it gets it from the XML processor, or straight after turning a relative URI into an absolute one, whichever happens later). In other words, RDF's needs are diametrically opposed to the XML solution.

The reason for this is that URI equality is important in RDF. The realistic algorithm for URI equality is binary comparison, and this only works once a normalized form for URIs has been fixed. Because of the one-way nature of URI escaping (see above), it is necessary to normalize to the fully encoded form (with uppercase hexadecimal escapes) rather than the fully unencoded form.

I think that the internal representation of international URIs in RDF should be US-ASCII RFC 2396 URIs.

For RDF/XML output, and other human display, we could suggest that applications make best efforts to reverse the escaping, with the exception of the % character and any escape sequences that do not decode to well-formed UTF-8.

Jeremy
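To make the normalization argument concrete, here is a minimal Python sketch of "escape early, compare bytes". It is an illustration under stated assumptions, not text from any specification: the function name normalize_uri and the _SAFE character set are hypothetical, non-ASCII characters are assumed to be encoded as UTF-8 before escaping, and a "%" already present is assumed to introduce an existing escape and is left alone.

```python
import re

# Characters passed through unchanged (an assumption for this sketch):
# RFC 2396 unreserved and reserved characters, plus '%' (taken to start
# an existing escape) and '#' (fragment delimiter).
_SAFE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789"
    "-_.!~*'()"      # RFC 2396 "mark" characters
    ";/?:@&=+$,"     # RFC 2396 reserved characters
    "%#"
)

_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_uri(uri: str) -> str:
    """Return a fully escaped, US-ASCII form with uppercase hex escapes."""
    # Uppercase the hex digits of escapes that are already present.
    uri = _ESCAPE.sub(lambda m: "%" + m.group(1).upper(), uri)
    out = []
    for ch in uri:
        if ch in _SAFE:
            out.append(ch)
        else:
            # Escape every other character as its UTF-8 bytes.
            out.extend("%%%02X" % b for b in ch.encode("utf-8"))
    return "".join(out)


# Two spellings of the same URI become byte-for-byte identical.
a = normalize_uri("http://example.org/caf\u00e9")
b = normalize_uri("http://example.org/caf%c3%a9")
print(a == b)
```

Going the other way is not safe in general: an escape such as %E9 may have come from a Latin-1 "é", and those bytes are not well-formed UTF-8, so the original character cannot be recovered without knowing the charset. That is why the sketch normalizes toward the escaped form.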
Received on Friday, 19 October 2001 12:35:45 UTC