- From: Chris Lilley <chris@w3.org>
- Date: Sat, 23 Apr 2005 00:17:51 +0200
- To: Richard Tobin <richard@inf.ed.ac.uk>
- Cc: www-tag@w3.org
On Wednesday, April 20, 2005, 6:50:49 PM, Richard wrote:

RT> The XML Core WG requests advice about the use of base URIs in the
RT> XML Infoset, following the publication of RFCs 3986 (URIs) and
RT> 3987 (IRIs).

RT> System identifiers in XML have always allowed characters that need to
RT> be escaped before the identifier is used as a URI. In fact, XML
RT> allows what are now called IRIs, with the addition that XML requires
RT> implementations to support the (optional in RFC 3987) escaping of
RT> SPACE etc.

RT> The same applies to xml:base attributes and "other XML strings meant
RT> to be used as URI references" - though it is not entirely clear what
RT> the latter means.

The cumbersome phrase can usefully be replaced by "IRI".

RT> As far as using these identifiers to retrieve documents goes, there is
RT> no problem. The escaping and absolutization rules produce the same
RT> results with the new RFCs as with the old one. But the XML Infoset
RT> exposes the base URI itself, and there are two issues with that:

RT> (1) Does %-escaping happen before or after the base URI is calculated?
RT> The XML spec currently says that escaping should happen "as late
RT> as possible" (because it is not reversible); that seems to imply
RT> that the base URI should not have escaping done, in which case
RT> strictly speaking the base URI is not a URI.

Right. The base is an IRI, and it can be combined with a relative IRI to
produce an absolute IRI, which may then need to be escaped (for example,
for dereferencing over a transport that does not directly support IRIs).

RT> (2) RFC 3986 changes the algorithm: the base URI now has any fragment
RT> component stripped, whereas it didn't before. Should we amend our
RT> specs to require the new behaviour?

Yes.
In practice I suspect that
a) there are few base URIs with a fragment in the wild
b) implementations probably differ when faced with
   http://example.org/toto/blah.xml#foo and "../bar"

You may want to have a look at the very brief, but useful,
http://www.w3.org/TR/2004/CR-charmod-resid-20041122/
if you have not already.

--
 Chris Lilley mailto:chris@w3.org
 Chair, W3C SVG Working Group
 W3C Graphics Activity Lead
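
The two answers in this message (resolve as IRIs and escape only at the end; strip any fragment from the base before resolving) can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not text from any of the specs: the function names `resolve_iri` and `escape_for_retrieval` are hypothetical, the escaping is a simplification that percent-encodes non-ASCII characters as UTF-8 and leaves reserved characters and existing "%" escapes alone, and host names are not converted (no IDNA handling).

```python
from urllib.parse import quote, urldefrag, urljoin

# Characters left as-is when escaping: the RFC 3986 reserved characters plus
# "%", so that escapes already present are not encoded a second time.
# (Assumption: a simplification of the IRI-to-URI mapping in RFC 3987.)
_SAFE = ":/?#[]@!$&'()*+,;=%"


def resolve_iri(base_iri: str, relative_iri: str) -> str:
    """Resolve a relative IRI against a base IRI without escaping anything.

    The fragment is dropped from the base first, matching the RFC 3986
    behaviour discussed in point (2).
    """
    base_without_fragment = urldefrag(base_iri).url
    return urljoin(base_without_fragment, relative_iri)


def escape_for_retrieval(iri: str) -> str:
    """Percent-encode an absolute IRI as late as possible, just before use."""
    return quote(iri, safe=_SAFE)


if __name__ == "__main__":
    # The fragment on the base plays no part in resolution.
    print(resolve_iri("http://example.org/toto/blah.xml#foo", "../bar"))

    # Resolution happens on the unescaped IRI; escaping is the final step.
    absolute = resolve_iri(
        "http://example.org/r\u00e9pertoire/doc.xml", "r\u00e9sum\u00e9 1.xml"
    )
    print(escape_for_retrieval(absolute))
```

In this sketch the value kept as the base stays an IRI throughout; only the string handed to the retrieval layer is escaped, so the irreversible step is never applied to the stored base.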
Received on Friday, 22 April 2005 22:17:56 UTC