anyURI values: escaped or not?

In the description of anyURI, the spec first says:

>The mapping from anyURI values to URIs is as 
>defined by the URI reference escaping procedure 
>defined in Section 5.4 Locator Attribute of [XML 
>Linking Language]

This implies that the anyURI value corresponding to a given URI must
have its URI-illegal characters *unescaped*, since the mapping referred
to must surely escape any "real" percent signs.  On that reading, a
value that already contains "%20" would have its percent sign escaped
a second time.  If some or all of the URI-illegal characters are
already escaped, how is an implementation to know this?
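
To make the ambiguity concrete, here is a rough Python sketch.  It is
not the XLink procedure itself: the two function names, the SAFE
character set, and the example.org URIs are all assumptions made up
for illustration.  It compares a mapping that escapes percent signs
(the reading assumed above) with one that leaves them alone.

```python
from urllib.parse import quote

# Assumed, simplified set of characters the mapping leaves untouched
# (in addition to letters, digits and "_.-~", which quote() never escapes).
SAFE = ":/?#[]@!$&'()*+,;="


def escape_percent_too(value: str) -> str:
    """Hypothetical mapping that also escapes literal percent signs."""
    return quote(value, safe=SAFE)


def escape_leave_percent(value: str) -> str:
    """Hypothetical mapping that passes percent signs through unchanged."""
    return quote(value, safe=SAFE + "%")


raw = "http://example.org/my file.txt"     # space left unescaped
pre = "http://example.org/my%20file.txt"   # space already escaped

for name, mapping in [("escape %", escape_percent_too),
                      ("keep %", escape_leave_percent)]:
    print(name, "| raw ->", mapping(raw))
    print(name, "| pre ->", mapping(pre))
```

Under the first mapping the pre-escaped value comes out with "%2520",
so the two inputs name different resources; under the second they
collapse to the same URI, but a literal percent sign can no longer be
told apart from the start of an escape.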

On the other hand, shortly thereafter the spec says:

>Note:  Spaces are, in principle, allowed in the 
>·lexical space· of anyURI, however, their use is 
>highly discouraged (unless they are encoded by 
>%20).

This implies that spaces are best escaped in the lexical representation.
Since the WG has asserted that the lexical mapping is the identity, the
note seems to be saying that, in the value too, at least spaces *should*
be pre-escaped.

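These two passages pull in opposite directions.  The spec needs to state
explicitly whether anyURI values are expected to be escaped, unescaped,
or either, and, if either, how an implementation is to tell which.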
-- 
Dave Peterson
SGMLWorks!

davep@iit.edu

Received on Thursday, 12 May 2005 15:57:38 UTC