Escaping of URI References

Hi,

The current XQuery 1.0 and XPath 2.0 Functions and Operators Working
Draft states that applying the string() function to an instance of the
anyURI data type results in a string without any URI encoding applied
to the anyURI, for compatibility reasons. It further notes that URI
escaping should be under user control. However, no function is defined
for URI escaping. While I do not care too much about escaping space
characters, it is a major shortcoming if one is not able to convert an
anyURI to a URI. Suppose you have an XML Schema based XML document like

  ...
  <link href='http://www.hoehrmann.de/~björn/' />
  ...

The href attribute in this example is an anyURI. anyURIs allow IRI
References, hence the anyURI is valid. If I now want to transform this
document to XHTML, which uses URIs instead of anyURIs, I can do
something like

  <xsl:template match='link'>
    <xhtml:a href='{@href}'><xsl:value-of select='@href'/></xhtml:a>
  </xsl:template>

but I will get

  <a href='http://www.hoehrmann.de/~björn/'
    >http://www.hoehrmann.de/~björn/</a>

This is invalid XHTML since the 'ö' is disallowed in URI References. The
desired output would be

  <a href='http://www.hoehrmann.de/~bj%C3%B6rn/'
    >http://www.hoehrmann.de/~björn/</a>

But it is not possible to generate this fragment using the function set
provided by the draft. A possible solution is to add a new function,
'xf:anyURI-escape' or 'xf:anyURI-to-URI', that converts an anyURI to a
URI as defined in section 3.2.17 of XML Schema Part 2. However, I do not
like this solution: always using this function makes style sheets, XPath
queries, XQueries, etc. rather hard to read. Consider

  <xsl:template match='link'>
    <xhtml:a href='{xf:anyURI-to-URI(@href)}'
     ><xsl:value-of select='xf:anyURI-to-URI(@href)'/></xhtml:a>
  </xsl:template>

which is really ugly compared to the above. XSLT 2.0 could add some
convenience mechanism to perform this conversion implicitly. This could
be as easy as adding a new attribute to the xsl:output element, if it is
assumed that documents will allow either URIs or anyURIs. Either way,
such a function is necessary.
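
To make the conversion concrete, here is a minimal sketch in Python of
what such a function might do. It assumes I read the escaping rule
referenced from section 3.2.17 of XML Schema Part 2 correctly: every
non-ASCII character, and every character excluded by RFC 2396 other
than '#', '%', '[' and ']', is converted to UTF-8 and each resulting
byte is written as %HH. The name anyuri_to_uri is made up for this
illustration and is not part of any draft.

```python
# Sketch only: anyuri_to_uri is a hypothetical name, not a function
# defined by the Functions and Operators draft or by XSLT.

# ASCII characters excluded by RFC 2396 section 2.4.3, minus '#', '%',
# '[' and ']' (assumption: this matches the rule XML Schema refers to).
EXCLUDED_ASCII = set(' <>"{}|\\^`')


def anyuri_to_uri(value: str) -> str:
    """Escape the characters of an anyURI that a URI does not allow."""
    out = []
    for ch in value:
        code = ord(ch)
        # Control characters, DEL, non-ASCII and the excluded ASCII set
        # are replaced by the %HH form of their UTF-8 bytes.
        if code < 0x20 or code > 0x7E or ch in EXCLUDED_ASCII:
            out.extend('%{:02X}'.format(b) for b in ch.encode('utf-8'))
        else:
            out.append(ch)
    return ''.join(out)


print(anyuri_to_uri('http://www.hoehrmann.de/~björn/'))
```

An existing '%' is left alone, so a value that is already escaped is not
escaped a second time.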

regards.

Received on Thursday, 11 July 2002 15:30:27 UTC