Re: Why hexify fragments?

* Chris Lilley wrote:
>Yes, you are right for the case where the IRI is converted to a URI and
>stored in the XML. I was thinking of the case where the IRI is stored
>directly in the XML and only hexified to cross the wire. But then I
>suppose it's not "A new URI format" in that case .... or is it?

That's indeed not a new resource identifier syntax, but I think such
protocol interactions are really orthogonal to the requirement. The
requirement applies to new URI syntax: encoded character strings must
be represented in a way compatible with URI syntax, which means using
%xx escapes whenever the conversion algorithm yields octets that are
not representable using characters allowed in URIs. Remember that
the components in URIs and IRIs represent octets, not characters,
so

  data:text/plain;charset=utf-7,Bj+APY-rn
  data:text/plain;charset=utf-8,Bj%C3%B6rn
  data:text/plain;charset=utf-8,Björn

are legal IRIs that resolve to the same resource, but

  data:text/plain;charset=utf-7,Björn
  data:text/plain;charset=utf-8,Björn

while legal IRIs, do not. The same is true for fragment identifiers:
you could create a media type whose fragment identifiers do not
use UTF-8 / %xx-encoding. For example, for application/x-foo-xml and

  <!DOCTYPE foo [<!ATTLIST foo id ID #IMPLIED>]>
  <foo id = "Björn" href = "#Bj+APY-rn" />

you can require that the IRI reference in href refers to the <foo>
element identified by the ID in id, because the fragment identifier
syntax for application/x-foo-xml is based on UTF-7 rather than UTF-8.
So the requirement is relevant even if no %xx escaping is involved.
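
To make the octets-versus-characters point concrete, here is a minimal
Python sketch of the examples above. It assumes the usual IRI-to-URI
mapping (unescaped non-ASCII characters stand for their UTF-8 octets,
%xx stands for the octet itself) and parses the data: IRIs in a very
simplified way. The names component_octets, data_text and
fragment_to_id are made up for illustration, and the UTF-7 fragment
rule is the hypothetical application/x-foo-xml one, not anything a
real media type defines.

```python
from urllib.parse import unquote_to_bytes


def component_octets(component: str) -> bytes:
    """Map an IRI component to the octets it represents.

    %xx escapes become the octet itself; unescaped non-ASCII
    characters become their UTF-8 octets.
    """
    return unquote_to_bytes(component)


def data_text(iri: str) -> str:
    """Decode the payload of a data:text/plain;charset=... IRI.

    Simplified parsing: everything after the first comma is the
    payload, and the charset is whatever follows "charset=".
    """
    header, _, payload = iri.partition(",")
    charset = header.rpartition("charset=")[2]
    # errors="replace": octets that are invalid in the charset do not
    # raise, they just yield different text.
    return component_octets(payload).decode(charset, errors="replace")


def fragment_to_id(fragment: str, charset: str) -> str:
    """Hypothetical media type rule: read the fragment octets in `charset`."""
    return component_octets(fragment).decode(charset, errors="replace")


same = [
    "data:text/plain;charset=utf-7,Bj+APY-rn",
    "data:text/plain;charset=utf-8,Bj%C3%B6rn",
    "data:text/plain;charset=utf-8,Björn",
]
# All three IRIs carry the same text once octets are decoded per charset.
print({data_text(iri) for iri in same})

# Same characters in the IRI, different charset, different resource.
print(data_text("data:text/plain;charset=utf-7,Björn")
      == data_text("data:text/plain;charset=utf-8,Björn"))

# The UTF-7 based fragment syntax finds the element with id "Björn".
print(fragment_to_id("Bj+APY-rn", "utf-7") == "Björn")
```

The only point of the sketch is that the comparison happens after the
octets have been interpreted according to the scheme or media type,
never on the characters as they appear in the IRI.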

>Yes, that's a good URI test. I will add it to the test suite.

Great!
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Wednesday, 23 March 2005 18:56:00 UTC