Re: Human Readable Resource Identifiers - Unwise characters

/ Alessandro Vernet <avernet@orbeon.com> was heard to say:
| Hi Norm,

Hi Alessandro,

| We implement your HRRI spec [1] in Orbeon Forms, 

The XML Core WG has agreed to work with the I18N folks to get a normative
definition of Legacy Extended IRI (LEIRI) into the next IRI draft, see
http://www.ietf.org/internet-drafts/draft-duerst-iri-bis-02.txt

That's the place to look now for normative text about what characters
are allowed.

| and I am running into
| a case where I have a '[' character which is not escaped (per the
| HRRI) but that Apache HttpClient does not like. The conflict seems to
| be that HRRI defined unwise characters as:
|
| "{" #x7B, "}" #x7D, "|" #x7C, "\" #x5C, "^" #x5E, and "`" #x60
|
| And Apache HttpClient uses the following list for unwise characters:
|
| "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
|
| The above list could come from RFC 2396 [2] (I am not sure whether it
| is specified elsewhere too), which seems to be an older
| specification. I am now wondering if:
|
| 1) This is a bug in HttpClient (which should accept '[' in a URI) or
| 2) We should add '[' to the list of characters we escape per HRRI
| before passing down the URI to HttpClient.
|
| Do you have any guidance for us here?

Well. First, note that we're going from "URI-ish strings" to "IRI",
not URI. To get to a URI, you may need to do more work. That should
all be clear(er) now that the HRRIs have been replaced by LEIRIs in
the IRI spec.
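
As a rough illustration of the "more work" needed to get from an IRI to
a URI, here is a minimal Python sketch. It assumes the usual mapping of
percent-encoding each non-ASCII character as its UTF-8 bytes. The
function name is hypothetical, and the sketch deliberately ignores
internationalized host names and the earlier LEIRI-to-IRI step, so
treat it as an outline rather than a conforming implementation.

```python
def iri_to_uri_sketch(iri: str) -> str:
    """Percent-encode every non-ASCII character as its UTF-8 bytes.

    Hypothetical helper for illustration only: ASCII characters,
    including any existing %XX escapes, are passed through untouched.
    """
    out = []
    for ch in iri:
        if ord(ch) < 0x80:
            # ASCII is copied as-is; this sketch does no other escaping.
            out.append(ch)
        else:
            # One %XX triplet per UTF-8 byte of the character.
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)


print(iri_to_uri_sketch("http://example.org/caf\u00e9"))
# prints http://example.org/caf%C3%A9
```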

I don't think you can automatically escape all "[" and "]" because
they turn up in IPv6 addresses. I think.
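
A minimal Python sketch of that caution, under assumptions: it splits
off the scheme and authority with a simplified form of the RFC 3986
Appendix B pattern, leaves "[" and "]" alone in the authority (where an
IPv6 literal would sit), and percent-encodes them, together with the
other unwise characters from the HttpClient list, everywhere else. The
function name and the choice to escape brackets outside the authority
are assumptions for illustration, not something HRRI or HttpClient
prescribes.

```python
import re

# Unwise characters from the HRRI list quoted above.
UNWISE = '{}|\\^`'
# The two extra characters in the HttpClient / RFC 2396 list.
BRACKETS = '[]'

# Simplified split: optional "scheme:", optional "//authority", the rest.
_SPLIT_RE = re.compile(
    r"^(?P<scheme>[^:/?#]+:)?(?P<authority>//[^/?#]*)?(?P<rest>.*)$",
    re.DOTALL,
)


def _escape(text: str, chars: str) -> str:
    """Percent-encode each character of `text` that appears in `chars`."""
    return "".join(
        "%{:02X}".format(ord(c)) if c in chars else c for c in text
    )


def escape_unwise(uri: str) -> str:
    """Hypothetical helper: escape unwise characters, sparing host brackets."""
    m = _SPLIT_RE.match(uri)
    scheme = m.group("scheme") or ""
    authority = m.group("authority") or ""
    rest = m.group("rest") or ""
    # Brackets stay intact in the authority so "[::1]" keeps working.
    return (
        scheme
        + _escape(authority, UNWISE)
        + _escape(rest, UNWISE + BRACKETS)
    )


print(escape_unwise("http://[::1]:8080/a[1]?q={x}"))
# prints http://[::1]:8080/a%5B1%5D?q=%7Bx%7D
```

Whether the brackets outside the authority should be escaped at all is
the open question in this thread; the sketch only shows that it can be
done without damaging an IPv6 host.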

I hope that helps.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | We cannot put off living until we are
http://nwalsh.com/            | ready. The most salient characteristic
                              | of life is its coerciveness: it is
                              | always urgent, 'here and now' without
                              | any possible postponement. Life is
                              | fired at us point blank.--José Ortega Y
                              | Gasset

Received on Friday, 2 May 2008 13:17:36 UTC