RE: Special characters in URIs

(I'm hoping that uri@bunyip.com will migrate to uri@w3.org,
although I've not gotten an acknowledgement. I suppose people
should look for news at http://www.ics.uci.edu/pub/ietf/uri )

> It "works" in the case that, for example, a user copies
> a filename from a desktop filebrowser into an XML document
> 	href="xyz__"
> where __ is some non-URL character.

This works for me if you say that what's in the XML document
attribute isn't really a "URI" but rather something else.
For example, we could use the "IURI" draft to define what
appears in XML, and note that in order to turn it into a URI,
it needs to be escaped. I don't have a problem with that.
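
To make the escaping step concrete, here is a minimal Python sketch. It
assumes the convention discussed in this thread (encode as UTF-8, then
%XX-escape every byte that is not a legal URI character). The function
name to_uri_reference and the exact set of "safe" characters are my own
assumptions for illustration, not something taken from a draft.

```python
from urllib.parse import quote

# Characters left alone: URI delimiters plus "%" so that text which is
# already escaped is not escaped a second time. (Assumed set, for illustration.)
SAFE = "/:?#[]@!$&'()*+,;=%"

def to_uri_reference(attribute_value: str) -> str:
    """Turn a not-quite-URI attribute value into a URI reference.

    Non-URI characters are encoded as UTF-8 and each byte is written as %XX.
    """
    return quote(attribute_value, safe=SAFE, encoding="utf-8")

if __name__ == "__main__":
    print(to_uri_reference("xyz\u00e9"))  # the e-acute becomes two escaped bytes
```

Leaving "%" alone is a design choice that makes the function idempotent
on already-escaped input, at the cost of not escaping a literal percent
sign.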

> Meanwhile, the HTTP server, when it exports the xyz__ file,
> uses the same convention: UTF-8 encoding, %XX escaped.
> 
> That doesn't mean the HTTP server should grab xyz%XX%XX off
> the tcp socket and unescape it; it means the HTTP server
> should (do something equivalent to) enumerate each file
> in the directory and escape it, and compare the resulting URI path
> to xyz%XX%XX.

Right.
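
A rough sketch of the comparison Dan describes, in Python: the server
escapes each stored name and compares the result with the path segment
it received, instead of unescaping the request. The helper names
(escape_name, find_file) are hypothetical, and normalizing the hex
digits to upper case is my own assumption about how a server might
tolerate %c3 versus %C3.

```python
import re
from urllib.parse import quote

def escape_name(name: str) -> str:
    """Escape one filename: UTF-8 encode, then %XX-escape non-URI bytes."""
    return quote(name, safe="", encoding="utf-8")

def normalize_escapes(segment: str) -> str:
    """Upper-case the hex digits of every %XX escape (assumed leniency)."""
    return re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), segment)

def find_file(names, requested_segment):
    """Return the stored name whose escaped form equals the request, or None."""
    wanted = normalize_escapes(requested_segment)
    for name in names:
        if escape_name(name) == wanted:
            return name
    return None

if __name__ == "__main__":
    directory = ["readme.txt", "xyz\u00e9"]
    print(find_file(directory, "xyz%C3%A9"))
```

A real server would presumably cache the escaped names rather than
re-escape the whole directory on every request; the loop is only meant
to show the direction of the comparison.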

> It's a bit of a kludge; the cleaner thing to do would
> be to say "don't put things other than URIs in those
> XML attribute values." But we haven't had any luck doing that.
> And this "kludge" just so happens to be consistent with
> the existing specs (though subtly) and consistent with
> a fair amount of actual practice (or at least so I
> gather from Martin; I haven't seen the evidence 1st hand).

This works for me too; I'd just like to get this into the
specs.
 
> And it provides a global convention for interoperability
> between HTTP servers exporting filesystems that use
> iso-latin-1 to encode filenames and those that
> export filesystems that use shift-jis or UCS-2.

I'm not sure how that works (the shift-jis part), and I wonder
if this deserves a fuller explanation.
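
One possible reading, offered only as a guess at what is meant: the
server first decodes the filename bytes from whatever encoding its
filesystem uses into characters, and only then applies the UTF-8 plus
%XX convention. Under that assumption the same character yields the
same escaped path no matter which encoding it was stored in. The
function name export_name is hypothetical.

```python
from urllib.parse import quote

def export_name(raw_name: bytes, fs_encoding: str) -> str:
    """Map a filesystem name to its escaped URI path segment.

    Step 1: decode the stored bytes using the filesystem's own encoding.
    Step 2: re-encode the characters as UTF-8 and %XX-escape the bytes.
    """
    characters = raw_name.decode(fs_encoding)
    return quote(characters, safe="", encoding="utf-8")

if __name__ == "__main__":
    katakana_a = "\u30a2"
    # The same character stored under two different filesystem encodings.
    from_sjis = export_name(katakana_a.encode("shift_jis"), "shift_jis")
    from_ucs2 = export_name(katakana_a.encode("utf-16-be"), "utf-16-be")
    print(from_sjis, from_ucs2, from_sjis == from_ucs2)
```

This only works if the server actually knows the encoding of its
filenames, which is the part I suspect needs spelling out.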

The URL internationalization draft has been sitting around
for a long time; maybe it's time to move it forward now?

> Dan Connolly, W3C
> http://www.w3.org/People/Connolly/
> 

Received on Saturday, 29 May 1999 08:01:39 UTC