- From: Stefan Eissing <stefan.eissing@greenbytes.de>
- Date: Tue, 28 May 2002 10:52:41 +0200
- To: Aaron Swartz <me@aaronsw.com>
- Cc: Misha.Wolf@reuters.com, www-tag@w3.org
On Sunday, May 26, 2002, at 20:48, Aaron Swartz wrote:

> re: n-triples supporting character escapes. Yes, my point is that
> the software lags far behind the specs.
>
> On Saturday, May 25, 2002, at 05:51 AM, Misha.Wolf@reuters.com wrote:
>
>> RFC 2396, in specifying the use of %HH escaping, does not confine its
>> use to UTF-8. There are plenty of URIs out there which use %HH to
>> escape other character encodings. Once you have a %HH-escaped URI,
>> there is no way back, unless you know how it was created. If an RDF
>> database contains some %HH-escaped URIs, how can anyone know whether
>> they arrived %HH-escaped, or whether the %HH-escaping was applied just
>> before their insertion in the database?
>
> I've heard some rumblings about updating RFC 2396 to require UTF-8...
>
> But even so, why does it matter? The worst effect I can see is
> that some (broken) URIs are displayed a little funny. Is software
> going to be peeking into these URIs for some reason?

Think, for example, about a WebDAV file system. The fs driver needs to
convert back and forth between local filenames and server URIs. Or think
about an HTTP server sitting on a file system: Apache 1.3.x on a Windows
box will convert euro signs in filenames to %80, which is neither
ISO-8859-x nor UTF-8. For the WebDAV server and the fs driver to work
together, they have to agree on a charset for the URI encoding. Since
charset parameters in URIs are messy, UTF-8 seems the best choice.

//Stefan
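To make the ambiguity concrete, here is a minimal sketch in Python (not part of the original mail; the filename is a made-up example) showing how the same euro sign percent-encodes differently under Windows-1252 and UTF-8, and why the escaped form alone cannot be decoded without knowing which charset produced it:

```python
# A sketch of the charset problem: the euro sign percent-encodes
# differently per charset, and the escaped form does not record
# which charset was used.
from urllib.parse import quote, unquote

name = "caf\u00e9 \u20ac"  # hypothetical filename containing 'é' and '€'

# Windows-1252 maps '€' to the single byte 0x80 -- this is why Apache
# 1.3.x on Windows emits %80 for it.
print(quote(name.encode("cp1252")))  # caf%E9%20%80

# UTF-8 encodes '€' as three bytes instead.
print(quote(name.encode("utf-8")))   # caf%C3%A9%20%E2%82%AC

# Going back: %80 is not valid UTF-8, so a consumer assuming UTF-8
# cannot recover the name; it must know the original charset.
print(unquote("caf%E9%20%80"))                     # 'caf\ufffd \ufffd' (replacement chars)
print(unquote("caf%E9%20%80", encoding="cp1252"))  # 'café €' -- only works if the charset is known
```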
Received on Tuesday, 28 May 2002 04:53:17 UTC