Re: [charmodReview-17] replacing all URIs with IRIs

On Sunday, 26 May 2002, at 20:48, Aaron Swartz wrote:

> Re: N-Triples supporting character escapes. Yes, my point is that
> the software lags far behind the specs.
> On Saturday, May 25, 2002, at 05:51 AM, wrote:
>> RFC 2396, in specifying the use of %HH escaping, does not confine its
>> use to UTF-8.  There are plenty of URIs out there which use %HH to
>> escape other character encodings.  Once you have a %HH-escaped URI,
>> there is no way back, unless you know how it was created.  If an RDF
>> database contains some %HH-escaped URIs, how can anyone know whether
>> they arrived %HH-escaped, or whether the %HH-escaping was applied just
>> before their insertion in the database?
> I've heard some rumblings about updating RFC 2396 to require UTF-8...
> But even so, why does it matter? The worst effect I can see is
> that some (broken) URIs are displayed a little funny. Is software
> going to be peeking into these URIs for some reason?

Think, for example, about a WebDAV file system. The file system driver
needs to convert back and forth between local filenames and server URIs.

Think about an HTTP server sitting on a file system. Apache 1.3.x on
a Windows box will convert euro signs in filenames to %80, which
is neither ISO-8859-x nor UTF-8.
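
To make the ambiguity concrete, here is a small Python sketch (my own
illustration, not taken from Apache or any WebDAV implementation). It
escapes the euro sign under three charsets and then tries to read %80 back.
The name try_decode is hypothetical, and the use of cp1252 as the likely
origin of %80 is an assumption about the Windows code page involved.

```python
from urllib.parse import quote, unquote

EURO = "\u20ac"  # the euro sign

# The same character, %HH-escaped under three different charsets.
escaped_cp1252 = quote(EURO, encoding="cp1252")       # '%80'
escaped_utf8 = quote(EURO, encoding="utf-8")          # '%E2%82%AC'
escaped_latin9 = quote(EURO, encoding="iso-8859-15")  # '%A4'


def try_decode(escaped, charset):
    """Hypothetical helper: decode a %HH-escaped string under one charset.

    Returns the decoded text, or None if the bytes are invalid there.
    """
    try:
        return unquote(escaped, encoding=charset, errors="strict")
    except UnicodeDecodeError:
        return None


# Reading '%80' back depends entirely on the charset the reader guesses.
as_utf8 = try_decode("%80", "utf-8")         # None: 0x80 alone is invalid UTF-8
as_latin1 = try_decode("%80", "iso-8859-1")  # '\x80': a C1 control character
as_cp1252 = try_decode("%80", "cp1252")      # the euro sign
```

Nothing in the escaped string itself says which of these readings was
intended, which is the "no way back" problem quoted above.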

In order for the WebDAV server and the fs driver to work together, they
have to agree on a charset for the URI encoding. Since charset parameters
in URIs are messy, UTF-8 seems the best choice.
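
As a rough sketch of what such an agreement buys, assume both sides fix
UTF-8 for the escaping. The conversion between a local filename and a URI
path segment then becomes a lossless round trip. The function names below
are hypothetical and only illustrate the idea; a real driver would also
have to handle path separators and names the local file system cannot
represent.

```python
from urllib.parse import quote, unquote

URI_CHARSET = "utf-8"  # assumption: the charset both sides have agreed on


def filename_to_uri_segment(name):
    """Hypothetical: escape one local filename as a URI path segment."""
    # safe="" also escapes '/', so the name stays a single segment.
    return quote(name, safe="", encoding=URI_CHARSET)


def uri_segment_to_filename(segment):
    """Hypothetical: recover the filename, rejecting non-UTF-8 escapes."""
    return unquote(segment, encoding=URI_CHARSET, errors="strict")


name = "preis_\u20ac.txt"  # a filename containing the euro sign
segment = filename_to_uri_segment(name)   # 'preis_%E2%82%AC.txt'
restored = uri_segment_to_filename(segment)
```

With errors="strict", a segment such as preis_%80.txt raises
UnicodeDecodeError instead of being silently mapped to the wrong name.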


Received on Tuesday, 28 May 2002 04:53:17 UTC