Re: UTF-8 Encoding Question

At 3:11 PM -0500 3/1/01, John Stracke wrote:
>Dan Brotsky wrote:
>
>>  >   <D:href 
>>D:original-charset="utf-8">http://some.host/C%C3%A9sar.txt</D:href>
>>
>>  This certainly would be a helpful thing for servers to say to clients
>>  (although maybe in some other way), and I second your request to JimW
>>  about adding it as an issue.
>
>It might be better to wait and see whether IRIs take off
>(<draft-masinter-url-i18n-07.txt>; broadly, the approach is to 
>define that IRIs
>are like URLs, but in Unicode; to use an IRI in a context that 
>demands a URL, you
>encode it in UTF-8, then apply %-encoding as normal).  No sense creating two
>separate mechanisms to solve the same problem.

I'm a big fan of IRI's (as you can tell from my earlier emails) but I 
think the issue here is the one of what to do when clients submit 
URIs with other encodings that the server returns to them in a body. 
Even when servers go to IRI discipline for URLs they generate, they 
shouldn't necessarily break clients who expect to get back what they 
asked for :^).

Here's another way of phrasing this issue that makes it not be about encoding:

The DAV spec says that it is *resources* that have properties, not 
*urls*, and that many different urls can refer to the same resource. 
When a client requests info about a resource by using a particular 
URL, but the server sends back information about a resource named by 
another URL, what guarantees does the client have about the returned 
URL:

1. Is it guaranteed to refer, in the context of this response, to the 
resource that the client asked about in the request?
2. Is it guaranteed *always* to refer to that same resource?  (In 
some sense, is there the same info in this response that one could 
glean from a permanent redirect?)

In this formulation, note that differences in %-escaping conventions 
can lead to different URLs that encode the same octet-stream.  Are 
there special conventions that apply in this case?

     dan

Received on Thursday, 1 March 2001 17:09:55 UTC