- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sat, 04 May 2002 19:06:53 +0200
- To: "Roy T. Fielding" <fielding@apache.org>
- Cc: <LMM@acm.org>, <hardie@oakthorn.com>, <uri@w3.org>, "'Tim Berners-Lee'" <timbl@w3.org>
* Roy T. Fielding wrote:
>On Wednesday, May 1, 2002, at 01:27 PM, Larry Masinter wrote:
>
>> Trying to redefine "URI" as the "same" protocol element
>> leads to insanity, since there's no versioning.
>> The only way of cutting the knot (after several years of
>> discussion) was to be clear that an "IRI" was a different
>> protocol element as a "URI".
>
>I don't understand. The vast majority of stuff in IRI is simply how
>to display one. We don't need to include that. The only thing I want
>to include is the default: %xx means the character encoded as xx in
>UTF-8. That is already the default for MSIE and should be for other
>browsers as well, and will simplify the specification.

I disagree. While it is the default in MSIE for URIs the user enters
into the address bar, it is not the default for the vast majority of
%xx-encoded octets requested by MSIE. Those originate from HTML forms,
where MSIE uses the document's or the user-selected character encoding
scheme to generate the octets. Hence most %xx-encoded octets
representing non-ASCII characters are not part of valid UTF-8
sequences. There is no facility to declare any encoding other than
UTF-8, hence applications assuming UTF-8 encoding are bound to fail.
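
To illustrate the point, here is a minimal sketch in Python (not taken from any browser's implementation). It assumes a form on an ISO-8859-1 page submitting the value "Björn"; the sample value, the choice of ISO-8859-1, and the helper name is_valid_utf8 are assumptions made for the example only.

```python
from urllib.parse import quote, unquote_to_bytes


def is_valid_utf8(octets: bytes) -> bool:
    """Hypothetical helper: True if the octets form a valid UTF-8 sequence."""
    try:
        octets.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


value = "Björn"  # assumed sample form value

# Percent-encode the same text using two different character encodings.
from_latin1_form = quote(value, encoding="iso-8859-1")
from_utf8_form = quote(value, encoding="utf-8")

# A receiver that assumes UTF-8 decodes the raw octets behind each %xx escape.
latin1_octets = unquote_to_bytes(from_latin1_form)
utf8_octets = unquote_to_bytes(from_utf8_form)

print(from_latin1_form, is_valid_utf8(latin1_octets))
print(from_utf8_form, is_valid_utf8(utf8_octets))
```

The two escaped strings differ although they stand for the same text, and the one produced from the ISO-8859-1 form does not decode as UTF-8. Nothing in the escaped string itself says which encoding produced it.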
Received on Saturday, 4 May 2002 13:07:41 UTC