- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Mon, 14 Apr 1997 16:21:18 +0200 (MET DST)
- To: "Roy T. Fielding" <fielding@kiwi.ICS.UCI.EDU>
- Cc: uri@bunyip.com
On Fri, 11 Apr 1997, Roy T. Fielding wrote:

> >I reiterate that there is consensus on integrating text
> >for UTF-8 as the recommended character encoding into the
> >draft.
>
> That is a lie.

Thanks for being explicit. Whether we have rough consensus to include the text for UTF-8 or not may be an open question in the absence of a group chair, but it should be very clear that we definitely have no consensus to exclude UTF-8 from the draft, as has been claimed.

> >We have a proposed wording, two paragraphs which
> >I don't think I need to repeat.
>
> That is true. I wrote that wording because your prior wording was
> too confusing, not because I agreed with it.

Again, thanks for being clear. From your communication up to now, I had to assume that you wrote that wording because my pointing out that UTF-8 was only *recommended*, not required, had turned your disagreement into agreement. Immediately after I received that wording, I sent a mail to the list saying how much I liked your wording and that we should go with it. This was 5 weeks ago. You should have received two copies of that mail, and it should have been rather obvious that I was assuming you agreed with your own wording. This is the first time I have seen anything to the contrary.

> >I have only heard very
> >general arguments against this wording, arguments which
> >I have shown to be untrue or irrelevant.
>
> That is a lie. You have an opinion, Martin, and Larry has an opinion,
> and I have an opinion.

We all have our opinions and tastes. I respect your opinions, and I respect Larry's opinions. But there is a difference between an opinion and an argument. You hold the opinion that URLs shouldn't contain anything other than ASCII. As your argument, you gave typability. I hold a different opinion. I showed that with current technology, typability is no longer an argument (a missing local keyboard resource can be replaced by a Java applet), that many actual resource names (e.g.
numbers on car number plates) contain characters beyond ASCII anyway, so that restricting URLs to ASCII for typability doesn't help anybody, and that in the end requiring URLs to be ASCII-only is rather similar to asking the web to give up GIFs and other images just because there are some people who can't see.

> You did not show any of my arguments to be
> untrue or irrelevant,

See above. If you have any counterarguments to the above, I would be very interested to hear them.

> and the only thing you have demonstrated is that
> you think URLs should be treated as filenames.

Well, I disagree. I never demonstrated that I think URLs should be treated as filenames, because I couldn't possibly do so. I do NOT think URLs should be treated as filenames. It is true that I have in many cases used filenames as examples of URLs. In particular, I have spoken about filenames when you raised the concern that implementing the UTF-8 recommendation would not be easy to do on certain file systems.

> The only question that matters is whether or not the draft as it
> currently exists is a valid representation of what the existing
> practice is and what the vendor community agrees is needed in the
> future to support interoperability.

As long as we are at the IETF, what matters is the discussion here. If we decide that we have to restart at Draft Standard because otherwise some problems cannot be solved, then the criteria of course become different. As for the "vendor community", we have heard clearly positive voices on this list from people from Sun and from Alis for the UTF-8 proposal. I did not see any negative voices from vendors. And I know from many other vendors that they would be happy to know how to encode all kinds of characters into URLs and how to decode the characters from the URLs, and that they look forward to the UTF-8 proposal being accepted.
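
As a rough sketch of what "encoding characters into URLs and decoding them again" could look like under the UTF-8 recommendation: the function names and the sample name "Zürich" below are illustrative assumptions, not text from the draft. The sketch also shows why a receiver cannot recover the characters reliably when the origin used a different charset such as iso-8859-1.

```python
from urllib.parse import quote, unquote


def encode_segment(name: str) -> str:
    """Percent-encode one path segment, using UTF-8 for non-ASCII characters."""
    # safe="" means even "/" is escaped, since this is a single segment.
    return quote(name, safe="", encoding="utf-8")


def decode_segment(escaped: str) -> str:
    """Decode a percent-encoded segment, assuming the bytes are UTF-8."""
    # errors="strict" makes non-UTF-8 byte sequences fail loudly
    # instead of being silently replaced.
    return unquote(escaped, encoding="utf-8", errors="strict")


# Hypothetical resource name containing one non-ASCII character.
name = "Zürich"

utf8_form = encode_segment(name)
# The same name as a server using an iso-8859-1 filesystem would escape it.
latin1_form = quote(name, safe="", encoding="iso-8859-1")

print(utf8_form)    # Z%C3%BCrich
print(latin1_form)  # Z%FCrich
print(decode_segment(utf8_form))  # Zürich

try:
    decode_segment(latin1_form)
except UnicodeDecodeError:
    # The iso-8859-1 bytes are not valid UTF-8, so the characters
    # cannot be recovered without knowing the origin's charset.
    print("not valid UTF-8")
```

The two escaped forms differ for the same name, which is the interoperability gap a single recommended encoding is meant to close.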

> I have yet to hear *any* support
> for your additional requirements from the vendor community,

Francois Yergeau, from Alis (a browser vendor), has been very explicit about this.

> and I
> know for a fact that they do not correspond to any existing plans
> of the Apache Group.

The Apache Group is a group of volunteers doing very nice work. And you are a core member of this group, so you will know. Up to now, nobody in that group seems much concerned with internationalization work, although Dirk van Gulik has given an excellent presentation on content negotiation for document encoding ("Accept-Charset") and document language ("Accept-Language"). In the last few days, I have had a closer look at the current Apache sources. My discovery of the rewrite module and of the concept of sub-requests has made me more and more inclined to volunteer to write a module that can handle various configuration cases for UTF-8 (per-server and per-directory native resource names, various upgrading strategies,...). My main question at the moment is not whether this is technically feasible, but whether I will get some advice from experienced Apache people.

> Since it is my opinion that it is NEVER desirable
> to show a URL in the unencoded form given in Francois' examples,
> you cannot claim to hold anything even remotely like consensus.
> In fact, the "rough consensus" of the HTTP development
> community is that the URL namespace belongs to the origin of the name,
> and no client has the right or need to reinterpret that name for
> the purpose of display. That is what the current draft says,

URLs don't belong to the HTTP development community. As for "right", it more or less boils down to the question of what you call a URL and what you call a presentation of a URL. As for "need", it's not the technical community that decides this.

> and it
> does so in a way that DOES NOT PREVENT any future use of URLs to be
> of a single character set encoding.

We are not discussing "NOT PREVENT"ing here. We are discussing ENABLING.

> IF you can persuade the creators of URLs to always use UTF-8,
                                              ^^^^^^

<OBNOXIOUS>
Do you forget so quickly that it is only a *recommendation*?
</OBNOXIOUS>

> which
> is definitely not the case today (Apache, NCSA, and CERN servers all
> use whatever charset is used by the underlying filesystem, which on
> most Unix-based systems is iso-8859-1 or iso-2022-*),

<DETAIL>
The overwhelming majority of filesystems in places that could use iso-2022-* (Japan,...) don't use that; they use EUC. Encodings such as iso-2022-jp are used only in email.
</DETAIL>

> then you can
> make claims of consensus. Until then, your opinions have been answered
> to the best extent possible by the editors, and with far more
> civility than in your responses.

Please answer my arguments. My opinions are irrelevant. Civility is a (secondary) issue. You seem to consider shouting "That is a lie." when you should know better to be civil. I tend to be less direct, which is considered more civil in many societies. This is internationalization at work :-).

Regards, Martin.
Received on Monday, 14 April 1997 10:23:35 UTC