Date: Mon, 12 May 1997 12:20:19 +0200 (MET DST)
From: "Martin J. Duerst" <email@example.com>
To: Larry Masinter <firstname.lastname@example.org>
cc: "Alain LaBont/e'/" <email@example.com>, URI mailing list <firstname.lastname@example.org>
Subject: Re: "Difficult Characters" draft (in URLs)
In-Reply-To: <337615DC.7C2F@parc.xerox.com>
Message-ID: <Pine.SUN.3.96.970512120323.245P-100000@enoshima>

On Sun, 11 May 1997, Larry Masinter wrote:

> Martin,
>
> > "Keyboards exist" is not very helpful. If the market penetration of such
> > keyboards is 10%, we better leave out UCAL; if it is 95%, we don't
> > have to worry much.
>
> It surprises me to see you fall into the same kind of position
> that -- in the larger scale -- was the argument for keeping
> URLs to "ASCII-only".

It shouldn't surprise you. URL transcribability is definitely an issue.
The main point is to realize that when evaluating transcribability, it
should be weighted by the number of potential users. So saying that
Chinese users will have difficulty entering uppercase accented letters
(UCAL) is irrelevant to whether they should appear in URLs intended for
a French audience. Discussing whether and to what degree French users
will be able to input such letters, on the other hand, is very relevant.

> What is the scope of "the market"? If "the market" is "Alain",
> then the market penetration is 100%. If "the market" is "all
> keyboards on the planet" then, of course, the "market
> penetration" of keyboards that can type anything other
> than simple ASCII is still quite small.

The "market" may be very different for different URLs. But there will
probably be a large number of URLs mainly addressing people in France
(and other French-speaking areas), mainly because the corresponding
resources are written in French. Except for those codepoint sequences
normalized away by the algorithms described in the draft, it is
ultimately the responsibility of the URL creator to care about his/her
market.
The idea of the draft is to point out areas where, for various reasons,
there may be problems.

> We are really talking about "character entry method"
> rather than "keyboard" since, as has been pointed out,
> with the "right software" it's possible to enter almost
> any kind of character from almost any kind of terminal;

True. If that's pen-based input or whatever, never mind. But there is a
big difference in entry speed between the various methods, keyboard
entry in many cases being the fastest. If it were the case that a French
user on average would take five minutes or more to enter a UCAL (the
information from Alain indicates that the average is much lower), it
would definitely be better to warn against such letters in French URLs.
However, if it would take a Japanese user an average of five minutes to
enter such a letter, that wouldn't bother us much (except for the
unnecessary recommendation to not include UCAL in Japanese URLs).

> and "market penetration" might want to be clarified
> as to whether you're interested in the percentage of
> "things that are being sold in the marketplace now"
> or "existing, installed, usable", or at least some
> forecast of the latter.

It's definitely "existing, installed, usable" that counts. For URLs that
you expect to stay around longer, you can also take future development
into account. And for the draft, we of course should take future
development into account, because the draft should be reasonably valid
for a certain time.

Regards, Martin.