- From: Keith Moore <moore@cs.utk.edu>
- Date: Fri, 11 Aug 1995 16:08:56 -0400
- To: Martin J Duerst <mduerst@ifi.unizh.ch>
- Cc: sollins@lcs.mit.edu (Karen R. Sollins), moore@cs.utk.edu, FisherM@is3.indy.tce.com, uri@bunyip.com
> My impression is that some of the members of this group still have
> that discussion too deep in their bones so that they are unable to
> recognize that the way the implementation and the mapping
> of specific schemes was done (allowing full English text) has
> greatly jeopardized their original intentions.

No, that's quite easy to recognize. But the current use of URLs
wasn't specifically designed to be English-centric; it's a direct
consequence of the implementations of FTP, Gopher, HTTP, and the
other protocols on which URLs were based.

> URLs have a user-friendly character set; the problem is only that
> this user-friendliness is limited to English-speaking people.
> People use this facility to encode as much semantics as possible.

Anybody, regardless of language, is going to name their files using
their own language. The problem isn't that people use filenames to
encode semantic information; the problem is that these filenames get
exported to the rest of the world (and that this favors some people
more than others).

I'm not going to stop using English words in my filenames. But I am
currently building tools to do publishing, cataloging, location,
replication, etc., that don't use the original filename in the
published URL. One of the reasons for building such tools is to fix
one of the causes of the "stale URL" problem -- the use of filenames
as external document identifiers is a big part of what causes URL
lookup failure. If we want to solve this problem (and I think we do),
then we're going to stop using filenames anyway.

But rather than moving away from the filenames that cause us these
problems, you're trying to figure out how to add more baggage so we
can keep using them. Not only does this not solve the "stale URL"
problem, it drastically increases the probability of transcription
errors. I have a hard time seeing this as a step in the right
direction.

(Actually, I'm afraid that Karen is right -- we may well have to punt
transcribability in the long run.)

> For those who were not really aware
> of the issues of extended character sets for multilingual purposes,
> it was fully user-friendly from the beginning.

Some of us who oppose user-friendly URLs DO understand the issues,
because we've seriously looked at ways to solve this problem, and at
the potential disaster that poor solutions might cause. That's why we
oppose them, or are at the least very skeptical.

> If you had stayed with that, okay (with the footnote that in Hebrew and
> Arabic, consonants only are written in general text :-). As I have not
> taken part in the discussion, I can only guess, but my guess is that
> most of the people at that time indeed felt that this would be too
> clumsy, that they wouldn't like to transcribe their usual file names
> into something such as "l4c5r7g7mtn8thd". And now they are arguing
> against trying to address the same bad feeling and disliking that
> they cleverly managed out of the way for themselves, but that
> the greater part of the world is still faced with.

Look, this isn't a cultural or language bias issue. It's just another
Internet scaling issue. The more different things you try to hook
together, the more interoperability problems you have to solve. We do
need to solve this problem, but the trick is to do so in a way that
doesn't make the overall situation worse.

> It may be unfair to many of you on this group to make such direct
> accusations, but for me there is too big a conflict between the
> official
>     "we agreed that we were not going for user friendly names"
> and the actual, implicit:
>     Let's care for us; we don't give a damn about the rest of
>     the world.
> that people dealing with multilingual matters find in the present
> URL scheme.

That accusation could go in either direction. As in: "we don't give a
damn about how well the web works in general, so long as it lets us
use our filenames".

It's probably true that we each try to solve the problems that hinder
us most. Using ASCII filenames as document identifiers doesn't cause
me as many (immediate) problems as it does for you, so you have a
greater interest in fixing them (soon) than I do. I want to fix them
too. But maybe because the filename problem affects you more than it
does me, I'm more aware of other problems that also need fixing.

I still think we'd be better off if we tried to find a single solution
to both of the problems with URLs based on filenames, rather than
choosing a solution for your favorite (and worthwhile) problem that
makes the overall situation worse.

Keith

Received on Friday, 11 August 1995 16:10:11 UTC