Re: "Difficult Characters" draft

Leslie Daigle (leslie@Bunyip.Com)
Thu, 8 May 1997 11:06:34 -0400 (EDT)

Date: Thu, 8 May 1997 11:06:34 -0400 (EDT)
From: Leslie Daigle <leslie@Bunyip.Com>
To: Keld Jørn Simonsen <>
cc: "Alain LaBonté" <>,
Subject: Re: "Difficult Characters" draft

On Wed, 7 May 1997, Keld Jørn Simonsen wrote:
> > At 11:23 97-05-07 +0200, Martin J. Duerst wrote:
> > >	"Copy it exactly, with case and everything."
> > >is much more user friendly, because it is the only one that
> > >works consistently.
> > 
> > I can agree with that. I think everybody can agree with that.
> I also agree; exact match should work in all cases.

Yes -- the point I was trying to make earlier was that exact match is
about the _only_ thing that can be _mandated_ -- because there are no
rules for (potential) equivalence of characters that are consistent
globally (across countries that share a language, or across languages).

Quite apart from the issue of how individual languages shape expectations
of equivalences between letters (you will find words starting with "W"
under the letter "V" in some Swedish dictionaries), there are matching
conventions that have grown up around specific letters to _accommodate_
the various realities that have been faced in transcribing words.  For
instance, failing to find something under "ström", a Swedish searcher
might go on to search for "strom", or even "stroem" (not because those
spellings are right, but because they are common transcriptions).
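
As a rough illustration of that kind of application-level convention, here
is a minimal Python sketch.  The function names and the one-entry
transcription table are assumptions made up for this example; they are not
taken from any spec, and a real service would pick its own fallbacks.

```python
import unicodedata

# Hypothetical transcription table, for illustration only.
# A real service would choose its own entries per language.
TRANSCRIPTIONS = {"ö": "oe"}


def strip_marks(word):
    """Decompose the word and drop combining marks (ö becomes o)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def transcribe(word):
    """Replace letters using the transcription table (ö becomes oe)."""
    composed = unicodedata.normalize("NFC", word)
    return "".join(TRANSCRIPTIONS.get(c, c) for c in composed)


def search_keys(word):
    """Return the exact form first, then fallback spellings, no duplicates."""
    keys = []
    for candidate in (word, strip_marks(word), transcribe(word)):
        if candidate not in keys:
            keys.append(candidate)
    return keys


print(search_keys("ström"))
```

The exact spelling always comes first; the other two are only things a
search service might try next, which is exactly why they live in the
service and not in any definition of equivalence.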

These things belong to the realm of applications and services -- NOT to
equivalence in URLs.

A URL ending in "ström"

is not the same as one ending in "strom" or "stroem".

If the URL spec says that these are not equivalent URLs, then it is perfectly
valid to have them refer to three different resources.  It might be "good
practice" to suggest people do otherwise, but there are so many such
possibilities that it is well outside the range of what should be considered
equivalence rules for URLs.
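
To make the exact-match rule concrete, here is a small Python sketch.  The
host name, the resource labels, and the function names are placeholders
invented for this example, and it compares the raw characters rather than
any escaped form.

```python
# Hypothetical URLs; "example.org" is a placeholder host.
BASE = "http://example.org/"

# Three distinct strings, so three distinct keys and three resources.
resources = {
    BASE + "ström": "resource A",
    BASE + "strom": "resource B",
    BASE + "stroem": "resource C",
}


def same_url(a, b):
    """Exact match: equal only if identical character for character."""
    return a == b


def resolve(url):
    """Look up by exact match only: no case folding, no fallbacks."""
    return resources.get(url)


print(same_url(BASE + "ström", BASE + "strom"))
print(resolve(BASE + "stroem"))
print(resolve(BASE + "Strom"))
```

Under this rule nothing stops a server from publishing all three names, and
a name that differs only in case simply fails to resolve.  Any friendlier
behaviour would have to be layered on top by the service.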



  "_Be_                                           Leslie Daigle
             where  you                           
                          _are_."                 Bunyip Information Systems
                                                  (514) 875-8611
                      -- ThinkingCat