
Re: Translation Characters




>Well, it's happening to us again.  Our users are having trouble talking to
>WAIS servers through an HTTP gateway because the translation routines in
>libWWW seem to overtranslate punctuation characters.  For instance, I noticed
>that the library often translates a colon into %3A.  This causes problems
>when an HREF points to a URL through another machine, such as in an HTTP -> WAIS
>passthru, e.g.
>
>http://foo.com/foo1.com:8000/...

The translation is correct... The problem lies in the proxy translator. Strictly 
speaking it should perform all the match and map operations on the translated 
URI. It probably doesn't because that part of the library is very old indeed.
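To make that concrete, here is a rough sketch of matching on the translated form. It is not the libWWW code; the rule table, the gateway name and the function names are all made up for illustration, and it assumes the rules are simple path prefixes. Both the rule and the request are brought to the same escaped form before they are compared, so it no longer matters whether the client sent ":" or "%3A".

```python
from urllib.parse import quote, unquote

# Hypothetical rule table: (path prefix, gateway name). Not from libWWW.
RULES = [("/foo1.com:8000/", "wais-gateway")]

def canonical(path):
    # Undo any existing escapes, then re-escape with one fixed safe set,
    # so "/a:b" and "/a%3Ab" end up as the same string.
    return quote(unquote(path), safe="/")

def map_request(path, rules):
    # Compare rule and request in the same translated (escaped) form.
    target = canonical(path)
    for prefix, gateway in rules:
        if target.startswith(canonical(prefix)):
            return gateway
    return None

print(map_request("/foo1.com%3A8000/db", RULES))
print(map_request("/foo1.com:8000/db", RULES))
```

Both calls should pick the same gateway. One caveat with this shortcut: decoding and re-encoding cannot tell an escaped slash (%2F) from a literal one, so a real translator would need to treat that case separately.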


>I've read the http specifications; while they describe how to translate 
>characters, there appears to be no description of which characters are to be
>translated.  One would hope that the various servers would know what to do
>with the translated characters, but that is not always the case.  I've
>modified the library before so that our CERN proxy would no longer translate
>dollar signs.  Can anyone give me a definitive list as to what should be
>translated and what shouldn't?

You should have modified the VMS server to accept translated characters instead.
$ really should have been included in the acceptable URI characters. Not only is
it a valid VMS filename character, but together with _ and the alphanumerics it
makes 64 characters - enough to do base64 encoding. UNIX scarcely makes a good
benchmark in this case since nearly every character (anything but NUL and /)
is allowed in filenames.
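As an illustration of the 64-character point, the sketch below uses Python's standard base64 module with $ and _ standing in for the usual + and /. The function names are invented for the example, and dropping the "=" padding is my own assumption, made so that the output stays within letters, digits, $ and _.

```python
import base64

# $ and _ replace the standard "+" and "/" as characters 62 and 63.
ALT = b"$_"

def encode_name(data: bytes) -> str:
    # Encode, then drop "=" padding so only the 64 chosen characters remain.
    return base64.b64encode(data, altchars=ALT).decode("ascii").rstrip("=")

def decode_name(text: str) -> bytes:
    # Restore the padding that encode_name removed before decoding.
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded, altchars=ALT)

print(encode_name(b"\xfb\xff\xbe"))
print(decode_name(encode_name(b"hello")))
```

The first call is chosen so that every output character is one that plain base64 would have written as + or /.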

I think the URI spec has the details on what is permitted in URIs. As I remember,
though, it was alphanumerics, period, and that was about it.
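A small sketch of what such a conservative rule does, assuming the safe set really is just alphanumerics and period (that is my recollection above, not a list checked against the spec). The escape function here is hypothetical and is not the libWWW routine.

```python
import string

# Assumed safe set: ASCII letters, digits and period only.
SAFE = set(string.ascii_letters + string.digits + ".")

def escape(text):
    # Leave safe characters alone; write every other character as
    # %XX for each byte of its UTF-8 encoding.
    out = []
    for ch in text:
        if ch in SAFE:
            out.append(ch)
        else:
            out.extend("%{:02X}".format(b) for b in ch.encode("utf-8"))
    return "".join(out)

print(escape("foo1.com:8000"))
print(escape("a$b"))
```

With a set this narrow, the colon comes out as %3A and the dollar sign as %24, which is the behaviour being complained about.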


		Phill Hallam-Baker
