Re: revised "generic syntax" internet draft
Jon Knight (jon@net.lut.ac.uk)
Wed, 16 Apr 1997 11:04:23 +0100 (BST)
Date: Wed, 16 Apr 1997 11:04:23 +0100 (BST)
From: Jon Knight <jon@net.lut.ac.uk>
To: Gary Adams - Sun Microsystems Labs BOS <Gary.Adams@east.sun.com>
Cc: uri@bunyip.com, fielding@kiwi.ICS.UCI.EDU, Harald.T.Alvestrand@uninett.no
Subject: Re: revised "generic syntax" internet draft
In-Reply-To: <libSDtMail.9704151153.13046.gra@zeppo>
Message-Id: <Pine.SUN.3.95.970416104120.6402N-100000@weeble.lut.ac.uk>
On Tue, 15 Apr 1997, Gary Adams - Sun Microsystems Labs BOS wrote:
> Using the HotJava browser yesterday to view
>
> http://www.alis.com:8085/~yergeau/url_utf8.htm
>
> I was able to manually select the "View"->"Character Set" -> "Other" -> UTF8
> and see the accented characters in the document text as well as in the
> presentation of the URL. This worked for the 8bit UTF8 bytes, but was
> not implemented for the %HH escaped characters. This would be a very
> useful feature to support in an I18N browser.
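For anyone wanting to picture what displaying the %HH form would involve, here is a minimal sketch in Python. It assumes the escaped bytes really are UTF-8, which nothing in the URL itself guarantees. The function name and the sample path are made up for illustration and are not taken from HotJava or any other browser.

```python
from urllib.parse import unquote_to_bytes


def display_form(escaped_path: str) -> str:
    """Turn %HH escapes back into characters, assuming they encode UTF-8."""
    raw = unquote_to_bytes(escaped_path)  # "%C3%A9" becomes the bytes C3 A9
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: show the escaped form unchanged rather than guess.
        return escaped_path


# Hypothetical path, for illustration only.
print(display_form("/caf%C3%A9"))  # prints "/café"
```

Falling back to the escaped form when the bytes do not decode keeps the display honest for URLs that were escaped from some other character set.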
A few more datapoints on the above URL:
* Netscape Navigator 3.01 for X11 running under SunOS 4.1.4 (as
are all the tests below) displays both that page and the two pages
linked to from it (or is it one page with two different URLs? Whatever
- they both get displayed). One of the URLs contains lots of accented
characters, which are displayed in the URL window, in the above
document's text and in the bottom left hand corner when the cursor is
over the appropriate link in the above document (Netscape is set to have
a document encoding of "Japanese (auto-detect)" by the way).
* X Mosaic 2.7b5 doesn't work with the above page or the pages linked to
from it. As far as I can tell, this is because the "text/html" on the
Content-Type header is followed by a charset parameter, which seems to
confuse it.
* Telnet (yes, I use telnet to get HTML pages once in a while) can
retrieve the page linked to above. However, cut'n'paste under X11R6
doesn't copy the non-ASCII characters for me, so the I18N'ed URL
can't be cut'n'pasted (either from Netscape's URL window or from the
document that telnet returned in an xterm). I notice that the web
server is returning the charset parameter even though I'm making an
HTTP/1.0 request. Is that right? I thought things like charset were an
HTTP/1.1 thing?
* Lynx version 2.7.1 blows up spectacularly on the above URL, most likely
because of the charset parameter on the Content-Type header again (it
complains that "Start file could not be found or is not text/html or
text/plain" after dumping the raw HTML out to the screen). The
document with the %-escaped URL suffers the same fate, but the I18N
version can't even be cut'n'pasted, and I've no idea how to generate all
the accented characters on my keyboard.
* The CERN line mode browser v3.0 blew up on the above URL with a failed
system call after complaining that it couldn't display it.
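Since several of the failures above look like clients comparing the whole Content-Type value against "text/html", here is a minimal sketch in Python of splitting the media type from its parameters before testing it. This is a guess at the failure mode, not a description of what X Mosaic or Lynx actually do. The function names and the sample header value are made up for illustration, and the naive split on ";" ignores quoted parameter values that themselves contain a semicolon.

```python
def split_content_type(value: str):
    """Split a Content-Type value into (media_type, params)."""
    media_type, *rest = value.split(";")
    params = {}
    for item in rest:
        name, sep, val = item.partition("=")
        if sep:  # skip fragments that have no "name=value" shape
            params[name.strip().lower()] = val.strip().strip('"')
    return media_type.strip().lower(), params


def is_html(value: str) -> bool:
    """Test only the media type, ignoring charset and other parameters."""
    media_type, _ = split_content_type(value)
    return media_type == "text/html"


# Assumed header value; the exact one the server sent is not recorded here.
print(split_content_type("text/html; charset=UTF-8"))
print(is_html("text/html; charset=UTF-8"))  # prints True
```

A client that tests only the first element would carry on treating the page as HTML, and could then decide separately whether it understands the charset it was given.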
As I say, folks, just some more datapoints; interpret as you will.
Tatty bye,
Jim'll
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Jon "Jim'll" Knight, Researcher, Sysop and General Dogsbody, Dept. Computer
Studies, Loughborough University of Technology, Leics., ENGLAND. LE11 3TU.
* I've found I now dream in Perl. More worryingly, I enjoy those dreams. *