Re: Using unicode or MBCS characters in forms

>> Anyway, my statement, I believe, is correct.
>I believe that the intent and consensus of the HTTP working group is
>that message bodies are disallowed in GET and HEAD methods, and that
>if the HTTP/1.1 spec doesn't spell this out carefully enough, it's an
>editorial issue which should be corrected. 

Agreed. Anyway, the intent and consensus are not clearly spelled out in
the *standard*, which was the fundamental aspect of my statement,
which I stand by. In fact, I can accept the lack of bodies for GET
(even though I'd like to have them), so perhaps the editors of the HTTP spec
should just change the BNF slightly, and alter the prose for GET and
HEAD.

In that case I wouldn't have you wrongly accusing me of encouraging
implementation of features that are not compliant with WWW standards. 

>Gavin, there *IS* no layer. That's the problem. There's no protocol
>layer or implementation layer between "what's printed in the
>newspaper" and "what the user types into the keyboard" except
>wetware. "See URL", "Type URL". If the URL I see contains "Franc,ois"
>and the keyboard I'm sitting at doesn't have a way of entering a
>"c-cedilla", then I can't type in the URL. Period. I don't know how
>you can "deal with this" or "make it transparent".

Don't type in the c-cedilla, type in an encoded representation of it
(which is what would be sent down the wire). Now if you happen to have
a keyboard that allows you to type in something like this, then I
cannot see why the UA might not, as a convenience, allow you to type
it in with the c-cedilla. It would then be the UA's responsibility to
ensure that the "on-the-wire" format is the same. Also, the
wetware-to-typing transition does involve a layer, or more than one.
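
To make this concrete, here is a rough Python sketch of a UA doing that
conversion. Note the assumptions: UTF-8 as the agreed encoding and the
helper name to_wire are mine, for illustration only; nothing has settled
on an encoding yet.

```python
from urllib.parse import quote, unquote

def to_wire(segment: str, encoding: str = "utf-8") -> str:
    """Percent-encode every octet of a URL segment that is not plain ASCII.

    The default encoding (UTF-8) is an assumption made for this sketch.
    """
    return quote(segment, safe="", encoding=encoding)

# A user whose keyboard has no c-cedilla types the encoded form directly.
typed_by_hand = "Fran%C3%A7ois"

# A user who can type the c-cedilla lets the UA do the conversion.
typed_with_cedilla = to_wire("François")

# Both routes put the same characters on the wire.
print(typed_by_hand == typed_with_cedilla)

# The UA can still show the readable form to the user.
print(unquote(typed_by_hand, encoding="utf-8"))
```

Either way the server sees one and the same URL; the c-cedilla is only a
convenience at the user's end.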

>So this isn't a problem "for later", unless you mean "when the world's
>keyboards are upgraded". Now, that itself might happen for Franc,ois
>sometime in the next five years, but not for Akio-san.

I don't believe this is correct. For one thing, this isn't a keyboard
problem (many Japanese users use 101 keyboards, and I myself often
do). It is more a matter of deciding upon a single syntax for
declaring the encoding, or deciding upon a single encoding to be
used. Admittedly, the URLs might be ugly, but that's it.

Also, my reference to transparency is that at some point in the
future, users should never have to type in a URL.

>Now, again, you might say that some people could type in one way and
>other people could type in another way, and that's fine, but now the
>URLs are no longer UNIFORM: some people see one URL and other people
>see another URL. That's OK, too, but you must be explicit about that,
>that you're defining Non-Uniform Resource Locators.

As I said, if one decides upon a single encoding standard, or a single
syntax for defining the encoding used, then you do have a UNIFORM
resource locator, because there will be no differences, even at the
octet level. Non-Uniform (to use your words) resource locators could
be just a layer on top of this.
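
A small Python sketch of why the agreement matters. The name and the
three encodings below are only illustrative choices of mine, and
wire_forms is a made-up helper, not anything from a spec.

```python
from urllib.parse import quote, unquote

def wire_forms(text, encodings):
    """Map each encoding name to the percent-encoded form of text under it."""
    return {enc: quote(text, safe="", encoding=enc) for enc in encodings}

name = "秋夫"  # illustrative Japanese name, not taken from the thread

# Without agreement, each UA uses its local encoding and emits a
# different URL for the same visible name.
mixed = wire_forms(name, ["shift_jis", "euc_jp", "utf-8"])
for enc, form in mixed.items():
    print(f"{enc:10} {form}")
print(len(set(mixed.values())), "distinct wire forms")

# The receiver recovers the name only if it knows which encoding was used.
sjis_form = mixed["shift_jis"]
print(unquote(sjis_form, encoding="shift_jis") == name)
print(unquote(sjis_form, encoding="utf-8") == name)
```

Fix the encoding, or carry a declaration of it in the URL, and every UA
produces and interprets the same octets.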
