Re: Using unicode or MBCS characters in forms



Gavin,

I didn't mean to impugn your character, and I apologize if my note
sounded that way. However, RFC 1945 (HTTP/1.0, Informational) says:

>   An entity body is included with a request message only when the
>   request method calls for one. 

and there is no specific "calling for" an entity with a GET method.

In the HTTP/1.1 specification (draft-ietf-http-v11-spec-05) we
attempted to be more specific about this, and in section 4.3 it says:

> The presence of a message-body in a request is signaled by the inclusion
> of a Content-Length or Transfer-Encoding header field in the request's
> message-headers. A message-body MAY be included in a request only when
> the request method (section 5.1.1) allows an entity-body.

Now, within the current draft, it does not say that GET allows an
entity-body, so I might be able to escape your indictment ("Unless you
can point to a specific section that specifically disallows this")
since this section 4.3 sentence specifically disallows an entity-body
without such explicit permission.
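
As a rough illustration of how the section 4.3 wording reads (a sketch
only, not text from the draft): the function names below are
hypothetical, and the set of methods that "allow" an entity-body is an
assumption reflecting the editorial consensus, since the draft itself
does not enumerate them.

```python
# Assumption: the draft does not list these explicitly; POST and PUT
# reflect the editorial consensus described in this note.
METHODS_ALLOWING_BODY = {"POST", "PUT"}


def signals_body(headers):
    """Return True if the request headers signal a message-body.

    Per the quoted section 4.3 text, a body is signaled by the presence
    of a Content-Length or Transfer-Encoding header field.
    """
    names = {name.lower() for name in headers}
    return "content-length" in names or "transfer-encoding" in names


def request_is_consistent(method, headers):
    """Return False only when a body is signaled for a method that
    (under the assumption above) does not allow one."""
    if not signals_body(headers):
        return True  # no body signaled, so nothing to disallow
    return method.upper() in METHODS_ALLOWING_BODY
```

Under this reading, a GET carrying a Content-Length header is rejected
while a POST carrying one is accepted.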

However, there is not an explicit acknowledgement that POST and PUT
allow entity bodies either. I can remember clearly the editorial
discussion where we (at least in the editorial committee) were clear
about the consensus that GET should not allow entity bodies and PUT
and POST should, but I agree that the HTTP/1.1 specification should be
clearer about this.

You might also note that HTTP/1.1 specifically allows POST results to
be cached when the origin server indicates their cacheability.

As for the issue of "International URLs", you say:

> Myself and others *have* made proposals. I also find quite distasteful
> your use of "alternate universe". I believe that concerted effort can
> also solve this problem, but so far, no effort has resulted in
> consensus. 

I know there have been 'proposals', but the contradiction is that
there seems to be no way to have a designator that is both UNIFORM
(everyone who might use it can type it and write it in the same way)
and INTERNATIONAL (those who wish to use East Asian, Arabic, or even
just western European accented characters can use those characters in
a designator). UNIFORM basically implies "least common denominator":
almost all keyboards of the world have a way to enter a limited
repertoire of roman characters and a few punctuation marks, but the
least common denominator does not go beyond that.

An "alternate universe" would be one in which the capability of typing
URL strings into systems had been globally enhanced to the point that
the currently impoverished repertoire of URL strings could be
expanded.

An alternative is to remove the requirement of 'UNIFORM' and have some
other kind of localized resource locators which might have alternate
forms; e.g., Japanese URLs might be available by entering either the
romanization of the name of the resource or the Kanji version.
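
To make the "alternate forms" idea concrete, here is a minimal sketch
of such a lookup. Everything in it is hypothetical: the registry, the
example address, and the function names are illustrations only, and no
existing URL or URN mechanism works this way.

```python
# Hypothetical registry: each resource lists every designator form
# (Kanji, romanized, ...) under which it may be entered.
ALTERNATE_FORMS = {
    "http://www.example.jp/tokyo/": ["東京", "tokyo"],
}


def build_index(registry):
    """Map every alternate form to the resource it designates."""
    index = {}
    for resource, forms in registry.items():
        for form in forms:
            # casefold() so romanized forms match regardless of case
            index[form.casefold()] = resource
    return index


def resolve(designator, index):
    """Return the resource for a designator, or None if unknown."""
    return index.get(designator.casefold())
```

With this table, entering either "東京" or "Tokyo" resolves to the same
resource, which is exactly the property a single UNIFORM designator
cannot offer.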

I think you might be able to form some consensus to deploy such an
identification system around the Internet, although it would not fit
within the current mechanisms of URLs. It's unfortunate that the
groups currently working on URNs -- a likely point of attaching such
internationalization efforts -- have not, to my knowledge, considered
the internationalization consequences of their work.

Larry


