- From: Gavin Nicol <gtn@ebt.com>
- Date: Fri, 21 Jun 1996 02:15:46 GMT
- To: erik@netscape.com
- Cc: JMHX.DSKPO33C@dskbgw1.itg.ti.com, www-international@w3.org
>We do send Accept-Language, if the user sets it.

I use 2.01 under Unix. How do I set it?

>Assuming "POSTed data" refers to forms, I haven't seen very many forms
>asking for an ENCTYPE of "multipart/form-data". The most common enctype
>appears to be "application/x-www-form-urlencoded", the default. I
>suppose we could add "; charset=xxx" to the content-type header, though
>things appear to work as they are (i.e. the client sends the results
>back in the same encoding as the original form).

It would be *very* useful to have the client add a charset parameter
(see below).

>Can all servers deal with an appended charset parameter? We would
>need to investigate this before adding charset. Discussing these
>things on this mailing list is all very well, but actually making
>changes to our software requires a lot of care.

This "bugward compatibility" is one of the primary reasons things
haven't changed. This and the "well, it seems to work now" attitude.
You could at least make it an option.

>> As it is now, any data received must be sniffed to figure out what it
>> is. Not very useful on a site that could get queries in both EUC-KR
>> and EUC-JP... even shift-jis and EUC can be mistaken.
>
>I suppose it would be nice if people could submit EUC-KR data even if
>the original form is in EUC-JP or ISO-8859-1. Currently, people seem to
>get by with results sent back in the same encoding as the form itself.
>Haven't heard too many complaints about this.

What happens if you have, on a single site, many different forms in
many different encodings? What happens if the forms are dynamically
generated, so that you do not know a priori what the encoding of the
form is/was? Then you have to rely on data sniffing, in which case it
is not easy to distinguish EUC-KR and EUC-JP. Data sniffing would also
be simplified if a single encoding were chosen for each language.

The current situation can be made to work if you assume a single
language and a single primary encoding. It fails when you try to
create truly multilingual sites.

>> It is more than a year and a half since I pointed this out.
>
>Right, so I guess this is not really a much demanded feature.

I don't think so. The I18N discussions are quite old now, and are
probably the one area where software vendors are almost uniformly
poor. Until recently, the number of web sites in Japan was small, and
now it is exploding. The problems we have now will be magnified many
times over.

>Under Options -> Document Encoding, you will find a list of charsets.
>The actual spelling of the various charset names can be found in
>
> ftp://ftp.isi.edu/in-notes/iana/assignments/character-sets

So do you recognise EUC-JP (which I believe is not in the IANA list,
except as Extended_UNIX_Code_Fixed_Width_for_Japanese)? Also, even
when I send the parameter, why does the document info form tell me
the encoding is iso-2022-jp?
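
To make the sniffing problem above concrete, here is a minimal sketch in Python. It is only an illustration, not anything a browser or server is known to ship: the helper names `charset_of` and `candidate_encodings` and the sample bytes are assumptions made up for this example. It shows that the same submitted bytes can be valid in both EUC-JP and EUC-KR, and that a charset parameter on the Content-Type header, if the client sent one, would remove the guesswork.

```python
from email.message import Message


def charset_of(content_type, default=None):
    """Return the charset parameter of a Content-Type value, or `default`.

    Hypothetical helper: uses the standard library's header parser.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def candidate_encodings(data, encodings=("euc_jp", "euc_kr")):
    """Return every listed encoding under which `data` decodes cleanly."""
    valid = []
    for enc in encodings:
        try:
            data.decode(enc)
        except UnicodeDecodeError:
            continue
        valid.append(enc)
    return valid


# Sample form data: a Japanese word as an EUC-JP client would send it.
body = "日本語".encode("euc_jp")

# Without a label, both encodings accept the bytes, so sniffing by
# validity alone cannot decide between them.
ambiguous = candidate_encodings(body)

# With a label, the server simply decodes what the client declared.
header = "application/x-www-form-urlencoded; charset=EUC-JP"
declared = charset_of(header)
text = body.decode(declared) if declared else None
```

In this sketch `ambiguous` contains both candidates while `text` recovers the original string, which is the case for having the client send the parameter, at least as an option.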
Received on Thursday, 20 June 1996 22:17:43 UTC