- From: George Spafford <george_spafford@lionbridge.com>
- Date: Thu, 16 Sep 1999 08:11:38 -0400
- To: <ftang@netscape.com>
- Cc: www-international@w3.org
Thank you for the clarification. It sounds like we may need to do something via JavaScript, and perhaps a control, to make the validation process happen.

Thank you, Frank.

--G--

At 02:42 PM 9/15/99 -0400, ftang@netscape.com wrote:
>The name "accept-charset" itself is very misleading. The origin of the name
>"Accept-Charset" is the HTTP 1.1 protocol. In HTTP, Accept-Charset is sent
>by the client to the server to indicate which charsets the client can
>handle.
>
>Somehow "Accept-Charset" got put into HTML 4.0 with a funny statement:
>
> > accept-charset = charset list [CI]
> >     This attribute specifies the list of character encodings for input
> >     data that must be accepted by the server processing this form. The
> >     value is a space- and/or comma-delimited list of charset values.
> >     The server must interpret this list as an exclusive-or list, i.e., the
> >     server must be able to accept any single character encoding per
> >     entity received.
> >
> >     The default value for this attribute is the reserved string
> >     "UNKNOWN". User agents may interpret this value as the character
> >     encoding that was used to transmit the document containing this
> >     FORM element.
>
>The reason I say it is a "funny statement" is that, while the client can tell
>the server what charset it can accept, it is not reasonable for a form
>(which may sit on site A, B, or C) to tell the client what charset the CGI
>(which may be located on site D) can accept. Also, it should be the client
>software that interprets the list here; how can the server interpret that
>list? Since there is no mention of the client, a client could ignore this
>field and still implement the spec.
>
>The "accept" here really means what the SERVER can accept (in the case of an
>HTML form, not HTTP 1.1). Therefore, it does not mean the browser has to
>reject the user's input, since the client may accept that input while the
>server doesn't.
>
>George Spafford wrote:
>
> > With forms, there is the accept-charset and I am trying to understand its
> > functionality a bit more. I have a situation where users will need to
> > enter data into a database that can *only* handle ISO-8859-1
> > characters. In respect to accept-charset, if a Japanese user is viewing a
> > site in shift-jis and goes to enter data, what will the data entry and
> > submit behavior be if accept-charset is set to iso-8859-1? I must
> > apologize for asking this, but I'm in a crunch right now and don't have
> > time to do the simulation. I'm hoping I can leverage someone else's
> > experience.
> >
> > Can the shift-jis user enter data at all? What encoding will the browser
> > use assuming the rest of the page is in shift-jis and accept-charset is set
> > to iso-8859-1? The underlying picture is that I want users of other
> > languages to still be able to enter data into this database via an HTML
> > form provided they can read/write English while the rest of the page is
> > still in their native tongue. Ideally, the entry would not bomb when they
> > hit submit but instead either shift the browser to 8859-1 or alert the user
> > that we can't handle the input.
> >
> > I can't revise the database at this time and am trying to come up with some
> > workarounds. Longer term, we will change the database structures and all
> > the collateral scripts.
> >
> > Any thoughts?
>
>In any case, I think this is the wrong tool for your particular problem.
>If you care about form validation, use a JavaScript onChange handler to scan
>your data, and prompt the user if any text you don't want to see is there.
>For example, you should prompt the user if s/he types A-Za-z into a
>telephone field.
>
> > --G--

George Spafford
Director of Development
Lionbridge Technologies
950 Winter Street, Suite 2410
Waltham, MA 02451-1291
Telephone: 781-434-6111 (direct)
Operator: 781-895-9889 x6111
Facsimile: 781-890-3122
eFAX: 847-574-0658
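
For reference, the client-side check discussed above (an onChange handler that scans the field and prompts the user) might look roughly like the sketch below. It is only an illustration, not code from this thread. It assumes the database accepts exactly the code points U+0000 through U+00FF (the ISO-8859-1 range), and the names findNonLatin1, attachLatin1Check, and the "comment" field in the usage comment are hypothetical.

```javascript
// Hypothetical helper: returns the characters in `text` that fall outside
// ISO-8859-1, assumed here to mean code points U+0000 through U+00FF.
function findNonLatin1(text) {
  const offenders = [];
  for (const ch of text) {            // iterates by code point, not UTF-16 unit
    if (ch.codePointAt(0) > 0xff) {
      offenders.push(ch);
    }
  }
  return offenders;
}

// Hypothetical wiring: `field` is any object with a `value` property and an
// assignable `onchange` handler (for example a text input element), and
// `notify` is a callback that shows a message to the user.
function attachLatin1Check(field, notify) {
  field.onchange = function () {
    const offenders = findNonLatin1(field.value);
    if (offenders.length > 0) {
      notify("These characters cannot be stored: " + offenders.join(" "));
    }
    return offenders.length === 0;    // true when the value is safe to submit
  };
}

// Possible use in a page (assumed form and field names):
//   attachLatin1Check(document.forms[0].elements["comment"], alert);
```

A check like this only helps the user catch the problem early; the script receiving the form would still need to validate what it gets, since a client can ignore or disable the handler.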
Received on Thursday, 16 September 1999 08:17:19 UTC