Re: Encoding designation in non-HTML sites

From: Addison Phillips <AddisonP@flashcom.net>
Date: Thu, 13 Apr 2000 12:04:05 +0900
To: www-international@w3.org
Oh heck! I am subscribed. That's how I obtained the message... but the mailer
I use when I'm out of the office is a funky Java application, and bad things
sometimes happen... Thanks for forwarding my response.

Yesterday was not a good day for me in e-mail land. There are a few minor
errors in the message that you forwarded.

XML's native encoding is Unicode, which means *either* UTF-16 or UTF-8. UTF-16
files require a byte order mark (BOM, U+FEFF) to distinguish between UTF-16LE
and UTF-16BE (or you can tag them...)
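
To make the rule above concrete, here is a minimal Python sketch of how a
reader might pick the encoding from the first bytes of an XML file. The
function name sniff_xml_encoding is hypothetical, and the sketch assumes only
the cases discussed here: a UTF-16 BOM, an encoding label in the XML
declaration, and otherwise the UTF-8 default. It ignores UTF-32 and any HTTP
header, so treat it as an illustration rather than a complete detector.

```python
import codecs
import re

# Hypothetical helper: matches an encoding label inside an XML declaration,
# e.g. <?xml version="1.0" encoding="Big5"?>
_DECLARATION = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def sniff_xml_encoding(data: bytes) -> str:
    """Guess the encoding of an XML document from its leading bytes."""
    # A UTF-16 file starts with U+FEFF; the byte order reveals LE vs. BE.
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    # Skip an optional UTF-8 signature before looking for the declaration.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    # No UTF-16 BOM: use the label in the XML declaration, if there is one.
    match = _DECLARATION.match(data)
    if match:
        return match.group(1).decode("ascii").lower()
    # Neither a BOM nor a label: fall back to the UTF-8 default.
    return "utf-8"
```

Calling sniff_xml_encoding on the bytes of a file that begins with
<?xml version="1.0" encoding="Big5"?> returns the label from the declaration,
while an untagged file without a BOM is treated as UTF-8.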

As I said somewhere else, the encoding an XML parser uses internally is
usually UCS-4 or UTF-16, depending on the parser.



-----Original Message-----
From:    Martin J. Duerst <duerst@w3.org>
Sent:    Wed, 12 Apr 2000 12:34:08 +0900
To:      addison@globalsight.com, stopping@rochester.rr.com
CC:      www-international@w3.org
Subject: Re: Encoding designation in non-HTML sites

Forwarded by the list moderator.
[Addison, your message went to Suzanne, but not to the list
in the first place, because you were not subscribed to the list.]

Regards,   Martin.

At 00/04/11 12:36 -0400, addison@globalsight.com wrote:

 >Hi Suzanne,
 >The script encoding is declared in the same place in a page that contains
 >JavaScript as in a "normal" HTML page.
 >In the end, a "JavaScript page" is still HTML and can have a META tag
 >just like normal HTML. Most such pages, however, either indicate the
 >encoding in the HTTP header (which is a much better place for it) or don't
 >bother to indicate the encoding at all (which is bad, but not a surprise).
 >You can still put a META tag in your file, but this produces less reliable
 >results in pages that are JavaScript or Java heavy.
 >XML files are, by default, Unicode encoded (UTF-8, I believe), unless
 >tagged otherwise. An HTTP header may still be used to indicate the
 >character encoding of the file, but the XML parser itself is looking for:
 ><?xml version="1.0" encoding="Big5"?>
 >Addison P. Phillips
 >Senior Globalization Consultant
 >Global Sight Corporation
 >101 Metro Drive, Suite 750
 >San Jose, California 95110 USA
 >(+1) 408.350.3649 - Phone
 >Going global with your web site? Global Sight provides Web-based
 >software solutions that simplify the process, cut costs, and save time.
 >Sent by: www-international-request@w3.org
 >04/11/2000 10:45 AM AST
 >To: "www" <www-international@w3.org>
 >Subject: Encoding designation in non-HTML sites
 >Can someone tell me where the encoding method is indicated in
 >JavaScript-based web sites? I was just looking through the source of a few
 >sites, and couldn't find any charset designations.
 >How about XML sites?
 >Suzanne Topping
 >Localization Unlimited
 >(Globalization Process Improvement Consulting and Training)
 >In association with BizWonk (TM)
 >Phone: 716-473-0791
 >Fax: 716-231-2013
 >Email: stopping@rochester.rr.com
 >(Send me an email to join the North East Localization Special Interest
 >Group, an email distribution list which acts as a discussion forum for
 >localization issues.)

Received on Wednesday, 12 April 2000 23:01:43 UTC