[w3 i18n geo] Q&A: Setting Encoding in Web Authoring Applications

Greetings all!

Below is the Q&A about setting encoding in various web authoring
applications. Your feedback is appreciated.

Thanks,

Phil Arko
Sr. Human Factors Engineer
Siemens Corporate Research
User Interface Design Center



==============================================
SETTING ENCODING IN WEB AUTHORING APPLICATIONS
==============================================


QUESTION

How do I set character encoding in my web authoring application?
[??? or: "Where is the feature hidden in my application?" ???]



BACKGROUND

Content on the web can be authored using a variety of software applications.
Even within a single site, the content may have been created using multiple
authoring tools. For example, a website that was created using Macromedia
Dreamweaver might also include a page created using Microsoft Access' data
access page feature, as well as a dynamic Flash movie that allows for
language selection. For all of these files to display text correctly, each
one needs to be properly encoded.

This article is not meant to be a tutorial on defining and using character
encoding within the web authoring applications, but rather to identify where
some of the key functionality exists. This is not a complete listing of
software, but rather a collection of some of the more popular web authoring
applications in use.

As software evolves, it is possible that the location of the functionality
may change. In addition, specific options of character encodings may vary
depending on the user's installation version and location, and so these are
not discussed in detail for each application. For more detailed information,
refer to the specific application's help content or user manuals. Common
keywords for searches include Character Encoding, Internationalization,
Multilingual, Unicode, and UTF.

There are two main points to remember when creating properly encoded files:

     1. The markup within the document must properly designate the encoding
(such as charset=iso-8859-1 in an XHTML/HTML meta tag, or encoding="UTF-8"
in an XML declaration).

     2. The file itself must be saved in the proper encoding (such as
UTF-8).

Most of these applications will save the file in the proper format, but may
not insert the corresponding markup in the document.
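
The two points above can be checked together. The following Python sketch is
only an illustration and is not part of any application listed in this
article: it builds a small HTML page whose meta tag declares an encoding,
saves the bytes in that same encoding, and reads the declaration back. The
helper names (build_page, declared_charset, save_page) and the sample text
are hypothetical.

```python
import re


def build_page(body_text, charset):
    """Return an HTML page whose meta tag declares `charset` (point 1)."""
    return (
        "<html><head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
        "</head><body>\n"
        f"<p>{body_text}</p>\n"
        "</body></html>\n"
    )


def declared_charset(raw_bytes):
    """Find the charset named in the markup; the declaration itself is ASCII."""
    match = re.search(rb'charset=["\']?([A-Za-z0-9_.:-]+)', raw_bytes)
    return match.group(1).decode("ascii").lower() if match else None


def save_page(path, body_text, charset):
    """Save the file in the same encoding the markup declares (point 2)."""
    data = build_page(body_text, charset).encode(charset)
    with open(path, "wb") as handle:
        handle.write(data)
    return data
```

If the declared encoding and the encoding used for saving differ, non-ASCII
characters in the page will generally be displayed incorrectly, which is why
the sketch derives both from the same charset argument.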

Another key element in the markup is the language indicator. Many of the
applications listed here combine the encoding and language in the
user-selectable options. If the language is not included by the application,
it is good practice to also include that in the markup manually. Some
applications may acquire the regional settings of your operating system to
create a locale tag.



ANSWER

[??? Adobe Acrobat ???]
[??? can't find anything specific yet ???]


[??? Adobe FrameMaker ???]
[??? can't find anything specific yet ???]


Adobe GoLive 5.0 (Mac)
[??? Newer version?, PC version the same? ???]

To specify the character encoding for your pages, go to Edit > Preferences >
Encodings category.


[??? Adobe Page Maker ???]
[??? can't find anything specific yet ???]


Apple TextEdit

You will need to enter the encoding declaration in the XHTML/HTML markup
yourself. Files are natively saved as UTF-8, so no further action is
necessary when saving.


Macromedia ColdFusion (Windows)

To properly configure a ColdFusion application, become familiar with the
various encoding-related commands and functions (a few of which include
"setEncoding," "cfcontent," and the form attribute "enctype").


Macromedia Dreamweaver MX (Mac & Windows)

To specify the character encoding for your pages, go to Modify > Page
Properties. Select the proper encoding from the "Document Encoding" dropdown
menu.

To specify the character encoding for viewing pages while editing, go to
Edit > Preferences > Fonts category (Dreamweaver > Preferences > Fonts
category on Mac).


Macromedia Flash MX (Mac & Windows)

When efficiently designed, multilingual Flash movies often store the text
for each language in separate include files (#include). This reduces the
download time of a Flash movie, because only the data for the selected
language is sent. UTF-8 text can be stored in an include file. The include
file should start with "//!-- UTF8" and must be saved in UTF-8 format.

Unicode characters can also be written as escape sequences in Flash's
ActionScript environment. For example, U+0000 would be written using the
escape sequence "\u0000" within the ActionScript code.
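
To illustrate the escape notation, the Python sketch below rewrites non-ASCII
characters as "\uXXXX" sequences of the kind just described. It is not
ActionScript and not part of Flash; the function name to_u_escapes is
hypothetical, and the sketch assumes characters in the range U+0000 to
U+FFFF, which fit in four hexadecimal digits.

```python
def to_u_escapes(text):
    """Rewrite every non-ASCII character as a \\uXXXX escape sequence."""
    parts = []
    for char in text:
        code_point = ord(char)
        if code_point > 0xFFFF:
            # Assumption of this sketch: only four-digit escapes are produced.
            raise ValueError("code point does not fit in four hex digits")
        if code_point < 0x80:
            parts.append(char)  # plain ASCII is left untouched
        else:
            parts.append("\\u{:04X}".format(code_point))
    return "".join(parts)
```

For instance, the character U+00E9 becomes the six-character sequence
backslash, "u", "0", "0", "E", "9".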

Another setting worth noting is the encoding setting for the end-user's
Flash Player. This defaults to false (System.useCodepage = false;), which
will use UTF-8. If this setting has been changed for some special purpose,
it must be set back to "false" before UTF-8 text is displayed again, by
placing the proper ActionScript in the timeline before any new text is
called.


Macromedia HomeSite+

You need to input the encoding information in the file. You can then go to
File > Save As and select the proper encoding using the Encoding dropdown
menu.

There is also an HTML Tidy feature that can check your code as you type. The
encoding options are located here: Options > Settings > CodeSweeper category
> HTML Tidy CodeSweeper subcategory > Macromedia HTML subcategory > Char
encoding dropdown menu.


Microsoft Office -- Access, Excel, PowerPoint, and Word (version 2000 for
Windows, version X for Mac OS X)
[??? NEED TO CHECK IF THIS IS THE SAME IN OFFICE XP ???]

Microsoft Word is often used to export documents directly to HTML.
Increasingly, spreadsheets and presentations (from Excel and PowerPoint,
respectively) are also being exported to web pages. Exporting database
content into web pages has become easier for the desktop user with the
addition of data access pages within Microsoft Access (Windows only).

Select "Tools > Options > General tab > Web Options button > Encoding tab."
Select the appropriate selection in the "Save document as" dropdown menu.

Note: In Access, first open the data access page in design view.


Microsoft FrontPage 2000 (Windows)

The encoding options are under "Language (character set)." Go to: Tools >
Page Options > Default Font tab. You will notice an option that says
"Multilingual (UTF-8)."


Microsoft Notepad (Windows)

If you create or edit documents using Notepad, you will need to specify the
character encoding and language when you write the markup code. When you
save the document, select "File > Save as" and select the proper encoding
from the Encoding dropdown list at the bottom. Be aware that there is a
known issue with this, which can be fixed with a Perl script. [??? CAN
ANYONE PROVIDE MORE INFO ABOUT THIS ???]
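
This article does not say what the known issue is. One behavior that is
often reported is that Notepad writes a byte order mark (the three bytes
EF BB BF) at the start of files saved as UTF-8. Assuming, and it is only an
assumption, that this is the issue meant here, the Python sketch below shows
how such a mark could be removed. The function name strip_utf8_bom is
hypothetical, and this is not the Perl script referred to above.

```python
import codecs


def strip_utf8_bom(raw_bytes):
    """Return the bytes without a leading UTF-8 byte order mark, if present."""
    if raw_bytes.startswith(codecs.BOM_UTF8):
        return raw_bytes[len(codecs.BOM_UTF8):]
    return raw_bytes  # no mark at the start, so nothing to remove
```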


Helios TextPad

The proper markup for encoding will need to be entered into the file. When
saving the document, the proper file format can be selected here: File >
Save As > Encoding dropdown menu.


W3C Amaya (Mac, Unix, Windows)

When saving the file, go to File > Save as. Amaya will make sure that the
encoding is correct in the XML declaration (for XHTML) and the <meta>
statement. Amaya also uses the appropriate encoding ('charset') in the HTTP
headers when it saves a document remotely using PUT. Amaya also understands
several other encodings when loading a document, but is not able to save in
any of these.
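
For readers unfamiliar with the 'charset' parameter in HTTP headers, the
Python sketch below reads it from a Content-Type header value using the
standard library's email.message module. It does not show how Amaya works
internally; the function name charset_from_content_type and the sample
header values are hypothetical.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the lower-cased charset parameter, or None if there is none."""
    message = Message()
    message["Content-Type"] = header_value
    return message.get_content_charset()
```

A value such as "text/html; charset=UTF-8" carries the encoding alongside
the media type, whereas a bare "text/html" leaves the encoding to be
determined from the document itself.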



BY THE WAY

Keep in mind that the end user can select both the encoding to use and the
font to use for each encoding [??? CAN THIS BE OVERWRITTEN BY CSS ???]. For
example, in Microsoft Internet Explorer, the current encoding can
be viewed (and revised) by going to the cascading menus under View >
Encoding. Note that "Right-To-Left Document" or "Left-To-Right Document"
will also appear when it has been set.

Another user-selectable option in Internet Explorer is "Always send URLs as
UTF-8." This can be found here: Tools > Internet Options > Advanced tab >
Browsing category.
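
To show what sending a URL as UTF-8 can look like, the Python sketch below
percent-encodes a path containing a non-ASCII character, once as UTF-8 and
once as ISO-8859-1. It assumes that the browser option refers to this kind
of percent-encoding; the function name percent_encode_path and the sample
path are hypothetical.

```python
from urllib.parse import quote


def percent_encode_path(path, encoding="utf-8"):
    """Percent-encode non-ASCII characters using the bytes of `encoding`."""
    return quote(path, safe="/", encoding=encoding)
```

With these definitions, the path "/café" becomes "/caf%C3%A9" in UTF-8 and
"/caf%E9" in ISO-8859-1, so the server must know which encoding was used to
decode the request correctly.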

When content is ready to be published, it is good practice to also validate
it using the W3C validation tool [http://validator.w3.org/].



LINKS

Hints & Tips: Character Encodings
http://www.w3.org/International/O-charset.html

Unicode Enabled Products
http://www.unicode.org/onlinedat/products.html

Encoding Forms
http://www.unicode.org/standard/principles.html#Encoding_Forms

Received on Friday, 24 October 2003 14:32:57 UTC