RE: [w3 i18n geo] Q&A: Setting Encoding in Web Authoring Applica tion s

Hi Phil,

I have a comment on Helios TextPad. If relevant to this FAQ, I would inform
the reader about the fact that the Unicode support is minimal and restricted
to Latin-1 Supplement characters in TextPad.

/François


> -----Original Message-----
> From: Arko, Phil [mailto:phil.arko@scr.siemens.com] 
> Sent: Friday, October 24, 2003 8:33 PM
> To: 'public-i18n-geo@w3.org'
> Subject: [w3 i18n geo] Q&A: Setting Encoding in Web Authoring 
> Application s
> 
> 
> 
> Greetings all!
> 
> Below is the Q&A about setting encoding in various web 
> authoring applications. Your feedback is appreciated.
> 
> Thanks,
> 
> Phil Arko
> Sr. Human Factors Engineer
> Siemens Corporate Research
> User Interface Design Center
> 
> 
> 
> ==============================================
> SETTING ENCODING IN WEB AUTHORING APPLICATIONS 
> ==============================================
> 
> 
> QUESTION
> 
> How do I set character encoding in my web authoring 
> application? [??? or: "Where is the feature hidden in my 
> application?" ???]
> 
> 
> 
> BACKGROUND
> 
> Content on the web can be authored using a variety of 
> software applications. Even within a single site, the content 
> may have been created using multiple authoring tools. For 
> example, a website that was created using Macromedia 
> Dreamweaver might also include a page created using Microsoft 
> Access' data access page feature, as well as a dynamic Flash 
> movie that allows for language selection. In order for all of 
> these files to properly serve the correct text, they need to 
> be properly encoded.
> 
> This article is not meant to be a tutorial on defining and 
> using character encoding within the web authoring 
> applications, but rather to identify where some of the key 
> functionality exists. This is not a complete listing of 
> software, but rather a collection of some of the more popular 
> web authoring applications in use
> 
> As software evolves, it is possible that the location of the 
> functionality may change. In addition, specific options of 
> character encodings may vary depending on the user's 
> installation version and location, and so these are not 
> discussed in detail for each application. For more detailed 
> information, refer to the specific application's help content 
> or user manuals. Common keywords for searches include 
> Character Encoding, Internationalization, Multilingual, 
> Unicode, and UTF.
> 
> There are two main points to remember when creating properly 
> encoded files:
> 
>      1. the markup within the document must properly 
> designate the encoding (such as charset=iso-8859-1 in an 
> XHTML/HTML meta tag, or encoding="UTF-8" in an XML 
> declaration statement).
> 
>      2. the file, itself, must be saved in the proper 
> encoding format (such as UTF-8).
> 
> Most of these applications will save the file in the proper 
> format, but may not input the proper markup within the document.
> 
> Another key element in the markup is the language indicator. 
> Many of the applications listed here combine the encoding and 
> language in the user-selectable options. If the language is 
> not included by the application, it is good practice to also 
> include that in the markup manually. Some applications may 
> acquire the regional settings of your operating system to 
> create a locale tag.
> 
> 
> 
> ANSWER
> 
> [??? Adobe Acrobat ???]
> [??? can't find anything specific yet ???]
> 
> 
> [??? Adobe FrameMaker ???]
> [??? can't find anything specific yet ???]
> 
> 
> Adobe GoLive 5.0 (Mac)
> [??? Newer version?, PC version the same? ???]
> 
> To specify the character encoding for your pages, go to Edit 
> > Preferences > Encodings category.
> 
> 
> [??? Adobe Page Maker ???]
> [??? can't find anything specific yet ???]
> 
> 
> Apple TextEdit
> 
> You will need to input the proper encoding into the 
> XHTML/HTML file. Files are natively saved as UTF-8, so no 
> further action is necessary.
> 
> 
> Macromedia ColdFusion (Windows)
> 
> To properly configure a ColdFusion application, become 
> familiar with the various encoding-related commands and 
> functions (a few of which include "setEncoding," "cfcontent," 
> and the form attribute "enctype").
> 
> 
> Macromedia Dreamweaver MX (Mac & Windows)
> 
> To specify the character encoding for your pages, go to 
> Modify > Page Properties. Select the proper encoding from the 
> "Document Encoding" dropdown menu.
> 
> To specify the character encoding for viewing pages while 
> editing, go to Edit > Preferences > Fonts category 
> (Dreamweaver > Preferences > Fonts category on Mac).
> 
> 
> Macromedia Flash MX (Mac & Windows)
> 
> When efficiently designed, multilingual Flash movies often 
> store the text for each language in separate include files 
> (#include), reducing the time needed to download a flash 
> movie by only sending the selected language data. UTF-8 text 
> can be stored in an include file. The include file should 
> start with "//!-- UTF8" and must be saved in UTF-8 format.
> 
> UTF-8 character notation can also be specified in Flash's 
> ActionScript environment. U+0000 would be written using the 
> escape sequence "\u0000" within the ActionScript code.
> 
> Another setting worth noting is the encoding setting for the 
> end-user's Flash Player. This is defaulted to false 
> (system.useCodepage = false;), which will use UTF-8. There 
> are times when this may have been changed for some special 
> purpose, but must be changed back to "false" before 
> displaying UTF-8 text again by placing the proper 
> ActionScript in the timeline before calling any new text.
> 
> 
> Macromedia HomeSite+
> 
> You need to input the encoding information in the file. You 
> can then go to File > Save As and select the proper encoding 
> using the Encoding dropdown menu.
> 
> There is also an HTML Tidy feature that can check your code 
> as you type. The encoding options are located here: Options > 
> Settings > CodeSweeper category
> > HTML Tidy CodeSweeper subcategory > Macromedia HTML 
> subcategory > Char
> encoding dropdown menu.
> 
> 
> Microsoft Office -- Access, Excel, PowerPoint, and Word 
> (version 2000 for Windows, version X for Mac OS X) [??? NEED 
> TO CHECK IF THIS IS THE SAME IN OFFICE XP ???]
> 
> Microsoft Word is often used to export documents directly to 
> HTML. Increasingly, spreadsheets and presentations (from 
> Excel and PowerPoint,
> respectively) are also being exported to web pages. Exporting 
> database content into web pages has become easier for the 
> desktop user with the addition of data access pages within 
> Microsoft Access (Windows only).
> 
> Select "Tools > Options > General tab > Web Options button > 
> Encoding tab." Select the appropriate selection in the "Save 
> document as" dropdown menu.
> 
> Note: In Access, first open the data access page in design view.
> 
> 
> Microsoft Frontpage 2000 (Windows)
> 
> The encoding options are under "Language (character set)." Go 
> to: Tools > Page Options > Default Font tab. You will notice 
> an option that says "Multilingual (UTF-8)."
> 
> 
> Microsoft Notepad (Windows)
> 
> If you create or edit documents using Notepad, you will need 
> to specify the character encoding and language when you write 
> the markup code. When you save the document, select "File > 
> Save as" and select the proper encoding from the Encoding 
> dropdown list at the bottom. Be aware that there is a known 
> issue with this, which can be fixed with a Pearl script. [??? 
> CAN ANYONE PROVIDE MORE INFO ABOUT THIS ???]
> 
> 
> Helios TextPad
> 
> The proper markup for encoding will need to be entered into 
> the file. When saving the document, the proper file format 
> can be selected here: File > Save As > Encoding dropdown menu.
> 
> 
> W3C Amaya (Mac, Unix, Windows)
> 
> When saving the file, go to File > Save as. Amaya will make 
> sure that the encoding is correct in the xml declaration (for 
> XHTML) and the <meta> statement. Amaya also uses the 
> appropriate encoding ('charset') in the HTTP headers when it 
> saves a document remotely using PUT. Amaya also understands 
> several other encodings when loading a document, but is not 
> able so save in any of these.
> 
> 
> 
> BY THE WAY
> 
> Keep in mind that the end user can select both the encoding 
> to use, as well as the font to use for each encoding [??? CAN 
> THIS BE OVERWRITTEN BY CSS ???]. For example in Microsoft 
> Internet Explorer, the current encoding can be viewed (and 
> revised) by going to the cascading menus under View > 
> Encoding. Note that "Right-To-Left Document" or 
> "Left-To-Right Document" will also appear when it has been set.
> 
> Another option that is selectable by the user for Internet 
> Explorer users is the option to "Always send URLs as UTF-8." 
> This can be found here: Tools > Internet Options > Advanced 
> tab > Browsing category.
> 
> When content is ready to be published, it is good practice to 
> also validate your content using the W3 validation tool 
[http://validator.w3.org/ ].



LINKS

Hints & Tips: Character Encodings
http://www.w3.org/International/O-charset.html

Unicode Enabled Products http://www.unicode.org/onlinedat/products.html

Encoding Forms
http://www.unicode.org/standard/principles.html#Encoding_Forms

Received on Monday, 3 November 2003 09:17:55 UTC