- From: Tex Texin <tex@i18nguy.com>
- Date: Tue, 11 Nov 2003 17:56:01 -0500
- To: GEO <public-i18n-geo@w3.org>
from helios support ===================== Hello Tex, Here is a brief explanation of how TextPad currently deals with UTF-8. TextPad does a statistical analysis of files as it opens them, to check if they are UTF-8. Unless a file contains two or more characters of UTF-8 sequences, it must start with the Unicode signature, to be correctly recognised. If there is 32kb of text before any UTF-8 sequences, then the file will not be recognized as UTF-8. We will enhance a future version of TextPad to recognise the XML header "<?xml version="1.0" encoding="UTF-8"?>". Please also see the information on working with Unicode files, which is available from the Help menu as follows: >From the Help menu choose: 1. Help Topics 2. Contents 3. Plus sign next to "How to" 4. Plus sign next to "Work with Files" 5. Unicode Files ============================================== thanks, why don't you follow the encoding in the open dialog? Or allow some other override? It's very frustrating to specify utf-8 on the open and still get told the file needs to be converted to 1252. Also, do you support all of unicode or just a subset of the characters? For example I am working with hebrew and arabic data and it hasnt been working yet. I am on a mailing list with a group of internationalization developers and will be copying this and your next answer to the list. thanks tex ================================================ Hello Tex, On each of the Document class options Preference pages, there is an option to specify UTF-8 as the default encoding, and also to write the Unicode and UTF-8 BOM. In order to edit another language such as Hebrew or Arabic in TextPad, you will need to implement the following procedure to install the appropriate fonts: >From the Start menu choose: 1. Control panel 2. Regional and language options 3. Languages 4. Check the first option on "Supplemental language support" 5. Click Apply/OK It may be necessary to restart your computer. You should now be able to choose an appropriate font and script in TextPad. However, as TextPad odes not have full support for Unicode, please take note of the following extract, taken from the help files: "TextPad automatically detects 16-bit Unicode and UTF-8 encoded characters, when opening files. Unicode characters may be in "little endian" (Intel) or "big endian" (RISC) order, and the order is preserved when a file is saved. Internally, these files are converted to single or double byte characters (DBCS), using the locale corresponding to the font script selected for the document class. For example, if the screen font for the Text document class is MS Mincho, with the script set to Japanese, Unicode characters in *.TXT files will be converted to the corresponding DBCS characters in code page 932. WARNING: This means that it is only possible to edit, without data loss, files containing characters from the implied code page. Other characters will be converted into a system default character (normally "?"), if you confirm that is what you want to do." ====================================== thanks for your answers. I don't want to change my default encoding as not all my files are unicode. I want to be able to select according to the need... When will textpad be fully unicode internally? ====================================== We are currently working on TextPad 5, which will have full Unicode support. It's still fairly early days, so you will need to be patient. -- ------------------------------------------------------------- Tex Texin cell: +1 781 789 1898 mailto:Tex@XenCraft.com Xen Master http://www.i18nGuy.com XenCraft http://www.XenCraft.com Making e-Business Work Around the World -------------------------------------------------------------
Received on Tuesday, 11 November 2003 17:59:31 UTC