Chinese characters in BBEdit (again)

From: Chris von Rosenvinge <chris@vingdesign.com>
Date: Wed, 10 Aug 2005 13:52:43 -0400
Message-Id: <p06230921bf1fb467e6c2@[172.16.1.33]>
To: html-tidy@w3.org

Colleagues,
The discussion about Microsoft Word reminds me 
that BBEdit is in many ways a superior text 
editor, at least on the Mac.

Unfortunately, BBEdit doesn't play well with Tidy 
either. BBEdit has a built-in Tidy feature under 
the Markup menu, but I don't see how to use a 
configuration file with it. The BBTidy plug-in 
(1.0b10-01 Dec 02, W3C 1998-2002, Terry Teague 
1998-2004) in BBEdit 8.2.2, on the other hand, 
has the following two problems:

1. The BBTidy plug-in has a bug that renders the 
tidied file in Chinese characters. As a 
workaround, I save the document in UTF-16 encoding. 
At that point I can reopen the document and it 
reads normally in BBEdit, but the Chinese 
characters show up in a browser. I then zap 
non-ASCII characters and resave the file in 
Latin-1 encoding. It finally reads normally both in 
BBEdit and in a browser. This works as long as 
there are no accented characters, which brings me 
to the second problem.

2. In the config file that I can choose with the BBTidy plug-in, I use

ascii-chars: no
numeric-entities: yes

This leaves numeric references such as #160 
(non-breaking space), #8226 (bullet), #8211 
(en dash), and #8220 (curly open quote) untouched. 
It even knows to convert ndash to #8211. However, 
it turns eacute and #233 into a literal e with an 
acute accent, which reads OK as a local file opened 
in a browser but displays incorrectly when served 
from a web server. The same happens with other 
accented characters, such as U umlaut.
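
Both symptoms look like encoding mismatches, though that is only a 
guess and nothing here confirms what BBTidy actually does. The Python 
sketch below is an illustration under those assumptions: it shows how 
plain ASCII bytes misread as UTF-16 pair up into CJK code points, how 
a Latin-1 accented byte breaks when a reader assumes UTF-8, and how 
escaping every non-ASCII character as a numeric reference sidesteps 
the question. The helper names looks_cjk and to_numeric_entities are 
made up for this example and are not part of Tidy or BBEdit.

```python
import html


def looks_cjk(text: str) -> bool:
    """True if every character lies in U+3400..U+9FFF (mostly CJK ideographs)."""
    return all(0x3400 <= ord(ch) <= 0x9FFF for ch in text)


def to_numeric_entities(text: str) -> str:
    """Replace every non-ASCII character with a decimal numeric reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


# Assumed cause of problem 1: ASCII bytes read back as big-endian UTF-16.
# Each pair of ASCII bytes becomes one 16-bit code unit, e.g. "<h" -> U+3C68.
ascii_bytes = "<html>".encode("ascii")
misread = ascii_bytes.decode("utf-16-be")

# Assumed cause of problem 2: a literal Latin-1 e-acute (byte 0xE9) is not
# valid UTF-8, so a reader expecting UTF-8 cannot decode it.
latin1_bytes = "café".encode("latin-1")
broken = latin1_bytes.decode("utf-8", errors="replace")

# Escaping keeps the file pure ASCII, so the declared charset stops mattering.
safe = to_numeric_entities(html.unescape("caf&eacute; &#252;ber"))

if __name__ == "__main__":
    print(len(misread), looks_cjk(misread))
    print(repr(broken))
    print(safe)
```

If this guess is right, the accented characters would display correctly 
once the file's bytes and the charset announced by the server agree, or 
once the output contains only ASCII plus numeric references.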

Does anyone know how to avoid these problems?

Thanks!
Received on Wednesday, 10 August 2005 17:52:57 GMT
