Tidy changes all &nbsp; into ? in my document. :-(

From: Franklen Choi <franklen@pacific.net.hk>
Date: Sat, 19 Jan 2002 02:27:26 -0500 (EST)
Message-ID: <3C49205F.9EAD08E1@pacific.net.hk>
To: html-tidy@w3.org
Dear all

I have a problem with Tidy. Each time I use Tidy to clean up the
code of my document, save the document and open it again, I find
lots of "?" characters. I had a hard time locating the
problem, only to find that Tidy converts every '&nbsp;'
into '?'. If a '&nbsp;' is adjacent to a tag, Tidy also eats the
opening '<' of that tag, so I find lots of ?br> or ?p>.

I have a lot of web pages that need to be cleaned. These pages were
created with an old version of Netscape Composer, which added
many '&nbsp;' entities to them. Because of the above problem,
I must clear up all the "?" characters after Tidy cleans the code. This is
rather inconvenient.

The following is my configuration file for Tidy:

tidy-mark: yes
markup: yes
wrap: 72
tab-size: 8
indent: auto
indent-spaces: 2
output-xhtml: no
doctype: loose
char-encoding: raw
clean: no
logical-emphasis: yes
keep-time: yes
quote-nbsp: yes

I use the raw option because my documents contain Asian characters. I
have tried changing the quote-nbsp option to 'no', but in vain. I
use the win32 version of Tidy, which is supposed to support Asian
characters (although this command-line program is itself unsupported
now; its version date is Sept. 2001). Can anyone suggest how I should
proceed? Thank you very much for any reply.
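
One possible stopgap, sketched here in Python, is to hide every '&nbsp;' behind an ASCII-only placeholder before running Tidy and to put the entity back afterwards. This is not a Tidy feature. The placeholder string and the two helper names are hypothetical, and the sketch assumes that Tidy passes the placeholder through as ordinary text and that the placeholder does not already occur in the documents.

```python
# Hypothetical placeholder, not part of Tidy. Every byte is below 0x40,
# so it cannot be mistaken for the trail byte of a double-byte Big5
# character (Big5 trail bytes lie in 0x40-0x7E and 0xA1-0xFE).
PLACEHOLDER = b"%%160%%"
ENTITY = b"&nbsp;"


def protect_nbsp(raw: bytes) -> bytes:
    """Replace each &nbsp; entity with the placeholder before Tidy runs."""
    if PLACEHOLDER in raw:
        # Refuse to continue rather than corrupt text that already
        # happens to contain the placeholder.
        raise ValueError("placeholder already present in input")
    return raw.replace(ENTITY, PLACEHOLDER)


def restore_nbsp(raw: bytes) -> bytes:
    """Put the &nbsp; entities back after Tidy has written its output."""
    return raw.replace(PLACEHOLDER, ENTITY)
```

The helpers work on bytes rather than decoded text, so they never have to interpret the Big5 content. In use, protect_nbsp would be applied to the file contents before Tidy is run, and restore_nbsp to the file Tidy produces.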

best
Franklen CKS
Hong Kong

P.S. It is very hard to find a good HTML editor that is integrated with
Tidy and supports Asian characters well :-( . I can hardly find one
running on either GNU/Linux or Windoze. I really hope that I can enjoy
the full power of Tidy, Amaya and even the W3C validation service
(which keeps telling me that 'your page contains bytes which I cannot
interpret as big5', even though this page renders without problems in
all the browsers I have tested, including lynx).
Received on Wednesday, 23 January 2002 01:39:27 GMT
