
Converting Chinese-encoded HTML directly to well-formed XML?

From: Surfbird Fang <surfbird_fang@hotmail.com>
Date: Sun, 30 Sep 2001 15:32:42 +0000
To: html-tidy@w3.org
Message-ID: <F1341sceBhVgFwK0kNj0000ccc7@hotmail.com>
Hi all,

I am new to HTML Tidy.

On the advice of a friend, Tony Thompson, I got JTidy. Because JTidy does not
support the Simplified Chinese character encodings, I use a command line like
this:

java -jar Tidy.jar -raw -asxml -m mine.html

Although this seems to work for everything else, there is still one problem:
the &nbsp; entity comes out as '?' (the hex code is #A030).

After a lot of testing and thinking, I finally decided to substitute " " for
&nbsp; with jakarta-oro, and then convert the HTML to well-formed XML with
JTidy.
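
The substitution step described above could be sketched roughly as follows.
This is only an illustration, not the code actually used: it swaps in
java.util.regex (Java 1.4 and later) in place of jakarta-oro, and the class
name NbspPreprocessor and the method replaceNbsp are hypothetical. It assumes
the HTML has already been read into a String.

```java
import java.util.regex.Pattern;

// Hypothetical helper: replaces the &nbsp entity with a plain space
// before the HTML is handed to JTidy.
public class NbspPreprocessor {

    // Matches "&nbsp;" and the semicolon-less "&nbsp" form, but not a
    // longer name that merely starts with "&nbsp" (e.g. "&nbspx").
    private static final Pattern NBSP =
            Pattern.compile("&nbsp(?:;|(?![A-Za-z0-9]))");

    // Returns a copy of the input with every &nbsp entity replaced
    // by an ordinary ASCII space.
    public static String replaceNbsp(String html) {
        return NBSP.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Small demonstration on an inline fragment.
        System.out.println(replaceNbsp("<td>a&nbsp;b</td>"));
    }
}
```

The result would then be written back to the file and passed to the JTidy
command line shown earlier. Note that a plain space can wrap or collapse
where &nbsp; would not, so this trades exact layout for clean output.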

That's great! It works.

But, as a side note, can JTidy or Tidy convert Chinese-encoded HTML directly
to well-formed XML? :)

Thanks a lot.
Surfbird



Received on Sunday, 30 September 2001 11:33:14 GMT
