Converting Chinese-encoded HTML directly to well-formed XML?

Hi all,

I'm new to HTML Tidy.


Following the advice of a friend, Tony Thompson, I got JTidy. Because JTidy
doesn't support the Simplified Chinese character encodings, I use a command
line like this:

java -jar Tidy.jar -raw -asxml -m mine.html

This seems to work for the most part, but there is still one problem: the
&nbsp; entity comes out as '?' (the hex code is #A030).

I spent a lot of time testing and thinking, and in the end I decided to
substitute " " for &nbsp; with jakarta-oro first, and then convert the HTML
to well-formed XML with JTidy.
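
The substitution step can be sketched roughly as below. This is only an
illustration, not the exact code I ran: it uses java.util.regex from the
standard library in place of jakarta-oro so that it is self-contained, and
the class name NbspReplacer and the method replaceNbsp are hypothetical.
It assumes the page has already been read into a String.

```java
import java.util.regex.Pattern;

// Hypothetical helper: replaces the &nbsp; entity with a plain space
// before the HTML is handed to JTidy.
class NbspReplacer {
    // Matches "&nbsp;" and also the form without the trailing semicolon.
    private static final Pattern NBSP = Pattern.compile("&nbsp;?");

    static String replaceNbsp(String html) {
        return NBSP.matcher(html).replaceAll(" ");
    }

    public static void main(String[] args) {
        // Small demonstration on an inline sample string.
        System.out.println(replaceNbsp("<p>one&nbsp;two</p>"));
    }
}
```

The cleaned text is then written back to the file and passed to the JTidy
command line shown above.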

That's great! It works successfully.

But, as a side note, can JTidy or Tidy directly convert HTML in a Chinese
character encoding to well-formed XML? :)

Thanks a lot.
Surfbird




Received on Sunday, 30 September 2001 11:33:14 UTC