using JTidy with all character sets

Hi,
I'm using JTidy to parse web pages in any language and character set, but
I have run into problems. When I run it on http://www.number.ne.jp/ I get
warnings like:
line 177 column 167 - Warning: unescaped & or unknown entity "&#36628"
line 177 column 207 - Warning: unescaped & or unknown entity "&#34892"
line 177 column 223 - Warning: unescaped & or unknown entity "&#33258"
line 178 column 147 - Warning: unescaped & or unknown entity "&#36914"
line 178 column 163 - Warning: unescaped & or unknown entity "&#35542"
line 178 column 193 - Warning: unescaped & or unknown entity "&#34276"
line 178 column 209 - Warning: unescaped & or unknown entity "&#65295"
line 178 column 249 - Warning: unescaped & or unknown entity "&#65295"
line 178 column 281 - Warning: unescaped & or unknown entity "&#38742"

These references encode actual Japanese characters. How do I configure JTidy
to ignore all content except the HTML/XHTML tags?
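
To show what I mean, here is a minimal sketch in plain Java (no JTidy calls)
that decodes decimal references like the ones in the warnings into the
characters they stand for. The class name NcrDecoder and the sample string are
made up for illustration, and I am assuming the references may appear without
the trailing ';', as they do in the warning text.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NcrDecoder {
    // Decimal numeric character reference; the trailing ';' is optional here
    // (an assumption based on how the warnings print the references).
    private static final Pattern NCR = Pattern.compile("&#(\\d{1,7});?");

    // Replaces each decimal reference with the character it encodes.
    // Numbers outside the Unicode range are left in the text unchanged.
    public static String decode(String text) {
        Matcher m = NCR.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            int codePoint = Integer.parseInt(m.group(1));
            String replacement = Character.isValidCodePoint(codePoint)
                    ? new String(Character.toChars(codePoint))
                    : m.group();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Three of the references reported in the warnings above.
        String sample = "&#36628 &#34892 &#65295";
        System.out.println(decode(sample));
    }
}
```

Each number is a valid Unicode code point (36628 is U+8F14, for example), so
the references themselves look legitimate to me.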

David Hinshelwood
CRL NMSU
Tel: (505) 646 3342 (office)
       (505) 645 5537 (home)

Received on Tuesday, 18 July 2000 15:29:03 UTC