RE: removing &nbsp

Erik Tan wrote:
> 
> I am trying to convert an HTML document to XML. Initially, my
> document conversion was fine, until I used the XT parser to parse the
> resulting XML output. I obtained the following error:
> 
>     xml:154: reference to undefined entity "nbsp"
> 
> 
>    I have the following settings in the jtidy-04aug2000r6/src/org/w3c/tidy/
> 
> # sample config file for Java HTML tidy
> indent=auto
> indent-spaces=3
> wrap=72
> markup=yes
> clean=yes
> output-xml=no
> input-xml=no
> show-warnings=yes
> numeric-entities=yes
> quote-marks=yes
> quote-nbsp=no
> quote-ampersand=yes
> break-before-br=no
> uppercase-tags=yes
> uppercase-attributes=yes
> smart-indent=yes
> output-xhtml=yes
> char-encoding=latin1
> 
> I want to know whether there is a way, with HTML Tidy or otherwise, to
> prevent the &nbsp from being printed to the output file...

Using your config file, I get a character with hex value A0, which is
correct for latin-1 encoding.  I don't know why you're getting "&nbsp;".
However, if I delete the "quote-nbsp=no" line from your config file, I get
"&#160;" instead, which may be what you want (we use Tidy to output
XHTML this way and it has been working just fine).
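
For reference, a sketch of the two lines that matter after that change,
with the rest of your file left as posted. Writing quote-nbsp=yes
explicitly is an assumption that it matches Tidy's default; the change
described above was simply deleting the line.

    # write entities as numeric character references
    numeric-entities=yes
    # assumed equivalent to omitting the line (default)
    quote-nbsp=yes

With these settings the non-breaking space should come out as the
numeric reference &#160;, which an XML parser accepts without a DTD,
whereas the named entity nbsp is not predefined in XML.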

Received on Wednesday, 16 May 2001 13:52:26 UTC