Re: JTidy new line processing

I assume that by "^M" you mean the carriage return character
0x0D (i.e. '\r'), because Tidy treats a literal "^M" as ordinary text.
The line-end character for HTML/XML is 0x0A ('\n').  Tidy
strips control characters (other than '\t' and '\n') from
the input stream, so a bare '\r' is dropped and the words on
either side of it run together.  There is no option to treat
'\r' as white space or as a line end.

I suggest that you either preprocess your HTML (for example,
replacing each '\r' with '\n' before Tidy sees it), or put a
hack in StreamInImpl.readChar.
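
As a rough illustration of the preprocessing route, the sketch below
wraps the input in a stream filter that turns each CR or CR LF into a
single LF before the parser reads it.  The class name CrToLfInputStream
and its normalize helper are made up for this example and are not part
of JTidy.  It works on bytes, so it assumes an ASCII-compatible
encoding such as ISO-8859-1 or UTF-8, and it maps '\r' to '\n' rather
than to a space on the assumption that Tidy treats '\n' as white space.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of JTidy): rewrites CR and CR LF to LF. */
final class CrToLfInputStream extends FilterInputStream {

    CrToLfInputStream(InputStream in) {
        // One byte of pushback is enough to look at the byte after a CR.
        super(new PushbackInputStream(in, 1));
    }

    @Override
    public int read() throws IOException {
        int c = in.read();
        if (c != '\r') {
            return c; // ordinary byte or end of stream (-1)
        }
        // Saw a CR: swallow a directly following LF so CR LF becomes one LF.
        PushbackInputStream pb = (PushbackInputStream) in;
        int next = pb.read();
        if (next != '\n' && next != -1) {
            pb.unread(next);
        }
        return '\n';
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        // Route bulk reads through read() so every byte is filtered.
        if (len == 0) {
            return 0;
        }
        int n = 0;
        while (n < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            buf[off + n++] = (byte) c;
        }
        return n == 0 ? -1 : n;
    }

    /** Convenience wrapper for strings; assumes ISO-8859-1 content. */
    static String normalize(String html) {
        byte[] bytes = html.getBytes(StandardCharsets.ISO_8859_1);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new CrToLfInputStream(new ByteArrayInputStream(bytes))) {
            for (int c = in.read(); c != -1; c = in.read()) {
                out.write(c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

The wrapped stream would then be handed to the parser in place of the
raw file or socket stream.  A filter like this keeps the change outside
the Tidy sources, whereas the readChar hack would need to be reapplied
after each JTidy update.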

Andy Quick
----- Original Message ----- 
From: Bernice Maslan <Bernice.Maslan@activeindexing.com>
To: <html-tidy@w3.org>
Sent: May 22, 2000 5:14 PM
Subject: JTidy new line processing


> Hello,
> 
> I am running the Java version of HtmlTidy.  When the Html input looks
> like the one below, Tidy replaces the ^M with nothing, resulting in two
> separate words being combined (see Tidy output below also).  I do not
> know what product was used to create the offending Html.  I tried
> setting Word2000 and Clean to yes, but there was no change.  Is there
> anything I can configure to make Tidy substitute a space for the ^M?
> 
> Thanks,
> Bernice

Received on Saturday, 3 June 2000 16:56:52 UTC