- From: Andy Quick <ac.quick@sympatico.ca>
- Date: Sat, 3 Jun 2000 16:36:12 -0400
- To: <html-tidy@w3.org>
I assume that you mean the character 0x0D (i.e. '\r') when you say "^M", because Tidy processes a literal "^M" as text. The line-end character for HTML/XML is 0x0A ('\n'). Tidy strips control characters (other than '\t' and '\n') from the input stream. There is no option to treat '\r' as white space or as a line end.

I suggest that you either preprocess your HTML or put a hack in StreamInImpl.readChar.

Andy Quick

----- Original Message -----
From: Bernice Maslan <Bernice.Maslan@activeindexing.com>
To: <html-tidy@w3.org>
Sent: May 22, 2000 5:14 PM
Subject: JTidy new line processing

> Hello,
>
> I am running the Java version of HtmlTidy. When the Html input looks
> like the one below, Tidy replaces the ^M with nothing, resulting in two
> separate words being combined (see Tidy output below also). I do not
> know what product was used to create the offending Html. I tried
> setting Word2000 and Clean to yes, but there was no change. Is there
> anything I can configure to make Tidy substitute a space for the ^M?
>
> Thanks,
> Bernice
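
For anyone who finds this thread later, the preprocessing route suggested in the reply could look roughly like the sketch below. It assumes the whole document fits in memory as a String and uses only the JDK. The class and method names (CrPreprocessor, normalizeLineEnds) are made up for illustration and are not part of JTidy. Handing the cleaned text on to Tidy is left out.

```java
public class CrPreprocessor {

    /**
     * Hypothetical helper: turns every "\r\n" pair and every lone '\r'
     * into a single '\n', so that no words are joined when the parser
     * later discards '\r' as a control character.
     */
    static String normalizeLineEnds(String html) {
        StringBuilder out = new StringBuilder(html.length());
        for (int i = 0; i < html.length(); i++) {
            char c = html.charAt(i);
            if (c == '\r') {
                // Skip the '\n' of a "\r\n" pair so only one line end is emitted.
                if (i + 1 < html.length() && html.charAt(i + 1) == '\n') {
                    i++;
                }
                out.append('\n');
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The two words are separated only by a carriage return (0x0D).
        String input = "<p>first\rsecond</p>";
        System.out.println(normalizeLineEnds(input));
    }
}
```

Mapping '\r' to '\n' rather than to a space keeps the original line structure. Either choice prevents the two words from running together, since both are treated as white space in HTML text.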
Received on Saturday, 3 June 2000 16:56:52 UTC