W3C home > Mailing lists > Public > html-tidy@w3.org > April to June 2000

Re: JTidy new line processing

From: Andy Quick <ac.quick@sympatico.ca>
Date: Sat, 3 Jun 2000 16:36:12 -0400
Message-ID: <003101bfcd9d$f9f05520$6dceacce@quick>
To: <html-tidy@w3.org>
I assume that you mean the character 0x0D (ie. '\r') when
you say "^M" because tidy processes "^M" like text.
The line end character for HTML/XML is 0x0A ('\n').  Tidy
strips out control characters (other than '\t' and '\n') from
the input stream.  There is no option to treat '\r' like white
space or line-end.

I suggest that you either preprocess your HTML, or put a
hack in StreamInImpl.readChar.

Andy Quick
----- Original Message ----- 
From: Bernice Maslan <Bernice.Maslan@activeindexing.com>
To: <html-tidy@w3.org>
Sent: May 22, 2000 5:14 PM
Subject: JTidy new line processing


> Hello,
> 
> I am running the Java version of HtmlTidy.  When the Html input looks
> like the one below , Tidy replaces the ^M with nothing, resulting in two
> separate words being combined (see Tidy output below also).  I do not
> know what product was used to create the offending Html.  I tried
> setting Word2000 and Clean to yes, but there was no change.  Is there
> anything I can configure to make Tidy substitute a space for the ^M?
> 
> Thanks,
> Bernice
Received on Saturday, 3 June 2000 16:56:52 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:43 GMT