- From: Rzepa, Henry <h.rzepa@ic.ac.uk>
- Date: Wed, 14 Jun 2000 18:25:26 +0100
- To: html-tidy@w3.org, g.gkoutos@ic.ac.uk
> 
> >>I am running the Java version of HtmlTidy.  When the Html input looks
> >>like the one below , Tidy replaces the ^M with nothing, resulting in two
> >>separate words being combined (see Tidy output below also).  I do not
> >>know what product was used to create the offending Html. 
We appear to have a similar bug, though I suspect it has a different origin.
When JTidy is called through its main method, i.e.
java org.w3c.tidy.Tidy -asxml file.html
the output is fine.
When it is called through its parse method, i.e.
Tidy tidy = new Tidy();
tidy.setMakeClean(true);
// tidy.setXmlTags(true);
tidy.setXHTML(true);
tidy.parse(in, out);
the output has spaces deleted, e.g.
density was calculatedboth for anion A and B and the most
instead of 
density was calculated both for anion A and B and the most
The missing space is NOT at an end-of-line marker; it is in the middle of a line.
In a file of about 10,000 characters, perhaps 50-100 spaces are deleted.
We think only spaces are lost, and not other characters.
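To put a firmer number on the lost spaces, one could compare the whitespace-separated words of the input text against those of the Tidy output. What follows is only a minimal sketch and is not part of JTidy: the class name FusedWordFinder and the method findFusedWords are hypothetical, and it assumes both strings are plain text with the markup already stripped and that dropped whitespace is the only difference between them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of JTidy): reports output words that are
// two or more consecutive input words joined with no space between them.
class FusedWordFinder {

    static List<String> findFusedWords(String original, String processed) {
        String[] in = original.trim().split("\\s+");
        String[] out = processed.trim().split("\\s+");
        List<String> fused = new ArrayList<>();

        int i = 0; // index of the next unmatched input word
        for (String token : out) {
            if (i >= in.length) {
                break;
            }
            if (token.equals(in[i])) {
                i++; // word survived unchanged
                continue;
            }
            // Try to rebuild the output word by joining consecutive input words.
            StringBuilder joined = new StringBuilder(in[i]);
            int j = i + 1;
            while (j < in.length && joined.length() < token.length()) {
                joined.append(in[j]);
                j++;
            }
            if (joined.toString().equals(token)) {
                fused.add(token); // a space was lost inside this word
                i = j;
            } else {
                break; // texts differ for some other reason; stop comparing
            }
        }
        return fused;
    }
}
```

Calling findFusedWords with the two example lines above as arguments should return a list containing the single entry "calculatedboth"; the size of the list for a whole file gives the number of affected words.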
If tidy.setXmlTags(true); is uncommented,
the output contains extra line breaks
but no spaces are deleted.
I might add that other versions of Tidy process the same file with no 
problem.
Since "calculatedboth" has a different meaning from "calculated both",
we consider this a serious problem.
Any comments on the origin of the problem?
Received on Wednesday, 14 June 2000 13:25:30 UTC