Re: FW: Why is space removed?

From: Charles Reitzel <creitzel@rcn.com>
Date: Wed, 22 Jan 2003 15:20:21 -0500
Message-Id: <4.3.2.7.2.20030122151401.02e87e60@pop.rcn.com>
To: Michael Goldberg <MGoldberg@yet2.com>
Cc: "'html-tidy@w3.org'" <html-tidy@w3.org>

This is an OK place for JTidy questions.  One thing to keep in mind: even 
when parsing generic XML, Tidy (including, AFAIK, JTidy) will still 
recognize HTML elements and treat them accordingly, so the <b> in your 
sample is presumably handled as an HTML inline element rather than as an 
opaque XML tag.  There have been some improvements in the whitespace 
handling for inline elements in C Tidy.  JTidy development has been 
inactive for some time.

I believe they are waiting for an "official" release from us (C Tidy), but 
we don't do those.  Besides the parsing internals, the biggest difference 
between the two is in option handling.  JTidy is still more advanced in 
terms of I/O abstraction.
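
If the input is already well-formed XML and no HTML cleanup is needed, one way to sidestep the inline-element whitespace handling described above is to skip Tidy for that input and use the JAXP parser that ships with the JDK. The following is only a sketch under that assumption, not a JTidy fix; the class name XmlRoundTrip and the method roundTrip are made up for illustration.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper: uses the JDK's JAXP parser, not JTidy.
public class XmlRoundTrip {

    // Parse well-formed XML into a DOM and serialize the root element back.
    static String roundTrip(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        // An identity transform copies the DOM to the output unchanged.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        transformer.transform(
                new DOMSource(document.getDocumentElement()), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // A generic XML parser has no notion of <b> being an HTML inline
        // element, so the text node " C" keeps its leading space.
        System.out.println(roundTrip("<temp>A <b>B</b> C</temp>"));
    }
}
```

This only helps when the input really is XML; malformed HTML would make the JAXP parser throw instead of being repaired.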

take it easy,
Charlie


At 08:31 AM 1/22/2003 -0800, Michael Goldberg wrote:

>Reposting, as I still have this issue, and would love to see a response.  Is
>there a separate forum to discuss JTidy?  I apologize in advance if this is
>not the correct place to discuss JTidy matters.
>
>Sincerely,
>Michael S. Goldberg
>
>
>All,
>
>I am using JTidy to parse an input string into an Element.  Later, when I
>serialize the Element back into a String, I lose a space.  Is there an
>option I can specify so as not to lose the space?
>
>In the sample code provided below, I lose the space just prior to the letter
>"C".  I'm pretty sure the problem is introduced in the first part (the
>parsing) rather than the latter part (the serializing).
>
>Interestingly, if I change my input to "<html><title/><body>A <b>B</b>
>C</body></html>" and use setXmlTags( false ), the space is not lost.  Why?
>
>Sincerely,
>Michael S. Goldberg
>
>Sample Code:
>
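>// Imports assumed: JTidy plus the Xerces serializer classes
>import java.io.ByteArrayInputStream;
>import java.io.ByteArrayOutputStream;
>import java.io.OutputStreamWriter;
>import org.apache.xml.serialize.Method;
>import org.apache.xml.serialize.OutputFormat;
>import org.apache.xml.serialize.SerializerFactory;
>import org.apache.xml.serialize.XMLSerializer;
>import org.w3c.dom.Document;
>import org.w3c.dom.Element;
>import org.w3c.tidy.Tidy;
>
>Tidy tidyInstance = new Tidy();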
>tidyInstance.setXmlTags( true );
>
>String inString = "<temp>A <b>B</b> C</temp>";
>ByteArrayInputStream in = new ByteArrayInputStream( inString.getBytes() );
>
>// Use JTidy to parse the input into a Document
>Document document = tidyInstance.parseDOM( in, null );
>Element documentElement = document.getDocumentElement();
>
>// Set up an output format for the makeSerializer() call below
>OutputFormat outputFormat = new OutputFormat();
>outputFormat.setOmitXMLDeclaration( true );
>
>// Set up a byteArrayOutputStream to receive the makeSerializer() output
>ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
>
>// Get a SerializerFactory for the makeSerializer() call below
>SerializerFactory xmlSerializerFactory =
>SerializerFactory.getSerializerFactory( Method.XML );
>
>// Use serializer to convert the element into a String
>XMLSerializer xmlSerializer = (XMLSerializer)
>xmlSerializerFactory.makeSerializer( new OutputStreamWriter(
>byteArrayOutputStream ), outputFormat );
>xmlSerializer.serialize( documentElement );
>
>System.out.println( byteArrayOutputStream );
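
To check whether the space is lost during parsing rather than serializing, as suspected in the message above, it can help to list the text nodes of the parsed DOM before any serializer touches it. The sketch below is an assumption-laden illustration: the class TextNodeDump and the method textNodes are hypothetical names, and the demo builds its DOM with the JDK parser so that it runs without JTidy. The helper accepts any org.w3c.dom.Node, so the Element returned by parseDOM() could be passed in instead, assuming JTidy's DOM nodes behave like standard ones here.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical diagnostic helper, not part of JTidy or Xerces.
public class TextNodeDump {

    // Return every text node under root, in document order, wrapped in
    // brackets so that leading and trailing spaces are visible.
    static List<String> textNodes(Node root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(Node node, List<String> out) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.add("[" + node.getNodeValue() + "]");
        }
        for (Node child = node.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            collect(child, out);
        }
    }

    public static void main(String[] args) throws Exception {
        // Demo DOM from the JDK parser; swap in the JTidy document element
        // to inspect what parseDOM() actually produced.
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader("<temp>A <b>B</b> C</temp>")));
        System.out.println(textNodes(document.getDocumentElement()));
    }
}
```

If the last entry printed for the JTidy document has no leading space, the space was dropped by the parser; if the space is still there, the serializer is the place to look.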
Received on Wednesday, 22 January 2003 15:13:11 GMT
