- From: Ivo Pletikosic <ivo@benetech.org>
- Date: Tue, 20 May 2003 10:47:29 -0700
- To: "'Bjoern Hoehrmann'" <derhoermi@gmx.net>
- Cc: html-tidy@w3.org
Thanks for the clarification/suggestion. Sounds like the current build of Tidy classic has enough for what I need. It's a useful tool for plenty things that I will look into the source and see how I can patch the behavior like you suggest as well. Thank you and regards, Ivo -----Original Message----- From: Bjoern Hoehrmann [mailto:derhoermi@gmx.net] Sent: Monday, May 19, 2003 5:57 PM To: Ivo Pletikosic Cc: html-tidy@w3.org Subject: Re: Feature or support request? prevent tidy from stripping the s ingl e whitespace that follows an endtag * Ivo Pletikosic wrote: >is output as: >-- ><?xml version="1.0" encoding="iso-8859-1" standalone="yes"?> ><dtbook3> > <book> > <bodymatter>hello, > <em>how</em>are you > <strong>doing</strong>today.</bodymatter> > </book> ></dtbook3> >-- Tidy must not remove the whitespace between "how" and "are you" and between "doing" and "today." TidyClassic got it right... In the unbroken version (or Tidy 04 August 2000) you would get <dtbook> <book> <bodymatter>hello, <em>how</em> are you <strong>doing</strong> today.</bodymatter> </book> </dtbook> Not nice but doesn't do harm to most documents. Is that what you are asking for? If not, how should the output look like? IMHO, the most reasonable behaivour for Tidy would be to consider elements having text node siblings or ancestors with text nodes siblings to be inline elements and all other elements blocklevel elements. The output would then look similar to <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?> <dtbook> <book> <bodymatter> hello, <em>how</em> are you <strong>doing</strong> today. </bodymatter> </book> </dtbook> or ... <bodymatter>hello, <em>how</em> are you <strong>doing</strong> today.</bodymatter> ... depending on whether Tidy wraps after/before <bodymatter>. For more sophisticated solutions Tidy would need to know what elements constitute blocks and what elements are to be considered inline elements. >Is there a way to configure tidy to treat the XML tags as if it were HTML? >or is this spacing behavior limited to files identified and markedup as >HTML? Depends on what spacing behaivour you ask for. It would also be possible to introduce a config option that identifies certain elements as those from the XHTML namespace. However, yes, the XML pretty printer is pretty simplistic and as noted above obviously broken. It would be possible to improve it, but I don't think I will do so. Patches welcome :-) In general, if you already have well-formed XML there is a number of better tools for you out there.
Received on Tuesday, 20 May 2003 13:47:32 UTC