
Re: input-xml

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 10 Feb 2006 21:27:03 +0100
To: Rasmus Lerdorf <rasmus@lerdorf.com>
Cc: html-tidy@w3.org
Message-ID: <elspu1tu1tss5ifkq2gmehfn50g901605i@hive.bjoern.hoehrmann.de>

* Rasmus Lerdorf wrote:
>I was hoping for something a bit prettier.  If I run it through the HTML 
>parser instead by not setting input-xml but otherwise use the same 
>options, I get:

Hi Rasmus, thanks for your comment. Your code looks good; I'm afraid
the problem is that Tidy's XML pretty printing capabilities are not
that advanced, and I don't think there are configuration options that
give you better results. The main problem is that Tidy does not know
which of your elements behave like <span>, like <div> or like <pre>,
and in XML mode there is currently no way to tell Tidy that. In some
cases Tidy might in fact remove or add too many spaces, so in general
I would recommend another tool for this.

There is a workaround that might work for you, though: you can use the
HTML input mode and declare all the elements you use in advance,
assuming they don't clash with HTML elements. With the command line tool,

  % tidy --indent yes --indent-spaces 4 --markup yes --wrap 4096 \
     --new-blocklevel-tags "top level2 level3" --show-body-only yes

  <top>Test 1<level2>Test 2<level3>Test 3</level3></level2></top>

gives

  <top>
      Test 1
      <level2>
          Test 2
          <level3>
              Test 3
          </level3>
      </level2>
  </top>

Or, using --new-inline-tags instead of --new-blocklevel-tags,

  <top>Test 1<level2>Test 2<level3>Test 3</level3></level2></top>

or, if you use --new-blocklevel-tags top --new-inline-tags "level2 level3",

  <top>
      Test 1<level2>Test 2<level3>Test 3</level3></level2>
  </top>

which might be closer to what you are looking for. See the quick
reference http://tidy.sf.net/docs/quickref.html#new-blocklevel-tags for
more information about these options. Also note that the results would
be better if your documents had less character data; for example,

  % tidy -xml -i
  <foo><bar/></foo>

would produce

  <foo>
    <bar />
  </foo>
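
To illustrate why element-only documents are the easy case, here is a
minimal sketch outside of Tidy, using Python's standard xml.dom.minidom.
The helper names has_mixed_content and indent_if_safe are made up for
this example. It re-indents a document only when no element mixes text
with child elements, since that is the case where added whitespace
cannot change the character data.

```python
from xml.dom import minidom


def has_mixed_content(element):
    """True if any element holds both child elements and non-blank text."""
    has_text = any(
        child.nodeType == child.TEXT_NODE and child.data.strip()
        for child in element.childNodes
    )
    children = [c for c in element.childNodes if c.nodeType == c.ELEMENT_NODE]
    if has_text and children:
        return True
    # Recurse: mixed content anywhere below makes re-indenting unsafe.
    return any(has_mixed_content(c) for c in children)


def indent_if_safe(xml_text, indent="  "):
    """Re-indent element-only XML; return mixed-content input unchanged."""
    root = minidom.parseString(xml_text).documentElement
    if has_mixed_content(root):
        return xml_text
    return root.toprettyxml(indent=indent)


if __name__ == "__main__":
    print(indent_if_safe("<foo><bar/></foo>"))
    print(indent_if_safe(
        "<top>Test 1<level2>Test 2<level3>Test 3</level3></level2></top>"
    ))
```

The first call indents the document much like the Tidy output above
(minidom writes <bar/> without the space), while the second is returned
untouched. Input that is already indented would pick up extra blank
lines, so this is only meant for unformatted documents.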

In any case, tools like http://www.kitebird.com/software/xmlformat/
would be more suitable here.
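
If you do stay with Tidy, the HTML-mode workaround described above could
be scripted roughly as follows. This is only a sketch: build_tidy_args
and run_tidy are hypothetical helper names, it assumes a tidy executable
on the PATH, and it uses only the options shown in the command line
earlier in this message plus --new-inline-tags.

```python
import shutil
import subprocess


def build_tidy_args(block_tags=(), inline_tags=()):
    """Return the tidy argument list for the HTML-mode workaround."""
    args = [
        "tidy",
        "--indent", "yes",
        "--indent-spaces", "4",
        "--markup", "yes",
        "--wrap", "4096",
        "--show-body-only", "yes",
    ]
    # Each tag list is passed as a single space-separated value, which is
    # what the shell quoting achieves on the command line.
    if block_tags:
        args += ["--new-blocklevel-tags", " ".join(block_tags)]
    if inline_tags:
        args += ["--new-inline-tags", " ".join(inline_tags)]
    return args


def run_tidy(markup, block_tags=(), inline_tags=()):
    """Pipe markup through tidy; return None when tidy is not installed."""
    if shutil.which("tidy") is None:
        return None
    result = subprocess.run(
        build_tidy_args(block_tags, inline_tags),
        input=markup, capture_output=True, text=True,
    )
    # tidy exits with status 1 on warnings, so the status is not checked.
    return result.stdout


if __name__ == "__main__":
    sample = "<top>Test 1<level2>Test 2<level3>Test 3</level3></level2></top>"
    print(run_tidy(sample, block_tags=["top"], inline_tags=["level2", "level3"]))
```

Passing the arguments as a list avoids the shell quoting issue with
multi-tag values such as "level2 level3".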
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Friday, 10 February 2006 20:26:09 GMT
