perl, HTML::Tidy, clean ...

From: Andrej <andrej.groups@gmail.com>
Date: Mon, 11 Jun 2012 12:16:14 +1200
Message-ID: <CACMx3pNnRRemsBjEY6KSVsoWhKdp2+pinSGzqOoPX-n6fADH7g@mail.gmail.com>
To: html-tidy@w3.org
Hi,

I'm trying to clean up Apple wiki markup.

I'm using the following options with HTML::Tidy:
       tidy_mark => 0,
       output_encoding => 'utf8',
       input_encoding => 'utf8',
       drop_proprietary_attributes => 1,
       output_xhtml => 1,
       clean => 1,
       hide_endtags => 1,

For some wiki entries I get this error:
HTML parser error : Unexpected end tag : p
Lorem ipsum.</li> </ul></p>

Imagine the output from the XML::RPC call looks like this:
<div>
<p class="MsoNormal">
<ul>
<li>Lorem.</li>
<li>ipsume.</li>
</ul>
</p>
</div>
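
For reference, a minimal sketch of how options like these are usually handed to HTML::Tidy, assuming the documented form in which the constructor takes a hash reference. The variable names and the inline sample are illustrative only and are not the actual script; the real input would come from the XML::RPC call.

```perl
use strict;
use warnings;
use HTML::Tidy;

# The options listed above, passed to the constructor as a hash reference.
my $tidy = HTML::Tidy->new({
    tidy_mark                   => 0,
    output_encoding             => 'utf8',
    input_encoding              => 'utf8',
    drop_proprietary_attributes => 1,
    output_xhtml                => 1,
    clean                       => 1,
    hide_endtags                => 1,
});

# Inline stand-in for the wiki markup (assumption: copied from the sample above).
my $html = <<'HTML';
<div>
<p class="MsoNormal">
<ul>
<li>Lorem.</li>
<li>ipsume.</li>
</ul>
</p>
</div>
HTML

# clean() returns the tidied document as a string.
my $cleaned = $tidy->clean($html);

# Anything tidy complained about is collected as message objects.
for my $message ($tidy->messages) {
    warn $message->as_string, "\n";
}

print $cleaned;
```

Printing the collected messages next to the cleaned output makes it easier to see whether the complaint about the stray </p> is raised at this stage or by a later step that parses the result.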


What option do I need to set to make tidy carry on processing (stripping the
erroneous <p> tags) rather than stopping with an error?


Cheers,
Andrej
Received on Monday, 11 June 2012 00:16:42 GMT