
Re: removing dom nodes

From: Geoff McLane <ubuntu@geoffair.info>
Date: Wed, 15 Jul 2015 13:48:13 +0200
Message-ID: <55A6487D.9040604@geoffair.info>
To: html-tidy@w3.org
I see 'Tidy!' as two products.

There is libTidy, a library of services exposed
through an API, tidy.h...

Then there is the console app 'tidy', and as Marvin correctly points
out "Removing acceptable DOM nodes is not the task of html-tidy.".
That is this console app tidy... it is only a 'cleaner'...

But Folkert found that libTidy already has internal "remove
node", "remove attribute", ... services, and simply exposed that
capability as an API extension of the library...

This in no way changes what 'tidy' (the console app) does, and
does well...

But it extends the use of libTidy for those who link it into their
own app... and more extensions would be welcome from anyone
using libTidy... what do you want it to do?

libTidy is a fast HTML parser, and builds a tree of nodes... that
document tree can be enumerated node by node, and now
you can, if you want, discard an element or an attribute of an
element from that tree... before, say, writing it to a file/buffer...

Simple... and done...


On 15/07/15 12:11, folkert wrote:
> It is in the github version.
> You're responding to a mail from before the submit to them.
> On Wed, Jul 15, 2015 at 11:54:12AM +0200, Marvin Reimer wrote:
>> I'm sure this is not possible. Look at the source code tree. It has been a
>> really long time since the last changes went into the old libtidy:
>> http://tidy.cvs.sourceforge.net/viewvc/tidy/tidy/src/
>> You should open a github issue on tidy-html5 and ask them.
>> Marvin
>> 2015-07-15 11:49 GMT+02:00 folkert <folkert@vanheusden.com>:
>>>> So why are you asking for that feature when you fixed it yourself??
>>> I would like it to be in the main distribution.
>>> That's because the product I'm developing (an open source product)
>>> would be using the adapted libtidy.
>>>> Source link?
>>>> html tidy on sourceforge is the original one (HTML4). It is no longer
>>>> maintained as far as I know.
>>>> https://github.com/htacg/tidy-html5 is the new actual one, forked from
>>>> w3c and now htacg.
>>> Right. That's the one with my change.
>>>> I'm not involved in either one, just a heavy user.
>>> Folkert van Heusden
>>> --
>>> MultiTail is a flexible tool for following logfiles and the
>>> output of commands. Filtering, coloring, merging,
>>> 'diff-view', etc. http://www.vanheusden.com/multitail/
>>> ----------------------------------------------------------------------
>>> Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
> Folkert van Heusden
Received on Wednesday, 15 July 2015 11:48:44 UTC
