Re: Using htmltidy to parse: getting the "body" of a node

From: John Coggeshall <john@coggeshall.org>
Date: 02 Oct 2003 13:26:18 -0400
To: jany.quintard@free.fr
Cc: joe user <palehaole@yahoo.com>, html-tidy@w3.org
Message-Id: <1065115578.21063.15.camel@coogle.localdomain>

> > I am trying to use Tidy to do its magic on (possibly
> > broken) html files, for input to other layers of
> > processing in C.  I need to get access to the body of
> > stuff.
> > 
> > For example, in this block:
> > 
> > <p>This is some text.</p>

If I recall correctly, this can be done with the tidyNodeGetText()
function. For examples, check out the PHP bindings for libtidy:

http://cvs.php.net/co.php/pecl/tidy/tidy.c?login=2&r=1.15

John
-- 
-~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~-
John Coggeshall                             http://www.coggeshall.org/
john at coggeshall dot org                 

The PHP Developer's Handbook
The definitive PHP5 developer's guide     http://www.php-handbook.com/
-~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~--~=~-
Received on Thursday, 2 October 2003 13:26:08 UTC
