How to extract HREF tags

From: Srinivas Kasam <Srinivas.Kasam@Oracle.com>
Date: Thu, 26 Jun 2003 16:44:47 +0530
Message-ID: <3EFAD5A7.48341BBC@Oracle.com>
To: html-tidy@w3.org

Hi JTidy Gurus,

I am using the JTidy API to validate HTML content. I am planning to build
a link checker utility with JTidy that scans a given document for HREF
links and checks whether each link is broken.

I am looking for a way to extract the HREF values of the ANCHOR tags. I
should be able to pass the URL of a file and get back the list of HREF
values. If possible, the code should also check whether each of those
links is broken.
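
A minimal sketch of the kind of extraction described above, not a
definitive implementation. It assumes the page has already been parsed
into an org.w3c.dom.Document, which JTidy's Tidy.parseDOM(InputStream,
OutputStream) returns. The class name HrefExtractor and its methods are
hypothetical, and the JDK's own XML parser stands in for JTidy so that the
sketch runs without the JTidy jar. The isBroken check is an assumption
about what "broken" means: any HTTP status of 400 or above, or any
connection failure, on an absolute http(s) URL.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

// Hypothetical helper class; none of these names come from JTidy itself.
final class HrefExtractor {

    /** Returns the href value of every <a> element, in document order. */
    static List<String> extractHrefs(Document doc) {
        List<String> hrefs = new ArrayList<>();
        collect(doc.getDocumentElement(), hrefs);
        return hrefs;
    }

    // Plain recursive walk using only core DOM Node methods.
    private static void collect(Node node, List<String> out) {
        if (node == null) {
            return;
        }
        if (node.getNodeType() == Node.ELEMENT_NODE
                && "a".equalsIgnoreCase(node.getNodeName())) {
            NamedNodeMap attrs = node.getAttributes();
            Node href = (attrs == null) ? null : attrs.getNamedItem("href");
            if (href != null && !href.getNodeValue().isEmpty()) {
                out.add(href.getNodeValue());
            }
        }
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            collect(c, out);
        }
    }

    /**
     * Assumed definition of "broken": the HEAD request fails or returns a
     * status of 400 or above. Intended for absolute http(s) URLs only;
     * anything else (mailto:, relative paths) is reported as broken here.
     */
    static boolean isBroken(String absoluteUrl) {
        try {
            HttpURLConnection conn = (HttpURLConnection)
                    URI.create(absoluteUrl).toURL().openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            conn.disconnect();
            return status >= 400;
        } catch (IOException | RuntimeException e) {
            return true;
        }
    }

    /**
     * Stand-in for JTidy: parses well-formed markup with the JDK parser.
     * With JTidy on the classpath the Document would instead come from
     *     new org.w3c.tidy.Tidy().parseDOM(inputStream, null)
     */
    static Document parseWellFormed(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xhtml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample markup", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parseWellFormed(
                "<html><body><a href=\"http://example.com/\">home</a>"
                + "<p><a href=\"page.html\">page</a></p></body></html>");
        // Prints each extracted href; no network access is made here.
        for (String href : extractHrefs(doc)) {
            System.out.println(href);
        }
    }
}
```

Relative values such as page.html would need to be resolved against the
document's own URL (for example with java.net.URI.resolve) before being
passed to isBroken.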

I know that there are plenty of "Link Checker" tools available, but I
want to build my own and tailor it to the requirements of the users.

I would appreciate it if someone could help me with an example.

With many thanks,

Srini.

Received on Thursday, 26 June 2003 07:19:48 GMT
