How to extract HREF tags

Hi JTidy Gurus,

I am using the JTidy API to validate HTML content. I am planning to
build a link checker utility with JTidy that scans a given document for
HREF links and checks whether each link is broken.

I am looking for a way to extract the anchor (HREF) tags. I should be
able to pass the URL of a file and have the code return the list of
HREF attribute values. If possible, it should also check whether each of
those links is broken.
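
To make the request concrete, here is a rough sketch of the shape I have
in mind. It is not working JTidy code: the class and method names
(LinkSketch, extractHrefs, extractHrefsFromXhtml, isBroken) are
placeholders of my own, and the JDK's XML parser stands in for JTidy so
that the sketch runs on its own. I am assuming that the Document returned
by JTidy's Tidy.parseDOM(InputStream, OutputStream) could be passed to
extractHrefs in the same way, which is the part I would like confirmed.

```java
import java.io.ByteArrayInputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical sketch; none of these names come from JTidy itself.
class LinkSketch {

    // Collect the href value of every <a> element in a W3C DOM document.
    // Anchors without an href (e.g. <a name="top">) are skipped.
    static List<String> extractHrefs(Document doc) {
        List<String> hrefs = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            String href = anchor.getAttribute("href");
            if (href != null && !href.isEmpty()) {
                hrefs.add(href);
            }
        }
        return hrefs;
    }

    // Stand-in for the JTidy parse step: handles well-formed XHTML only,
    // using the JDK's XML parser. Real-world HTML would need JTidy here.
    static List<String> extractHrefsFromXhtml(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xhtml.getBytes(StandardCharsets.UTF_8)));
            return extractHrefs(doc);
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XHTML", e);
        }
    }

    // Treat a link as broken if it is not an absolute HTTP(S) URL, the
    // request fails, or the server answers with a status of 400 or above.
    // Relative links would first have to be resolved against the page URL.
    static boolean isBroken(String link) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL(link).openConnection();
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(5000);
            conn.setReadTimeout(5000);
            int status = conn.getResponseCode();
            conn.disconnect();
            return status >= 400;
        } catch (Exception e) {
            return true;
        }
    }
}
```

The idea would be to call extractHrefs on the parsed page and then
isBroken on each value. Some servers reject HEAD requests, so a real
checker would probably need to fall back to GET.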

I know that plenty of link checker tools are already available, but I
want to build my own so that I can customize it to the requirements of
my users.

I would appreciate it if someone could help me with an example.

With many thanks,


Received on Thursday, 26 June 2003 07:19:48 UTC