> The application I am writing runs tidy and then programmatically extracts the
> hrefs from the resulting tidied document and spiders those hrefs. The
> spider was
> not replacing &amp; with & before sending the http request. I will do the
> replacement inside the spider library. I just assumed that urls within hrefs
> would be exactly the same after running jtidy.

The best course for such dilemmas is to run the HTML in question through a validator. You can use the W3C's validator or (my favorite) www.htmlhelp.com/tools/validator ... in either case, it will tell you that naked ampersands (i.e., not escaped as &amp;) are not OK, either in URLs or anywhere else.

A validator is analogous to a final spelling checker. It's good to run a document through a validator even if it's been "Tidied". Tidy is good, but the validator is the ultimate test.

/Jelks

Received on Tuesday, 23 May 2000 15:03:08 GMT
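
The replacement described in the quoted message (turning &amp; back into & before the spider sends its HTTP request) might look roughly like the sketch below. The class and method names (HrefDecoder, decodeHref) are hypothetical and are not part of JTidy. The sketch assumes the spider receives the raw attribute text exactly as it appears in the tidied markup, and it handles only four common named entities, not numeric character references.

```java
final class HrefDecoder {
    private HrefDecoder() {}

    // Hypothetical helper: decode a few named entities found in href values.
    // &amp; is replaced last so that "&amp;lt;" becomes the literal text "&lt;"
    // instead of being decoded a second time into "<".
    static String decodeHref(String rawHref) {
        return rawHref
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
    }

    public static void main(String[] args) {
        // Example URL is illustrative only.
        String raw = "http://example.com/search?q=tidy&amp;page=2";
        System.out.println(decodeHref(raw));
    }
}
```

If the hrefs are read through a DOM API instead of from the raw markup, the parser may already have decoded the entities, in which case this step would be unnecessary.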