The application I am writing runs tidy, then programmatically extracts the hrefs from the tidied document and spiders them. The spider was not replacing &amp; with & before sending the HTTP request. I will do the replacement inside the spider library. I had just assumed that URLs within hrefs would be exactly the same after running JTidy.

-jason

"Fred Bone" <fred.bone@dial.pipex.com>
05/23/2000 01:56 PM

To: Jason Horman/Lycos@Lycos
cc: html-tidy@w3.org
Subject: Re: bug

On 23 May 2000, at 13:48, jhorman@lycos-inc.com wrote:

> No, the proper URL querystring should be & delimited not &amp; delimited.
> QueryString parsers such as asp would interpret the url as having a query param
> called "amp;BV_EngineID", thereby breaking the cgi apps which these urls point
> to. You're right in that it is the proper xml encoding but not html.

The querystring sent by a browser will have had the "&amp;" interpreted into "&".

Received on Tuesday, 23 May 2000 14:26:51 GMT
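For anyone hitting the same problem, the replacement described above might look roughly like the following. This is only a minimal sketch: the class name HrefDecoder and the method decodeAmpersands are hypothetical and are not part of JTidy or of the spider library mentioned in the message. It assumes the href value is taken as raw attribute text from the tidied markup, and it handles only the ampersand entity, not other character references.

```java
// Hypothetical helper, not part of JTidy: turns the HTML-escaped ampersand
// in an extracted href back into a literal "&" before the URL is requested.
final class HrefDecoder {
    private HrefDecoder() {}

    static String decodeAmpersands(String href) {
        // Handle the named entity and its decimal numeric form.
        // String.replace substitutes every literal occurrence.
        return href.replace("&amp;", "&").replace("&#38;", "&");
    }
}
```

Calling HrefDecoder.decodeAmpersands on "http://example.com/page?a=1&amp;BV_EngineID=abc" (a made-up URL) should give back a query string delimited by plain "&". The decoding should be applied once only, since running it again on an already decoded URL could alter a query value that legitimately contains the text "&amp;".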
This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 7 December 2009 10:37:48 GMT