- From: Reza Ferrydiansyah <ref11+@pitt.edu>
- Date: Thu, 20 Nov 2003 19:07:57 -0500
- To: html-tidy@w3.org
I have had some problems with JTidy's (and, I believe, Tidy's in general) handling of forms inside tables. For example, the HTML snippet

  Form testing
  <table>
  <form action='pageA'>
  <tr><td><form>Input 1 Form 1: <input name='v'/></td><td>Input 2 Form 1: <input name='w'/></form><form action='pageB'>Input 1 Form 2: <input name='x'></td></tr>
  <tr><td>Input 2 Form 2: <input name='y'></td></tr></table>Input 3 Form 2: <input name='z'></form>

will be turned into

  Form testing
  <form action='pageA'></form>
  <table>
  <tr>
  <td>
  <form>Input 1 Form 1: <input name='v' /></form>
  </td>
  <td>Input 2 Form 1: <input name='w' />
  <form action='pageB'>Input 1 Form 2: <input name='x' /></form>
  </td>
  </tr>
  <tr>
  <td>Input 2 Form 2: <input name='y' /></td>
  </tr>
  </table>
  Input 3 Form 2: <input name='z' />
  Hello World

Notice that inputs w, y and z are now inputs without a form.

I have tried using another HTML cleaner, TagSoup (unfortunately, it still produces a lot of errors), which returns:

  <form enctype = "application/x-www-form-urlencoded" method = "get" action = "pageA">
  <table><tbody><tr><td colspan = "1" rowspan = "1">
  <form enctype = "application/x-www-form-urlencoded" method = "get">Input 1 Form 1: <input type = "text" name = "v"></input></form></td>
  <td colspan = "1" rowspan = "1">Input 2 Form 1: <input type = "text" name = "w"></input></td></tr></tbody>
  </table>
  </form>
  <form enctype = "application/x-www-form-urlencoded" method = "get" action = "pageB">
  Input 1 Form 2: <input type = "text" name = "x"></input>
  <table><tbody><tr><td colspan = "1" rowspan = "1">Input 2 Form 2: <input type = "text" name = "y"></input></td></tr></tbody></table>
  Input 3 Form 2: <input type = "text" name = "z"></input>
  </form>

Here all inputs are inside a form, although input v is inside a form that is itself nested inside another form.

The way I see it, Tidy usually deletes erroneous tags. Therefore

  <table> <form><tr><td>...</form> ...

will be changed to

  <form> <table><tr><td>...</form>...

because a form cannot be inside a table.
It is then corrected to

  <form> </form> <table><tr><td>...

In TagSoup, by contrast,

  <table> <form><tr><td>...</form> ...

is changed into

  <table> </table> <form><tr><td>...</form> ...

and then into

  <table> </table> <form><table><tr><td>...</form> ...

In most cases I think the TagSoup version is much preferable to the Tidy version. Does anybody know how to do this with JTidy?

On a related note, one of the tasks I am currently working on is cleaning up HTML from popular websites. Yahoo and Lycos both have trouble with their forms, in the shape of <form> tags located directly under <table> tags (<table><form><tr><td>...), so I am currently having a lot of trouble with this.

-- Reza Ferrydiansyah
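
A rough way to check the "inputs without a form" observation mechanically is to walk a cleaned-up document and record every <input> that has no open <form> around it. The following is only a minimal sketch in Python using the standard library's html.parser; it does not use JTidy or TagSoup, and the names OrphanInputFinder and orphan_inputs are made up for illustration. It tracks open forms with a simple counter, so it assumes the <form> tags in the cleaned output are balanced.

```python
from html.parser import HTMLParser


class OrphanInputFinder(HTMLParser):
    """Collect names of <input> elements that have no open <form> ancestor.

    Hypothetical helper for illustration; not part of JTidy or TagSoup.
    """

    def __init__(self):
        super().__init__()
        self.open_forms = 0   # number of <form> elements currently open
        self.orphans = []     # names of inputs seen while no form is open

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <input ... /> also arrive here, because
        # HTMLParser's default handle_startendtag calls handle_starttag.
        if tag == "form":
            self.open_forms += 1
        elif tag == "input" and self.open_forms == 0:
            self.orphans.append(dict(attrs).get("name"))

    def handle_endtag(self, tag):
        if tag == "form" and self.open_forms > 0:
            self.open_forms -= 1


def orphan_inputs(html):
    """Return the names of all inputs in `html` that sit outside any form."""
    finder = OrphanInputFinder()
    finder.feed(html)
    finder.close()
    return finder.orphans


# The Tidy output quoted in this message, joined into one string.
TIDY_OUTPUT = (
    "Form testing <form action='pageA'></form> <table> <tr> <td> "
    "<form>Input 1 Form 1: <input name='v' /></form> </td> "
    "<td>Input 2 Form 1: <input name='w' /> "
    "<form action='pageB'>Input 1 Form 2: <input name='x' /></form> </td> </tr> "
    "<tr> <td>Input 2 Form 2: <input name='y' /></td> </tr> </table> "
    "Input 3 Form 2: <input name='z' />"
)

print(orphan_inputs(TIDY_OUTPUT))  # prints ['w', 'y', 'z']
```

Run against the Tidy output quoted above, the sketch reports w, y and z, which are the inputs described as having lost their form. The same function could be pointed at the TagSoup output to confirm that no input there is left outside a form.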
Received on Thursday, 20 November 2003 19:12:46 UTC