- From: Larry Yu <csthyu@comp.polyu.edu.hk>
- Date: Thu, 6 Jul 2000 03:45:31 -0400 (EDT)
- To: html-tidy@w3.org
- Message-ID: <3964387A.B5B6818@comp.polyu.edu.hk>
Hello,

I have saved the www.yahoo.com index page and hope to use JTidy to convert it to XML. However, I get the following output, and the conversion stops partway through. I currently call tidy.setMakeClean(true); and tidy.setXmlOut(true); in my program. How should I set the options in my program?

Thanks in advance for your ideas.

Larry

Error:
========
Tidy (vers 30th April 2000) Parsing "InputStream"
line 1 column 295 - Warning: <area> lacks "alt" attribute
line 1 column 330 - Warning: <area> lacks "alt" attribute
line 1 column 367 - Warning: <area> lacks "alt" attribute
line 1 column 405 - Warning: <area> lacks "alt" attribute
line 1 column 443 - Warning: <area> lacks "alt" attribute
line 1 column 481 - Warning: <area> lacks "alt" attribute
line 1 column 637 - Warning: <table> lacks "summary" attribute
line 3 column 1 - Warning: <img> lacks "alt" attribute
line 4 column 84 - Warning: <table> lacks "summary" attribute
line 40 column 58 - Warning: <table> lacks "summary" attribute
line 40 column 134 - Warning: <table> lacks "summary" attribute
line 40 column 213 - Warning: <table> lacks "summary" attribute
line 40 column 306 - Warning: <table> lacks "summary" attribute
line 59 column 4 - Warning: <table> lacks "summary" attribute
line 59 column 4 - Warning: trimming empty <p>
line 59 column 126 - Warning: unescaped & which should be written as &amp;
line 60 column 89 - Warning: unescaped & which should be written as &amp;
line 63 column 83 - Warning: unescaped & which should be written as &amp;
line 76 column 115 - Warning: unescaped & which should be written as &amp;
line 78 column 82 - Warning: unescaped & which should be written as &amp;
line 89 column 86 - Warning: unescaped & which should be written as &amp;
line 92 column 38 - Warning: <table> lacks "summary" attribute
line 92 column 91 - Warning: <table> lacks "summary" attribute
line 92 column 203 - Warning: <table> lacks "summary" attribute
line 92 column 373 - Warning: <table> lacks "summary" attribute
line 92 column 1,482 - Warning: <table> lacks "summary" attribute
line 92 column 2,490 - Warning: unescaped & or unknown entity "&acc"
line 92 column 2,674 - Warning: <table> lacks "summary" attribute
line 92 column 3,479 - Warning: <table> lacks "summary" attribute
line 92 column 3,702 - Warning: unescaped & or unknown entity "&cf"
line 92 column 3,710 - Warning: unescaped & or unknown entity "&id"
line 92 column 3,795 - Warning: unescaped & or unknown entity "&cf"
line 92 column 3,803 - Warning: unescaped & or unknown entity "&id"
line 92 column 3,886 - Warning: unescaped & or unknown entity "&cf"
line 92 column 3,894 - Warning: unescaped & or unknown entity "&id"
line 92 column 3,921 - Warning: unescaped & which should be written as &amp;
line 92 column 4,435 - Warning: <table> lacks "summary" attribute
line 100 column 17 - Warning: unescaped & which should be written as &amp;
line 102 column 24 - Warning: unescaped & which should be written as &amp;
line 172 column 226 - Warning: <table> lacks "summary" attribute
line 172 column 358 - Warning: <img> lacks "alt" attribute
line 177 column 36 - Warning: missing </small> before <p>
line 177 column 39 - Warning: inserting implicit <small>
"InputStream" appears to be HTML 3.2
43 warnings/errors were found!
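For reference, the "unescaped &" warnings in the log point at bare ampersands in text and in href query strings, which HTML expects to be written as the entity `&amp;`. The following is a minimal sketch of that escaping in plain Java. It is not part of Tidy or JTidy; the class name `AmpersandEscape` and the method `escapeBareAmpersands` are hypothetical names chosen for illustration, and the pattern only approximates what counts as an entity reference.

```java
import java.util.regex.Pattern;

public class AmpersandEscape {
    // Matches "&" unless it already starts a named (&name;), decimal (&#169;),
    // or hexadecimal (&#xA9;) reference. This is an approximation, not Tidy's rule.
    private static final Pattern BARE_AMP =
        Pattern.compile("&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)");

    // Hypothetical helper: rewrite every bare "&" as "&amp;".
    public static String escapeBareAmpersands(String text) {
        return BARE_AMP.matcher(text).replaceAll("&amp;");
    }

    public static void main(String[] args) {
        // Query string taken from the Yahoo! page quoted in this message.
        System.out.println(escapeBareAmpersands("/shop?d=hv&cf=info&id=1800352372"));
        // Text that is already escaped is left unchanged.
        System.out.println(escapeBareAmpersands("Arts &amp; Humanities"));
    }
}
```

These messages are warnings only; Tidy reports them and carries on, so they are unlikely by themselves to explain output that stops early.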
HTML source of www.yahoo.com
===========================
<html><head><title>Yahoo!</title><base href=http://www.yahoo.com/><meta http-equiv="PICS-Label" content='(PICS-1.1 "http://www.rsac.org/ratingsv01.html" l gen true for "http://www.yahoo.com" r (n 0 s 0 v 0 l 0))'></head><body><center><form action=http://search.yahoo.com/bin/search><map name=m><area coords="0,0,52,52" href=r/a1><area coords="53,0,121,52" href=r/p1><area coords="122,0,191,52" href=r/m1><area coords="441,0,510,52" href=r/wn><area coords="511,0,579,52" href=r/i1><area coords="580,0,637,52" href=r/hw></map><img width=638 height=53 border=0 usemap="#m" src=http://us.a1.yimg.com/us.yimg.com/i/ww/m5v2.gif alt=Yahoo><br><table border=0 cellspacing=0 cellpadding=3 width=640><tr><td align=center width=205> <!-- SpaceID=0 robot --> <img src="http://us.a1.yimg.com/us.yimg.com/i/new2.gif" height=11 width=28><br><a href="/homet/?http://in.yahoo.com"><b>Yahoo! India</b></a></td><td align=center><a href="http://messenger.yahoo.com"><img width=230 height=33 src="http://us.a1.yimg.com/us.yimg.com/a/ya/yahoopager/messenger/messengerpromo.gif" alt="Yahoo! Messenger" border=0></a></td><td align=center width=205><a href="/homet/?http://mail.yahoo.com"><b>Yahoo! 
Mail</b></a><br>free email for life</td></tr><tr><td colspan=3 align=center><input size=30 name=p> <input type=submit value=Search> <a href=r/so>advanced search</a></td></tr></table><table border=0 cellspacing=0 cellpadding=3 width=640><tr><td nowrap align=center><small><b>Shop</b> · <a href=r/a2>Auctions</a> · <a href=r/cf>Classifieds</a> · <a href=r/sh>Shopping</a> · <a href=r/ta>Travel</a> · <a href=r/yp>Yellow Pgs</a> · <a href=r/mp>Maps</a> <b>Media</b> · <a href=r/dn>News</a> · <a href=r/ys>Sports</a> · <a href=r/sq><b>Stock Quotes</b></a> · <a href=r/tg>TV</a> · <a href=r/wt>Weather</a><br><b>Connect</b> · <a href=r/yc>Chat</a> · <a href=r/ub>Clubs</a> · <a href=r/pl>Games</a> · <a href=r/gc>GeoCities</a> · <a href=r/gr>Greetings</a> · <a href=r/v1>Invites</a> · <a href=r/m2><b>Mail</b></a> · <a href=r/p2>Messenger</a> · <a href=r/pr>Personals</a> · <a href=r/ps>People Search</a> · <a href=r/yg>For Kids</a><br><b>Personal</b> · <a href=r/i2>My Yahoo!</a> · <a href=r/ab>Addr Book</a> · <a href=r/ca>Calendar</a> · <a href=r/bc>Briefcase</a> · <a href=r/fo>Photos</a> · <a href=r/al>Alerts</a> · <a href=r/bm>Bookmarks</a> · <a href=r/cm>Companion</a> · <a href=r/bp>Bill Pay</a> <a href=r/xy><b>more...</b></a></small></td></tr></table><table border=0 cellspacing=7 cellpadding=2><tr><td valign=top align=center><table cellspacing=0 border=0 width="100%"><tr><td align=center bgcolor=3366cc><table cellspacing=0 cellpadding=2 border=0 width="100%"><tr><td align=center bgcolor=ffffff><table cellspacing=0 border=0 width="100%"><tr><td align=center colspan=4><font face=arial><a href=/r/s/1><b>Yahoo! Shopping</b></a></font><small> - Thousands of stores. 
Millions of products.</small></td></tr><tr><td colspan=2><font face=arial size=2><b>Departments</b></font></td><td><font face=arial size=2><b>Stores</b></font></td><td><font face=arial size=2><b>Features</b></font></td></tr><tr><td valign=top width="22%"><small>· <a href=r/s/2>Apparel</a><br>· <a href=r/s/lux>Luxury</a><br>· <a href=r/s/4>Computers</a><br>· <a href=r/s/5>Electronics</a></small></td><td valign=top width="22%"><small>· <a href=r/s/12>Beauty</a><br>· <a href=r/s/11>Sports</a><br>· <a href=r/s/7>Music</a><br>· <a href=r/s/9>Video/DVD</a></small></td><td valign=top width="31%"><small> · <a href=r/s/macys>Macy's</a><br> · <a href=r/s/tav>Tavolo</a><br> · <a href=r/s/sep>Sephora</a><br> · <a href=r/s/br>Banana Republic</a><br> </small></td><td valign=top width="25%"><small> · <a href=r/s/sumfun>Summer Fun</a><br> · <a href=r/s/specoff>Special Offers</a><br> · <a href=r/s/topcd>Top-selling CDs</a><br> · <a href=r/s/g25>Gift Ideas</a><br> </small></td></tr></table></td></tr></table></td></tr></table> <p><table border=0 cellspacing=0 cellpadding=4><tr><td valign=top nowrap><small><font size=3 face=arial><a href=r/ar><b>Arts & Humanities</b></a></font><br><a href=r/li>Literature</a>, <a href=r/ph>Photography</a>...<br><br><font size=3 face=arial><a href=r/bu><b>Business & Economy</b></a></font><br><a href=r/bb>B2B</a>, <a href=r/fi>Finance</a>, <a href=r/bs>Shopping</a>, <a href=r/jo>Jobs</a>...<br><br><font size=3 face=arial><a href=r/ci><b>Computers & Internet</b></a></font><br><a href=r/in>Internet</a>, <a href=r/ww>WWW</a>, <a href=r/sf>Software</a>, <a href=r/ga>Games</a>...<br><br><font size=3 face=arial><a href=r/ed><b>Education</b></a></font><br><a href=r/un>College and University</a>, <a href=r/k2>K-12</a>...<br><br><font size=3 face=arial><a href=r/en><b>Entertainment</b></a></font><br><a href=r/cl>Cool Links</a>, <a href=r/mo>Movies</a>, <a href=r/hu>Humor</a>, <a href=r/mu>Music</a>...<br><br><font size=3 face=arial><a 
href=r/go><b>Government</b></a></font><br><a href=r/el>Elections</a>, <a href=r/mi>Military</a>, <a href=r/la>Law</a>, <a href=r/tx>Taxes</a>...<br><br><font size=3 face=arial><a href=r/he><b>Health</b></a></font><br><a href=r/md>Medicine</a>, <a href=r/ds>Diseases</a>, <a href=r/dg>Drugs</a>, <a href=r/ft>Fitness</a>...</small></td><td valign=top nowrap><small><font size=3 face=arial><a href=r/nm><b>News & Media</b></a></font><br><a href=r/fc>Full Coverage</a>, <a href=r/nw>Newspapers</a>, <a href=r/tv>TV</a>...<br><br><font size=3 face=arial><a href=r/rs><b>Recreation & Sports</b></a></font><br><a href=r/sp>Sports</a>, <a href=r/tr>Travel</a>, <a href=r/au>Autos</a>, <a href=r/od>Outdoors</a>...<br><br><font size=3 face=arial><a href=r/rf><b>Reference</b></a></font><br><a href=r/lb>Libraries</a>, <a href=r/dc>Dictionaries</a>, <a href=r/qt>Quotations</a>...<br><br><font size=3 face=arial><a href=r/re><b>Regional</b></a></font><br><a href=r/ct>Countries</a>, <a href=r/rg>Regions</a>, <a href=r/us>US States</a>...<br><br><font size=3 face=arial><a href=r/sc><b>Science</b></a></font><br><a href=r/am>Animals</a>, <a href=r/as>Astronomy</a>, <a href=r/eg>Engineering</a>...<br><br><font size=3 face=arial><a href=r/ss><b>Social Science</b></a></font><br><a href=r/ac>Archaeology</a>, <a href=r/ec>Economics</a>, <a href=r/lg>Languages</a>...<br><br><font size=3 face=arial><a href=r/cu><b>Society & Culture</b></a></font><br><a href=r/pe>People</a>, <a href=r/ev>Environment</a>, <a href=r/rl>Religion</a>...</small></td></tr></table></td> <td align=right valign=top width=200><table bgcolor=3366cc border=0 cellspacing=0><tr><td><table bgcolor=ffffff cellpadding=2 cellspacing=0 border=0><tr><td align=center bgcolor=e0d0b0 nowrap colspan=2><table border=0 cellspacing=0 cellpadding=0 width=180><tr><td align=center><font face=arial size=2><b>In the News</b></font></td></tr></table></td></tr><tr><td colspan=2><table cellpadding=0 cellspacing=0 border=0><tr><td valign=top><b> · 
</b></td><td width=100%><small><a href="/homer/?http://fullcoverage.yahoo.com/fc/Breaking/Northern_Ireland_Conflict/">British troops rushed into Belfast as violence persists</a></small></td></tr><tr><td valign=top><b> · </b></td><td><small><a href="/homer/?http://fullcoverage.yahoo.com/fc/Breaking/Jordan/">15 wounded in foiled Jordan plane hijack</a></small></td></tr><tr><td valign=top><b> · </b></td><td><small><a href="/homer/?http://fullcoverage.yahoo.com/fc/Breaking/Middle_East_Peace_Process/">Clinton to hold Mideast summit in July</a></small></td></tr><tr><td valign=top><b> · </b></td><td><small><a href="/homer/?http://sports.yahoo.com/ten/">Wimbledon</a> - <a href="/homer/?http://uk.sports.yahoo.com/tdf2000/">Tour de France</a></small></td></tr><tr><td align=right colspan=2><a href=r/xn><small>more...</small></a></td></tr></table></td></tr><tr><td align=center bgcolor=e0d0b0 colspan=2><font face=arial size=2><b>Marketplace</b></font></td></tr><tr><td colspan=2><table cellpadding=0 cellspacing=0 border=0><tr><td valign=top><b> · </b></td><td width=100%><small>Free <a href=/homer/?http://www.bluelight.com/freeinternet/>56K Internet Access</a></small></td></tr><tr><td valign=top><b> · </b></td><td><small>Looking for a <a href="/homer/?http://autos.yahoo.com">car</a>? <a href="/homer/?http://careers.yahoo.com">job</a>? <a href="/homer/?http://realestate.yahoo.com">house</a>? <a href=http://realestate.yahoo.com/realestate/renting/>apartment?</a></small></td></tr><tr><td valign=top><b> · </b></td><td><small><a href="/homer/?http://auctions.yahoo.com">Y! 
Auctions</a> - <a href="/homer/?http://list.auctions.yahoo.com/44465-category-leaf.html">Pokemon</a>, <a href="/homer/?http://list.auctions.yahoo.com/40602-category-leaf.html">Longaberger</a>, <a href="/homer/?http://list.auctions.yahoo.com/26360-category-leaf.html">autos</a>, <a href="/homer/?http://list.auctions.yahoo.com/40291-category-leaf.html?alocale=1us&acc=us">'N Sync</a></small></td></tr></table></td></tr><tr><td align=center bgcolor=e0d0b0 colspan=2><font face=arial size=2><b>Broadcast Events</b></font></td></tr><tr><td colspan=2><table cellpadding=0 cellspacing=0 border=0><tr><td valign=top><b> · </b></td><td width=100%><small>9am ET : <a href="/homer/?http://uk.sports.yahoo.com/tdf2000/bdcst/">Tour de France</a> - stage 6</small></td></tr><tr><td valign=top><b> · </b></td><td><small>12pm : <a href="/homer/?http://www.broadcast.com/sports/golf/pgatour/tournaments/western/">Western Open</a> - first round</small></td></tr><tr><td valign=top><b> · </b></td><td><small>3pm : <a href="/homer/?http://chat.yahoo.com/">Chat with Destiny's Child</a></small></td></tr><tr><td align=right colspan=2><a href=r/xb><small>more...</small></a></td></tr></table></td></tr><tr><td align=center bgcolor=e0d0b0 colspan=2><font face=arial size=2><b>Inside Yahoo!</b></font></td></tr><tr><td colspan=2><table cellpadding=0 cellspacing=0 border=0><tr><td valign=top><b> · </b></td><td width=100%><small><a href="/homer/?http://movies.yahoo.com">Y! Movies</a> - <a href="/homer/?http://movies.yahoo.com/shop?d=hv&cf=info&id=1800352372">Perfect Storm</a>, <a href="/homer/?http://movies.yahoo.com/shop?d=hv&cf=info&id=1800353825">The Patriot</a>, <a href="/homer/?http://movies.yahoo.com/shop?d=hv&cf=info&id=1800359669">Me, Myself & Irene</a></small></td></tr><tr><td valign=top><b> · </b></td><td><small><a href=/homer/?http://radio.yahoo.com>Yahoo! 
Radio</a> - tune in to your favorite station</small></td></tr><tr><td valign=top><b> · </b></td><td><small><a href="/homer/?http://baseball.fantasysports.yahoo.com/baseball">Fantasy Baseball</a> - midseason sign-ups, <font color=ff0020>5 days left!</font></small></td></tr></table></td></tr></table></td></tr></table></td></tr></table><hr noshade size=1 width=640><table border=0 cellspacing=0><tr><td><small><b>Local Yahoo!s</b></small></td></tr><tr><td nowrap><small><i>Europe</i> : <a href=r/dk>Denmark</a> - <a href=r/fr>France</a> - <a href=r/de>Germany</a> - <a href=r/it>Italy</a> - <a href=r/no>Norway</a> - <a href=r/es>Spain</a> - <a href=r/se>Sweden</a> - <a href=r/uk>UK & Ireland</a></small></td></tr><tr><td nowrap><small><i>Asia Pacific</i> : <a href=r/ai>Asia</a> - <a href=r/an>Australia & NZ</a> - <a href=r/cc>China</a> - <a href=r/cn>Chinese</a> - <a href=r/hk>HK</a> - <a href=r/id><b>India</b></a> - <a href=r/jp>Japan</a> - <a href=r/kr>Korea</a> - <a href=r/sg>Singapore</a> - <a href=r/tw>Taiwan</a></small></td></tr><tr><td nowrap><small><i>Americas</i> : <a href=r/ag>Argentina</a> - <a href=r/br>Brazil</a> - <a href=r/cd>Canada</a> - <a href=r/mx>Mexico</a> - <a href=r/ep>Spanish</a></small></td></tr><tr><td nowrap><small><i>U.S. Cities</i> : <a href=r/at>Atlanta</a> - <a href=r/bo>Boston</a> - <a href=r/ch>Chicago</a> - <a href=r/df>Dallas/FW</a> - <a href=r/lo>LA</a> - <a href=r/ny>NYC</a> - <a href=r/ba>SF Bay</a> - <a href=r/wd>Wash. 
DC</a> - <a href=r/mm><b>more...</b></a></small></td></tr><tr><td><small> </small></td></tr><tr><td><small><b>More Yahoo!s</b></small></td></tr><tr><td valign=top><small><i>Guides</i> : <a href=r/ya>Autos</a> - <a href=r/em>Careers</a> - <a href=r/yh>Health</a> - <a href=r/ou>Outdoors</a> - <a href=r/pt>Pets</a> - <a href=r/yr>Real Estate</a> - <a href=r/k1>Yahooligans!</a></small></td></tr><tr><td nowrap><small><i>Entertainment</i> : <a href=r/ye>Top</a> - <a href=r/ho>Astrology</a> - <a href=r/yb>Broadcast</a> - <a href=r/g2>Games</a> - <a href=r/mv>Movies</a> - <a href=r/rk>Music</a> - <a href=r/ne>Net Events</a> - <a href=r/t2>Television</a></small></td></tr><tr><td nowrap><small><i>Finance</i> : <a href=r/f2>Top</a> - <a href=r/bk>Banking</a> - <a href=r/b2>Bill Pay</a> - <a href=r/is>Insurance</a> - <a href=r/ln>Loans</a> - <a href=r/tc>Taxes</a> - <a href=r/fv>FinanceVision</a></small></td></tr><tr><td nowrap><small><i>Local</i> : <a href=r/l2>Top</a> - <a href=r/c2>Classifieds</a> - <a href=r/le>Events</a> - <a href=r/ld>Lodging</a> - <a href=r/m3>Maps</a> - <a href=r/rt>Restaurants</a> - <a href=r/y2>Yellow Pages</a></small></td></tr><tr><td nowrap><small><i>News</i> : <a href=r/d2>Top Stories</a> - <a href=r/bn>Business</a> - <a href=r/gn>Entertainment</a> - <a href=r/lt>Lottery</a> - <a href=r/sw>Politics</a> - <a href=r/s2>Sports</a> - <a href=r/tn>Technology</a> - <a href=r/w2>Weather</a></small></td></tr><tr><td nowrap><small><i>Publishing</i> : <a href=r/b3>Briefcase</a> - <a href=r/u2>Clubs</a> - <a href=r/iv>Invites</a> - <a href=r/f3>Photos</a> - <a href=r/g3>Home Pages</a> - <a href=r/m4>Message Boards</a> - <a href=r/st>Store</a></small></td></tr><tr><td nowrap><small><i>Small Business</i> : <a href=r/sb>Top</a> - <a href=r/fe>Business Marketplace</a> - <a href=r/ws>Website Hosting</a> - <a href=r/s3>Store Building</a></small></td></tr><tr><td nowrap><small><i>Yahoo! 
Everywhere</i> : <a href=r/o2>Get Yahoo!</a> on your PDA, Cell Phone, Web Phone, or Pager</small></td></tr></table><hr noshade size=1 width=640><table border=0 cellspacing=6 cellpadding=0><tr><td align=right><a href=r/vs><small>Yahoo! prefers</small></a></td><td><a href=r/vs><img width=37 height=23 border=0 src=http://us.a1.yimg.com/us.yimg.com/a/vi/visa/sm.gif></a></td></tr></table><small><a href=r/ad>How to Suggest a Site</a> - <a href=r/cp>Company Info</a> - <a href=r/cy>Copyright Policy</a> - <a href=r/ts>Terms of Service</a> - <a href=r/cb>Contributors</a> - <a href=r/hr>Openings at Yahoo!</a><p>Copyright © 2000 Yahoo! Inc. All rights reserved.</small><br><a href=r/pv>Privacy Policy</a></form></center></body></html>

XML Converted (it stops unpredictably)
================================
<?xml version="1.0" encoding="UTF-8"?> <html> <head> <meta content="HTML Tidy, see www.w3.org" name="generator"> </meta> <title> Yahoo!</title> <base href="http://www.yahoo.com/"> </base> <meta content="(PICS-1.1 "http://www.rsac.org/ratingsv01.html" l gen true for "http://www.yahoo.com" r (n 0 s 0 v 0 l 0))" http-equiv="PICS-Label"> </meta> </head> <body> <center> <form action="http://search.yahoo.com/bin/search"> <map name="m" id="m"> <area href="r/a1" coords="0,0,52,52"> </area> <area href="r/p1" coords="53,0,121,52"> </area> <area href="r/m1" coords="122,0,191,52"> </area> <area href="r/wn" coords="441,0,510,52"> </area> <area href="r/i1" coords="511,0,579,52"> </area> <area href="r/hw" coords="580,0,637,52"> </area> </map> <img alt="Yahoo" src="http://us.a1.yimg.com/us.yimg.com/i/ww/m5v2.gif" usemap="#m" border="0" height="53" width="638"> </img> <br> </br> <table width="640" cellpadding="3" cellspacing="0" border="0"> <tr> <td width="205" align="center"> <img width="28" height="11" src="http://us.a1.yimg.com/us.yimg.com/i/new2.gif"> </img> <br> </br> <a href="/homet/?http://in.yahoo.com"> <b> Yahoo! 
India</b> </a> </td> <td align="center"> <a href="http://messenger.yahoo.com"> <img border="0" alt="Yahoo! Messenger" src="http://us.a1.yimg.com/us.yimg.com/a/ya/yahoopager/messenger/messengerpromo.gif" height="33" width="230"> </img> </a> </td> <td width="205" align="center"> <a href="/homet/?http://mail.yahoo.com"> <b> Yahoo! Mail</b> </a> <br> </br> free email for life</td> </tr> <tr> <td align="center" colspan="3"> <input name="p" size="30"> </input> <input value="Search" type="submit"> </input> <a href="r/so"> advanced search</a> </td> </tr> </table>
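One possible cause of output that ends partway through a document, offered here only as an assumption since nothing in the log confirms it, is a buffered output stream that is never flushed or closed before the program exits. The sketch below uses only the standard Java library and does not call JTidy; `FlushDemo` and `writeBuffered` are hypothetical names. It shows that bytes still held in a `BufferedOutputStream` do not reach the underlying stream until `flush()` (or `close()`) is called.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class FlushDemo {
    // Writes `total` bytes one at a time through a buffer and returns
    // {bytes visible in the sink before flush, bytes visible after flush}.
    public static int[] writeBuffered(int total) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        OutputStream out = new BufferedOutputStream(sink);
        try {
            for (int i = 0; i < total; i++) {
                out.write('x');
            }
            int before = sink.size(); // the tail is still held in the buffer
            out.flush();              // push the remaining bytes to the sink
            int after = sink.size();
            return new int[] { before, after };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = writeBuffered(10000);
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

If the stream passed to the parser is wrapped in a buffer like this, closing it in a `finally` block (or with try-with-resources on newer Java versions) is a cheap thing to check before looking at the Tidy options.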
Received on Thursday, 6 July 2000 11:58:57 UTC