- From: Andrea Sparling <asparling@adrelevance.com>
- Date: Tue, 06 Jun 2000 09:13:00 -0700
- To: html-tidy@w3.org
The stack pointer was decremented to 0 before the insertedToken call.
This does not explain why. But if you change

    // this will only be null if inode != null
    if (this.insert == -1) {
        node = this.inode;
        this.inode = null;
        return node;

to

    // this will only be null if inode != null
    if ((this.insert == -1) || (this.istack.size() < 1)) {
        node = this.inode;
        this.inode = null;
        return node;

this may help. Further inspection of why this code decrements the stack
pointer is in order.

http://www.dlib.org/dlib/september98/millman/09millman.htm

> From: Donna Bergmark (bergmark@CS.Cornell.EDU)
> Date: Mon, Jun 05 2000
>
> ------------------------------------------------------------------------
>
> Message-Id: <200006051808.OAA07367@elgin.cs.cornell.edu>
> To: html-tidy@w3.org
> Date: Mon, 05 Jun 2000 14:08:33 -0400
> From: Donna Bergmark <bergmark@CS.Cornell.EDU>
> Subject: [Tidy/Jtidy bug report]ArrayIndexOutOfBoundsException
>
> There still seems to be a problem with ArrayIndexOutOfBounds.
> I am running Tidy Version 30 April 2000 and JTidy Version 3 June.
> A variant of the Tidy Java Bean example was used. Here is the
> URL that won't parse:
> http://www.dlib.org/dlib/september98/millman/09millman.html
>
> Here is the typescript of the run:
> Script started on Mon Jun 5 13:44:12 2000
> ------------------------------------------------------------------
> Latest version of Tidy, JTidy still bombs on ArrayIndexOutOfBounds
> ------------------------------------------------------------------
> (1) Put 3 June 2000 JTidy into my path
>
> DHCP211-162.CS.CORNELL.EDU% setenv CLASSPATH \
> ? /home/bergmark/public/src/tools/JTidy/src/30apr2000:/usr/local/jdk1.2.2/lib:.
>
> DHCP211-162.CS.CORNELL.EDU% echo $CLASSPATH
> /home/bergmark/public/src/tools/JTidy/src/30apr2000:/usr/local/jdk1.2.2/lib:.
>
> (2) Get original version of the JTidy bean (copied from the Java HTML Tidy
> document). Compile it.
>
> DHCP211-162.CS.CORNELL.EDU% co -r1.1 Test16.java
> RCS/Test16.java,v --> Test16.java
> revision 1.1
> writable Test16.java exists; remove it? [ny](n): y
> done
>
> DHCP211-162.CS.CORNELL.EDU% javac Test16.java
>
> (3) Run it on the URL that causes Tidy/JTidy to crash
>
> DHCP211-162.CS.CORNELL.EDU% cat millman.notidy
> http://www.dlib.org/dlib/september98/millman/09millman.html
>
> DHCP211-162.CS.CORNELL.EDU% java Test16 \
> ? http://www.dlib.org/dlib/september98/millman/09millman.html \
> ? out error
> java.lang.ArrayIndexOutOfBoundsException: 0 >= 0
>     at java.util.Vector.elementAt(Vector.java:405)
>     at org.w3c.tidy.Lexer.insertedToken(Lexer.java:2738)
>     at org.w3c.tidy.Lexer.getToken(Lexer.java:1185)
>     at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:1672)
>     at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:49)
>     at org.w3c.tidy.ParserImpl.access$0(ParserImpl.java:36)
>     at org.w3c.tidy.ParserImpl$ParseDefList.parse(ParserImpl.java:1438)
>     at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:49)
>     at org.w3c.tidy.ParserImpl.access$0(ParserImpl.java:36)
>     at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2002)
>     at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:49)
>     at org.w3c.tidy.ParserImpl.access$0(ParserImpl.java:36)
>     at org.w3c.tidy.ParserImpl$ParseBlock.parse(ParserImpl.java:2002)
>     at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:49)
>     at org.w3c.tidy.ParserImpl.access$0(ParserImpl.java:36)
>     at org.w3c.tidy.ParserImpl$ParseBody.parse(ParserImpl.java:652)
>     at org.w3c.tidy.ParserImpl.parseTag(ParserImpl.java:49)
>     at org.w3c.tidy.ParserImpl.access$0(ParserImpl.java:36)
>     at org.w3c.tidy.ParserImpl$ParseHTML.parse(ParserImpl.java:258)
>     at org.w3c.tidy.ParserImpl.parseDocument(ParserImpl.java:2917)
>     at org.w3c.tidy.Tidy.parse(Tidy.java:1055)
>     at Test16.run(Test16.java:50)
>     at java.lang.Thread.run(Thread.java:475)
>
> (4) Here is what was written into the output files
>
> DHCP211-162.CS.CORNELL.EDU% cat error
>
> Tidy (vers 30th April 2000) Parsing "InputStream"
> line 10 column 61 - Warning: discarding unexpected </a>
> line 16 column 23 - Warning: missing </font> before <h3>
> line 16 column 27 - Warning: inserting implicit <font>
> line 20 column 1 - Warning: inserting implicit <font>
> line 23 column 2 - Warning: missing </font> before <h6>
> line 23 column 5 - Warning: inserting implicit <font>
> line 26 column 1 - Warning: inserting implicit <font>
> line 29 column 23 - Warning: missing </font> before <h3>
> line 29 column 23 - Warning: missing </font> before <h3>
> line 29 column 27 - Warning: inserting implicit <font>
> line 29 column 27 - Warning: inserting implicit <font>
> line 30 column 1 - Warning: discarding unexpected </font>
> line 32 column 1 - Warning: inserting implicit <font>
> line 34 column 1 - Warning: missing </font> before <p>
> line 35 column 1 - Warning: inserting implicit <font>
> line 41 column 53 - Warning: discarding unexpected </i>
> line 43 column 2 - Warning: missing </i> before <p>
> line 43 column 2 - Warning: missing </font> before <p>
> line 43 column 2 - Warning: missing </font> before <p>
> line 47 column 3 - Warning: <img> lacks "alt" attribute
> line 57 column 2 - Warning: missing </font> before <p>
> line 57 column 2 - Warning: missing </h3> before <p>
> line 58 column 1 - Warning: inserting implicit <font>
> line 60 column 1 - Warning: discarding unexpected </h3>
> line 60 column 6 - Warning: discarding unexpected </font>
> line 60 column 13 - Warning: discarding unexpected </font>
> line 60 column 20 - Warning: discarding unexpected </font>
> line 72 column 1 - Warning: trimming empty <p>
> line 72 column 55 - Warning: missing </font> before </h3>
> line 72 column 60 - Warning: discarding unexpected </font>
> line 72 column 67 - Warning: replacing element</p> by <br>
> line 72 column 67 - Warning: inserting implicit <br>
> line 99 column 1 - Warning: trimming empty <p>
> line 99 column 45 - Warning: missing </font> before </h3>
> line 99 column 50 - Warning: discarding unexpected </font>
> line 99 column 57 - Warning: replacing element</p> by <br>
> line 99 column 57 - Warning: inserting implicit <br>
> line 105 column 281 - Warning: trimming empty <p>
> line 106 column 646 - Warning: trimming empty <p>
> line 107 column 129 - Warning: trimming empty <p>
> line 108 column 356 - Warning: trimming empty <p>
> line 109 column 115 - Warning: trimming empty <p>
> line 115 column 5 - Warning: missing </em> before <dl>
> line 115 column 5 - Warning: trimming empty <em>
> line 116 column 5 - Warning: inserting implicit <em>
> line 117 column 1 - Warning: missing <dd>
> line 117 column 1 - Warning: discarding unexpected </em>
>
> DHCP211-162.CS.CORNELL.EDU% cat out
> DHCP211-162.CS.CORNELL.EDU%
> DHCP211-162.CS.CORNELL.EDU% exit
> exit
>
> Script done on Mon Jun 5 13:46:39 2000
> ..............................................................
> Here is the java code that invoked the parse:
> // bergmark - may 2000 - Code example of how to use the Tidy Java Bean
>
> // Code copied from Java HTML Tidy (13 May 2000) document
>
> // CLASSPATH: must include path to JTidy:
> // /home/bergmark/public/src/tools/JTidy/src/30apr2000
>
> import java.io.IOException;
> import java.net.URL;
> import java.io.BufferedInputStream;
> import java.io.FileOutputStream;
> import java.io.PrintWriter;
> import java.io.FileWriter;
> import org.w3c.tidy.Tidy;
>
> /**
>  * This program shows how HTML could be tidied directly from
>  * a URL stream, and running on separate threads. Note the use
>  * of the 'parse' method to parse from an InputStream, and send
>  * the pretty-printed result to an OutputStream.
>  * In this example thread th1 outputs XML, and thread th2 outputs
>  * HTML. This shows that properties are per instance of Tidy.
>  */
>
> public class Test16 implements Runnable {
>
>     private String url;
>     private String outFileName;
>     private String errOutFileName;
>
>     public Test16(String url, String outFileName,
>                   String errOutFileName) {
>         this.url = url;
>         this.outFileName = outFileName;
>         this.errOutFileName = errOutFileName;
>     }
>
>     public void run() {
>         URL u;
>         BufferedInputStream in;
>         FileOutputStream out;
>         Tidy tidy = new Tidy();
>
>         tidy.setXHTML(true);
>         try {
>             tidy.setErrout(new PrintWriter(new FileWriter(errOutFileName), true));
>             u = new URL(url);
>             in = new BufferedInputStream(u.openStream());
>             out = new FileOutputStream(outFileName);
>             tidy.parse(in, out);
>         } catch ( IOException e ) {
>             System.out.println ( this.toString() + e.toString() );
>         }
>     }
>
>     public static void main( String[] args ) {
>         Test16 t = new Test16(args[0], args[1], args[2] );
>         Thread th1 = new Thread(t);
>         th1.start();
>     }
> }

--
Andrea Sparling
206.576.3557
AdRelevance, a division of Media Metrix, Inc.
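
For anyone who wants to try the suggested guard in isolation, here is a minimal sketch. It is not the JTidy source: the class name InsertGuardSketch is hypothetical, the field names insert, inode and istack are borrowed from the snippet at the top of this message, and their types (an int, a String and a Vector of Strings) are assumptions made only so the example compiles on its own. It shows that reading slot 0 of an empty Vector is what produces the "0 >= 0" exception in the trace, and that checking istack.size() first avoids it.

```java
import java.util.Vector;

// Hypothetical stand-in for the lexer state discussed in this thread.
// This is NOT org.w3c.tidy.Lexer; the field types are assumptions.
final class InsertGuardSketch {
    int insert = -1;                               // index into istack, -1 when unused
    String inode = null;                           // pending node, if any
    final Vector<String> istack = new Vector<>();  // inline element stack

    // Returns the pending node when insert is unset or the stack is empty;
    // otherwise returns the stack entry at the insert position.
    String insertedToken() {
        if ((this.insert == -1) || (this.istack.size() < 1)) {
            String node = this.inode;
            this.inode = null;
            return node;
        }
        // Without the size check above, this call throws
        // ArrayIndexOutOfBoundsException when the stack is empty.
        return this.istack.elementAt(this.insert);
    }

    public static void main(String[] args) {
        InsertGuardSketch lexer = new InsertGuardSketch();
        lexer.insert = 0;         // insert points at slot 0 ...
        lexer.inode = "pending";  // ... but nothing has been pushed on istack
        System.out.println(lexer.insertedToken());
    }
}
```

The guard only covers the empty-stack case reported here. It does not explain why the stack is empty at that point, so the underlying cause still needs to be tracked down.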
Received on Tuesday, 6 June 2000 12:13:37 UTC