- From: Kim <kime@atlantic.net>
- Date: Tue, 6 Aug 2002 05:38:34 -0400
- To: html-tidy@w3.org
- Message-ID: <1028626714.3d4f991a83f40@webmail.atlantic.net>
Somehow this ended up in my mailbox. It is from jamesgc21@attbi.com.

----- Forwarded message from Greg James <jamesgc21@attbi.com> -----
Date: Mon, 5 Aug 2002 21:32:04 -0600
From: Greg James <jamesgc21@attbi.com>
Reply-To: Greg James <jamesgc21@attbi.com>
Subject: JTidy & Un-Tagged Text in HTML Doc
To: html-tidy@w3.org

I'm trying to use JTidy to convert HTML pages to XML. The HTML has several 'un-tagged' entries. For example:

    <P><A name=Hit3><B>3.</B></A>
    <A href="http://www.matrixscience.com/cgi/protein_view.pl?file=../data/20020130/FaioSfs.dat&hit=4">gi|11528046</A>
    <B>Mass:</B> 74711 <B>Score:</B> 43
    (AF197556) coat protein [Beet necrotic yellow vein virus]
    <B> Observed  Mr(expt)  Mr(calc)  Delta  Start   End  Miss  Peptide</B>
         564.70    564.70    565.25   -0.55    168 - 171     0  FEDR
         828.00    828.00    828.51   -0.51     44 -  51     0  AANLSIIK
        1032.30   1032.30   1032.56   -0.26    509 - 519     0  AAVAMTALASK
        2271.60   2271.60   2271.16    0.44    556 - 578     0  YVHTGIQGGAQLAGAMAVGAMLR
    <B>No match to:</B> 1021.10, 3511.70

Is there an easy way to get JTidy to 'tag' the un-tagged text, for example the text between the <B> elements? I'd rather not write a Java program to tag these lines before sending them to JTidy.

I'm setting the following params on JTidy:

    tidy.setMakeClean(true);
    tidy.setBreakBeforeBR(true);
    tidy.setShowWarnings(false);
    tidy.setOnlyErrors(true);

Thanks.

----- End forwarded message -----
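For illustration only, the pre-processing step the forwarded message hopes to avoid might look roughly like the sketch below. It uses only the standard Java regex classes and is not a JTidy feature. The class name `UntaggedTextWrapper`, the `untagged` span class, and the choice to treat only `<B>` and `<A>` content as already tagged are all assumptions made for this example; a regex tokenizer like this will not cope with `>` inside attribute values or with comments and scripts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps text that sits outside <B>/<A> elements in a <span>.
public class UntaggedTextWrapper {
    // A token is either a tag (<...>) or a run of text up to the next '<'.
    private static final Pattern TOKEN = Pattern.compile("<[^>]*>|[^<]+");
    // Assumption: only B and A elements count as "already tagged" containers.
    private static final Pattern INLINE_OPEN = Pattern.compile("(?i)<(b|a)(\\s[^>]*)?>");
    private static final Pattern INLINE_CLOSE = Pattern.compile("(?i)</(b|a)\\s*>");

    public static String wrap(String html) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // nesting depth inside B/A elements
        Matcher m = TOKEN.matcher(html);
        while (m.find()) {
            String tok = m.group();
            if (tok.startsWith("<")) {
                if (INLINE_OPEN.matcher(tok).matches()) {
                    depth++;
                } else if (INLINE_CLOSE.matcher(tok).matches() && depth > 0) {
                    depth--;
                }
                out.append(tok);
            } else if (depth == 0 && !tok.trim().isEmpty()) {
                // Non-blank text outside B/A: wrap it (surrounding whitespace is trimmed).
                out.append("<span class=\"untagged\">")
                   .append(tok.trim())
                   .append("</span>");
            } else {
                // Text inside B/A, or whitespace only: pass through unchanged.
                out.append(tok);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String in = "<P><B>Mass:</B> 74711 <B>Score:</B> 43</P>";
        System.out.println(wrap(in));
    }
}
```

The wrapped output could then be passed to JTidy as before, so that each formerly bare value arrives inside its own element.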
Attachments
- text/html attachment: unnamed
Received on Tuesday, 6 August 2002 05:38:36 UTC