RE: Html-Tidy BUG ???

Hello

The answer is fairly simple: tidy's DOM implementation is not compatible
with the W3C DOM recommendation, which is why Xalan's liaison classes
refuse its nodes.

BR
VA

PS: I think you should write a translator from tidy's implementation into
a standard one (just copy the tree).
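
A minimal sketch of such a tree copy, in case it helps. The names
DomCopier, copy and copyNode are made up for illustration, and it assumes
the tidy nodes answer the basic org.w3c.dom accessors correctly
(getDocumentElement, getNodeType, getNodeName, getNodeValue, getAttributes,
getFirstChild, getNextSibling). It copies only elements, attributes, text
and comments; the doctype and anything else is skipped.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

/** Hypothetical helper: rebuilds a DOM tree inside a Document created by the JAXP default parser. */
final class DomCopier {

    /** Copies the document element of {@code source} (and everything below it) into a new Document. */
    static Document copy(Document source) {
        Document target;
        try {
            target = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable JAXP parser on the classpath", e);
        }
        Node root = source.getDocumentElement();
        if (root != null) {
            target.appendChild(copyNode(root, target));
        }
        return target;
    }

    /** Returns a copy of {@code node} owned by {@code target}, or null for node types that are skipped. */
    private static Node copyNode(Node node, Document target) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE: {
                Element element = target.createElement(node.getNodeName());
                NamedNodeMap attributes = node.getAttributes();
                if (attributes != null) {
                    for (int i = 0; i < attributes.getLength(); i++) {
                        Node attribute = attributes.item(i);
                        String value = attribute.getNodeValue();
                        // HTML allows attributes without a value; store them as empty strings.
                        element.setAttribute(attribute.getNodeName(), value == null ? "" : value);
                    }
                }
                // Walk the children with the sibling links only, so no NodeList support is needed.
                for (Node child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
                    Node copied = copyNode(child, target);
                    if (copied != null) {
                        element.appendChild(copied);
                    }
                }
                return element;
            }
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE: {
                String text = node.getNodeValue();
                return target.createTextNode(text == null ? "" : text);
            }
            case Node.COMMENT_NODE: {
                String comment = node.getNodeValue();
                return target.createComment(comment == null ? "" : comment);
            }
            default:
                return null; // doctype, processing instructions, etc. are not copied
        }
    }
}
```

You would then hand the copy to the processor instead of the tidy
document, e.g. new XSLTInputSource(DomCopier.copy(doc)). Whether the
resulting Document is one that XercesLiaison accepts depends on which
parser JAXP picks up from your classpath, so treat that part as untested.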

> -----Original Message-----
> From: ext Holger Prause [mailto:holger.prause@detewe.de]
> Sent: 12 June 2001 18:05
> To: html-tidy@w3.org
> Subject: Html-Tidy BUG ??? 
> 
> 
> Hi
> 
> 
> I am using JTidy (HTML Tidy) to get a DOM out of some HTML files, and
> then I collect all links (all elements with node name "a"). Now I want
> to take this DOM and process it with XSLT.
> 
> When I use the following code, I get the following exception:
> 
> <pre>
> XSLTProcessor processor = XSLTProcessorFactory.getProcessor();
> processor.process(
>         new XSLTInputSource(doc),
>         new XSLTInputSource(new FileInputStream(xslPath)),
>         new XSLTResultTarget(new FileOutputStream(outputFile)));
> </pre>
> 
> 
> XSL Error: Cannot use a DTMLiaison for a input DOM node... pass a
> org.apache.xalan.xpath.xdom.XercesLiaison instead!
> 
> XSL Error: SAX Exception
> 
> org.apache.xalan.xslt.XSLProcessorException:
>  at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java:1799)
>  at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java:1691)
>  at org.apache.xalan.xslt.XSLTEngineImpl.getSourceTreeFromInput(XSLTEngineImpl.java:919)
>  at org.apache.xalan.xslt.XSLTEngineImpl.process(XSLTEngineImpl.java:643)
>  at DOMToHtmlSerializer.serialize(DOMToHtmlSerializer.java:39)
>  at HtmlLinkValidator.validate(HtmlLinkValidator.java:56)
>  at Main.<init>(Main.java:44)
>  at Main.main(Main.java:55)
> 
> 
> OK, I thought, if it wants it that way, I will pass a XercesLiaison:
> 
> <pre>
> XercesLiaison xl = new XercesLiaison();
> XSLTProcessor processor = XSLTProcessorFactory.getProcessor(xl);
>
> processor.process(
>         new XSLTInputSource(doc),
>         new XSLTInputSource(new FileInputStream(xslPath)),
>         new XSLTResultTarget(new FileOutputStream(outputFile)));
> </pre>
> 
> Then I get the following exception:
>
> XSL Error: SAX Exception
> 
> org.apache.xalan.xslt.XSLProcessorException: XercesLiaison can not
> handle nodes of type class org.w3c.tidy.DOMDocumentImpl
>  at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java:1753)
>  at org.apache.xalan.xslt.XSLTEngineImpl.error(XSLTEngineImpl.java:1717)
>  at org.apache.xalan.xslt.XSLTEngineImpl.process(XSLTEngineImpl.java:746)
>  at DOMToHtmlSerializer.serialize(DOMToHtmlSerializer.java:39)
>  at HtmlLinkValidator.validate(HtmlLinkValidator.java:56)
>  at Main.<init>(Main.java:44)
>  at Main.main(Main.java:55)
> 
> 
> "org.apache.xalan.xslt.XSLProcessorException: XercesLiaison can not
> handle nodes of type class org.w3c.tidy.DOMDocumentImpl"
> 
> Why is JTidy using its own DOMDocumentImpl
> (org.w3c.tidy.DOMDocumentImpl) and not the DOMDocumentImpl from W3C
> (org.w3c.dom.DOMDocumentImpl)?
>
> That would have saved me a lot of time.
> 
> 
> 
> Now what can I do?
>
> Solution 1:
> Write the Tidy DOM to disk, then reparse it with any XML parser, and
> then process it.
>
> Solution 2:
> Write a wrapper which changes the Tidy DOM into a pure
> org.w3c.dom.Document, and then process it.
>
> Solution 3:
> Search for another tool that does it.
> 
> 
> Hmm, can any of you, especially the developers of this tool / library,
> tell me what to do?
> 
> 

Received on Thursday, 14 June 2001 10:45:34 UTC