Doctype + ProcessingInstruction error

Hi there,

I have to explain a little more exactly what I am trying to do.

I want to convert an HTML file to XHTML and then add some processing
instructions to the XHTML file. To do so, I use JTidy to get the DOM
tree of the HTML document:

org.w3c.dom.Document w3cdoc = tidy2.parseDOM(fis,null);

When I then try to add a processing instruction to this DOM (with
the following code), I get an exception:

 w3cdoc.createProcessingInstruction("foo","foo");

org.w3c.tidy.DOMExceptionImpl: HTML document
 at org.w3c.tidy.DOMDocumentImpl.createProcessingInstruction(DOMDocumentImpl.java:166)
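
For comparison: the W3C DOM specification says createProcessingInstruction
raises NOT_SUPPORTED_ERR when the document is an HTML document, which may be
what JTidy is reporting here. Below is a minimal sketch of the same call on a
generic XML DOM. It assumes the markup is already well-formed XHTML and is
parsed with the JDK's JAXP parser instead of JTidy. The class and method names
(ProcessingInstructionSketch, parseXml, addProcessingInstruction, serialize)
are hypothetical. Note that createProcessingInstruction only creates the node;
it still has to be inserted into the tree.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.ProcessingInstruction;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of JTidy or JDOM.
class ProcessingInstructionSketch {

    // Parse well-formed XHTML text into a generic XML DOM (not an HTML DOM).
    static Document parseXml(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Create the processing instruction and attach it before the root element.
    static void addProcessingInstruction(Document doc, String target, String data) {
        ProcessingInstruction pi = doc.createProcessingInstruction(target, data);
        doc.insertBefore(pi, doc.getDocumentElement());
    }

    // Serialize the DOM back to text, forcing XML output syntax.
    static String serialize(Document doc) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Without this, a root element named "html" can select HTML output.
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Calling parseXml, then addProcessingInstruction(doc, "foo", "foo"), then
serialize corresponds to the "parse, add, serialize" round trip.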



OK, I think I have to use a workaround.

I first serialize this HTML DOM, parse it back in, add the processing
instruction, and serialize it again. That does the job, but if the HTML
file contains the following DOCTYPE declaration

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">

and I then try to parse the tidied XHTML file, I get the following
exception (setDocType("strict"); does nothing):

org.jdom.JDOMException: Error on line 1 of document
White space is required between the public identifier and the system identifier.

 at org.jdom.input.SAXBuilder.build(SAXBuilder.java:403)
 at org.jdom.input.SAXBuilder.build(SAXBuilder.java:464)
 at org.jdom.input.SAXBuilder.build(SAXBuilder.java:445)
Root cause: org.xml.sax.SAXParseException: White space is required between the public identifier and the system identifier.
 at org.apache.xerces.framework.XMLParser.reportError(XMLParser.java:1008)
 at org.apache.xerces.framework.XMLDTDScanner.reportFatalXMLError(XMLDTDScanner.java:645)
 at org.apache.xerces.framework.XMLDTDScanner.scanExternalID(XMLDTDScanner.java:1190)
 at org.apache.xerces.framework.XMLDTDScanner.scanDoctypeDecl(XMLDTDScanner.java:1098)
 at org.apache.xerces.framework.XMLDocumentScanner.scanDoctypeDecl(XMLDocumentScanner.java:2177)
 at org.apache.xerces.framework.XMLDocumentScanner.access$0(XMLDocumentScanner.java:2133)
 at org.apache.xerces.framework.XMLDocumentScanner$XMLDeclDispatcher.dispatch(XMLDocumentScanner.java:775)
 at org.apache.xerces.framework.XMLDocumentScanner.parseSome(XMLDocumentScanner.java:380)
 at org.apache.xerces.framework.XMLParser.parse(XMLParser.java:900)
 at org.jdom.input.SAXBuilder.build(SAXBuilder.java:395)
 at org.jdom.input.SAXBuilder.build(SAXBuilder.java:464)
 at org.jdom.input.SAXBuilder.build(SAXBuilder.java:445)
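
A likely reading of this error: XML 1.0 requires a PUBLIC external identifier
to be followed by a system identifier, whereas the HTML 4.0 DOCTYPE above has
only the public identifier, so an XML parser rejects it. Below is a minimal
sketch of one possible workaround, assuming the tidied output is available as
a string and still starts with the HTML DOCTYPE. It swaps in the XHTML 1.0
Transitional DOCTYPE before parsing. It uses the JDK's JAXP parser rather than
JDOM's SAXBuilder to stay self-contained, and the names DoctypeSketch,
replaceDoctype and parseWithoutFetchingDtd are hypothetical.

```java
import java.io.StringReader;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of JTidy or JDOM.
class DoctypeSketch {

    // An XHTML DOCTYPE that carries both a public and a system identifier.
    static final String XHTML_DOCTYPE =
        "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Transitional//EN\" "
        + "\"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd\">";

    // Matches a DOCTYPE declaration; assumes it has no internal subset.
    static final Pattern DOCTYPE =
        Pattern.compile("<!DOCTYPE[^>]*>", Pattern.CASE_INSENSITIVE);

    // Replace the first DOCTYPE declaration with the XHTML one.
    static String replaceDoctype(String markup) {
        return DOCTYPE.matcher(markup)
                      .replaceFirst(Matcher.quoteReplacement(XHTML_DOCTYPE));
    }

    // Parse without downloading the DTD: every external entity resolves to
    // an empty stream. Entities declared only in the DTD (such as &nbsp;)
    // are therefore undefined and would still cause a parse error.
    static Document parseWithoutFetchingDtd(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(
            (publicId, systemId) -> new InputSource(new StringReader("")));
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

The empty entity resolver is a design choice to avoid a network fetch of the
DTD on every parse; a local copy of the DTD could be returned instead if the
named character entities are needed.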

Can anybody help me?

Thanks, Holger

Received on Monday, 28 May 2001 04:21:49 UTC