tidy strips FPI from DOCTYPE when pretty-printing XML

Hi,

I use tidy to pretty-print DocBook documents. The command line is:

tidy -xml -i file.xml

Before using tidy the DOCTYPE looks like this:

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">

After running tidy, the public identifier is gone and only the system
identifier remains:

<!DOCTYPE 
book SYSTEM "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">

This is tidy compiled from source as of today.

I think the problem is in lexer.c/NewDocTypeNode():

    Node* html = FindHTML( doc );

    if ( !html )
        return NULL;

If there is no <html> element, because the document is XML rather than
HTML, processing stops here. However, this was only a quick look, so I'm
not sure about that.

This should be fixed so that both the public and the system identifier
are kept as they are when tidying XML documents.
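
To illustrate the output I would expect, here is a rough, standalone
sketch. It is not tidy code: format_doctype() and its parameters are
made up for this example, and it only shows the rule "emit PUBLIC when
an FPI is present, otherwise fall back to SYSTEM".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT part of tidy: writes a DOCTYPE declaration
 * into buf and keeps the public identifier (FPI) when one is given.
 * Returns the value of snprintf(). */
int format_doctype(char *buf, size_t size, const char *root,
                   const char *fpi, const char *sysid)
{
    assert(buf != NULL && root != NULL);

    /* XML requires a system identifier after PUBLIC, so the FPI is
     * only written when both identifiers are available. */
    if (fpi && sysid)
        return snprintf(buf, size, "<!DOCTYPE %s PUBLIC \"%s\"\n    \"%s\">",
                        root, fpi, sysid);

    /* No FPI: a SYSTEM declaration is all that can be written. */
    if (sysid)
        return snprintf(buf, size, "<!DOCTYPE %s SYSTEM \"%s\">",
                        root, sysid);

    /* Neither identifier: bare declaration naming the root element. */
    return snprintf(buf, size, "<!DOCTYPE %s>", root);
}
```

Called with "book" and the two DocBook identifiers from above, this
writes a PUBLIC declaration containing both of them, rather than the
SYSTEM-only form that tidy currently produces.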

Bye, Chris

Received on Wednesday, 3 March 2004 04:27:48 UTC