tidy strips fpi from DOCTYPE when pretty-printing xml

From: Christian Hattemer <chris@heaven.riednet.wh.tu-darmstadt.de>
Date: Tue, 2 Mar 2004 18:27:56 +0100
To: html-tidy@w3.org
Message-ID: <20040302172756.GA27988@mail.riednet.wh.tu-darmstadt.de>

Hi,

I use tidy to pretty-print DocBook documents. The command line is:

tidy -xml -i file.xml

Before running tidy, the DOCTYPE looks like this:

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">

Afterwards it is:

<!DOCTYPE 
book SYSTEM "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd">

This is tidy compiled from source as of today.

I think the problem is in lexer.c/NewDocTypeNode():

    Node* html = FindHTML( doc );

    if ( !html )
        return NULL;

If there is no <html> tag, because the document is XML rather than HTML,
processing stops here. However, I only had a quick look at the code, so
I'm not sure about that.
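
To make the suspected control flow concrete, here is a toy model of it. It
is only a sketch of the hypothesis above, not tidy's actual code: ToyNode,
ToyDoctype, toy_find and both toy_doctype_* functions are made-up names,
and the fallback to the first top-level element is just one conceivable
way to handle XML input.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-ins for illustration only; these are NOT tidy's real types. */
typedef struct ToyNode {
    const char *element;      /* element name, e.g. "html" or "book" */
    struct ToyNode *next;     /* next top-level node, or NULL */
} ToyNode;

typedef struct {
    const char *root_name;    /* element the DOCTYPE would be built for */
} ToyDoctype;

/* Return the first top-level node with the given element name, or NULL. */
static ToyNode *toy_find(ToyNode *top, const char *name)
{
    for (; top != NULL; top = top->next)
        if (strcmp(top->element, name) == 0)
            return top;
    return NULL;
}

/* Models the suspected behaviour: give up when there is no <html>. */
int toy_doctype_html_only(ToyNode *top, ToyDoctype *out)
{
    ToyNode *html = toy_find(top, "html");

    assert(out != NULL);
    if (!html)
        return 0;             /* a DocBook <book> document ends up here */
    out->root_name = html->element;
    return 1;
}

/* Hypothetical alternative: in XML mode, use the first top-level element. */
int toy_doctype_any_root(ToyNode *top, int xml_mode, ToyDoctype *out)
{
    ToyNode *root = toy_find(top, "html");

    assert(out != NULL);
    if (!root && xml_mode)
        root = top;
    if (!root)
        return 0;
    out->root_name = root->element;
    return 1;
}
```

With a single top-level "book" node, toy_doctype_html_only() returns 0,
while toy_doctype_any_root() in XML mode returns 1 and reports "book".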

Tidy should keep both the public and the system identifier unchanged
when tidying XML documents.
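
The expected behaviour could be checked with a small standalone helper
along these lines. This is a minimal sketch and not part of tidy:
DoctypeInfo, parse_doctype and doctype_preserved are names invented for
the example, and the parser only understands simple declarations with
quoted identifiers and no internal subset.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper type: the parts of a DOCTYPE declaration. */
typedef struct {
    char name[64];
    char public_id[256];   /* empty when there is no public identifier */
    char system_id[256];   /* empty when there is no system identifier */
} DoctypeInfo;

static const char *skip_ws(const char *p)
{
    while (*p && isspace((unsigned char)*p))
        ++p;
    return p;
}

/* Copy a quoted literal starting at p into out; return the position
   after the closing quote, or NULL if it is malformed or too long. */
static const char *read_literal(const char *p, char *out, size_t outsz)
{
    char quote = *p;
    const char *end;
    size_t len;

    if (quote != '"' && quote != '\'')
        return NULL;
    end = strchr(p + 1, quote);
    if (!end)
        return NULL;
    len = (size_t)(end - (p + 1));
    if (len >= outsz)
        return NULL;
    memcpy(out, p + 1, len);
    out[len] = '\0';
    return end + 1;
}

/* Returns 1 on success, 0 if decl is not a DOCTYPE this sketch handles. */
int parse_doctype(const char *decl, DoctypeInfo *info)
{
    const char *p;
    size_t n = 0;

    assert(decl != NULL && info != NULL);
    memset(info, 0, sizeof *info);
    p = skip_ws(decl);
    if (strncmp(p, "<!DOCTYPE", 9) != 0)
        return 0;
    p = skip_ws(p + 9);
    while (*p && !isspace((unsigned char)*p) && *p != '>' && *p != '[') {
        if (n + 1 >= sizeof info->name)
            return 0;
        info->name[n++] = *p++;
    }
    if (n == 0)
        return 0;
    p = skip_ws(p);
    if (strncmp(p, "PUBLIC", 6) == 0) {
        p = read_literal(skip_ws(p + 6), info->public_id,
                         sizeof info->public_id);
        if (!p)
            return 0;
        p = read_literal(skip_ws(p), info->system_id,
                         sizeof info->system_id);
        return p != NULL;
    }
    if (strncmp(p, "SYSTEM", 6) == 0) {
        p = read_literal(skip_ws(p + 6), info->system_id,
                         sizeof info->system_id);
        return p != NULL;
    }
    return 1;   /* no external identifier at all */
}

/* Returns 1 when both declarations parse and carry the same root name,
   public identifier and system identifier. */
int doctype_preserved(const char *before, const char *after)
{
    DoctypeInfo a, b;

    if (!parse_doctype(before, &a) || !parse_doctype(after, &b))
        return 0;
    return strcmp(a.name, b.name) == 0
        && strcmp(a.public_id, b.public_id) == 0
        && strcmp(a.system_id, b.system_id) == 0;
}
```

Passing the two declarations shown earlier in this message to
doctype_preserved() reports a mismatch, because the public identifier is
present in the first and empty in the second.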

Bye, Chris
Received on Wednesday, 3 March 2004 04:27:48 UTC
