
Re: JTidy - Beginner's question

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Thu, 10 Oct 2002 00:53:58 +0200
To: "Christian Peter" <cpeter@rostock.igd.fhg.de>
Cc: html-tidy@w3.org
Message-ID: <3db7b286.7917905@smtp.bjoern.hoehrmann.de>

* Christian Peter wrote:
>And here's the things confusing me:
>
>First, the generated files start with
>
>   <html>
>   <head>
>   <meta name="generator" content="HTML Tidy, see www.w3.org" />
>
>rather than with
>
>   <?xml version="1.0" encoding="us-ascii"?>
>   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
>
>Why's that? This looks to me as if the output isn't set to XML at 
>all.

Sure it is, but JTidy has for some reason omitted the document type
declaration and has not inserted an XML declaration. You can enforce
both using the appropriate configuration options. Does the document
you tried to tidy contain proprietary markup?
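
As a rough sketch of what such options might look like, here is a
hypothetical configuration file using option names from HTML Tidy. It is
an assumption that your JTidy build accepts the same names and values,
so check them against its documentation:

```
# hypothetical tidy.cfg; option names are HTML Tidy's, assumed to work in JTidy
output-xhtml: yes
add-xml-decl: yes
doctype: loose
```

In HTML Tidy, `doctype: loose` selects the transitional DTD; older
releases spell the second option `add-xml-pi`.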

>Second, with quite a lot of sites (e.g. www.nasa.gov) I get a 
>parsing error when reading the generated file (with IE or Netscape):
>
>   XML Parsing Error: undefined entity
>   Location: file:///C:/prog/3DWS/JTidy/files/www.nasa.gov.xml
>   Line Number 208, Column 22:size="2">NASA en
>   Espa&ntilde;ol</font></a></td>
>
>Question: which settings are necessary to get this handled properly?

Let Tidy output either numeric character references or a document type
declaration pointing at a DTD that defines those entities.
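
To illustrate the first alternative, here is a minimal Java sketch of
what replacing named entities with numeric character references amounts
to. It is not how Tidy or JTidy implements it; the class name, the
method name and the tiny entity table are made up for this example, and
a real table would cover the whole XHTML entity set.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: NumericEntities and toNumeric are hypothetical names,
// not part of JTidy.
class NumericEntities {
    // Small subset of the XHTML 1.0 Latin-1 entities (name -> code point).
    private static final Map<String, Integer> NAMED = Map.of(
        "ntilde", 0xF1,
        "eacute", 0xE9,
        "nbsp", 0xA0,
        "copy", 0xA9);

    private static final Pattern ENTITY =
        Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    // Replaces known named entities with decimal character references.
    // Names missing from the table (including the XML built-ins such as
    // &amp; and &lt;) are left untouched.
    static String toNumeric(String markup) {
        Matcher m = ENTITY.matcher(markup);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            Integer codePoint = NAMED.get(m.group(1));
            String replacement =
                (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "NASA en Espa&#241;ol"
        System.out.println(toNumeric("NASA en Espa&ntilde;ol"));
    }
}
```

An XML parser accepts `&#241;` without any DTD, which is why this form
avoids the "undefined entity" error quoted above.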

Btw., there is a JTidy forum at

  http://sourceforge.net/forum/forum.php?forum_id=41436

where you are more likely to find people who can help you. Strictly
speaking, this mailing list is only for discussion of the C version
of the command-line tool.

Received on Wednesday, 9 October 2002 18:53:24 UTC
