Re: pb with entities FOUND IT

From: Christophe Strobbe <christophe.strobbe@esat.kuleuven.ac.be>
Date: Wed, 04 Dec 2002 17:35:49 +0100
Message-Id: <>
To: Riccardo Cohen <rcohen@dial.oleane.com>, html-tidy@w3.org

Hi Riccardo,

At 18:15 4/12/2002, Riccardo Cohen wrote:

>by the way, is this a bug in tidy, or is it normal that without a doctype, 
>&eacute; can't
>be generated? (from my point of view this behavior is not normal, but I 
>don't know the standards very well)

In HTML, the DOCTYPE is actually required. The HTML 4.01 specification says:
"A valid HTML document declares what version of HTML is used in the 
document. The document type declaration names the document type definition 
(DTD) in use for the document (see [ISO8879])."
The DTD (whether strict, transitional, or frameset) contains a reference to 
"HTMLlat1.ent", which defines the ISO Latin 1 character entities, so 
without the DOCTYPE, you can't always expect an HTML parser to know about 
entities such as &eacute;.

This may not be an explanation of the behaviour of Tidy, but it's relevant 
to parsing generally.
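
As a rough illustration of that general parsing point (and not of what Tidy does internally), here is a minimal sketch using Python's standard XML parser. It is an analogy under stated assumptions: the parser is XML rather than SGML/HTML, the entity is declared inline in the DOCTYPE instead of being pulled in from "HTMLlat1.ent", and the names parse_text, WITHOUT_DTD and WITH_DTD are made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents, not taken from the original thread.
# No DOCTYPE: the parser has no declaration for the "eacute" entity.
WITHOUT_DTD = "<p>caf&eacute;</p>"

# A DOCTYPE whose internal subset declares the entity inline. This stands in
# for the DTD's reference to "HTMLlat1.ent", which declares the full set.
WITH_DTD = '<!DOCTYPE p [<!ENTITY eacute "&#233;">]><p>caf&eacute;</p>'


def parse_text(markup):
    """Return the text of the root element, or None if parsing fails."""
    try:
        return ET.fromstring(markup).text
    except ET.ParseError:
        # An entity reference with no declaration is a parse error here.
        return None


if __name__ == "__main__":
    print(repr(parse_text(WITHOUT_DTD)))
    print(repr(parse_text(WITH_DTD)))
```

The first document should fail to parse because nothing tells the parser what &eacute; means, while the second should resolve it through the declaration carried by the DOCTYPE. Real HTML parsers are usually more lenient and ship with the entity tables built in, so this only shows where the definition formally comes from.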


Received on Wednesday, 4 December 2002 11:35:32 UTC