- From: Christophe Strobbe <christophe.strobbe@esat.kuleuven.ac.be>
- Date: Wed, 04 Dec 2002 17:35:49 +0100
- To: Riccardo Cohen <rcohen@dial.oleane.com>, html-tidy@w3.org
Hi Riccardo,

At 18:15 4/12/2002, Riccardo Cohen wrote:
>by the way, is this a bug in tidy, or is it normal that without doctype,
>é cant be generated ? (from my point of view this behavior is not normal,
>but I dont know very well standards)
>Thanks

In HTML, the DOCTYPE is actually required; see
http://www.w3.org/TR/html401/struct/global.html#idx-document_type_declaration-3:
"A valid HTML document declares what version of HTML is used in the document. The document type declaration names the document type definition (DTD) in use for the document (see [ISO8879])."

The DTD (whether strict, transitional, or frameset) contains a reference to "HTMLlat1.ent", which contains the ISO Latin 1 character entities, so without the DOCTYPE, you can't always expect an HTML parser to know about entities. This may not be an explanation of the behaviour of Tidy, but it's relevant to parsing generally.

Regards,
Christophe
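
To illustrate the point about entity tables, here is a minimal Python sketch. It is not how Tidy works internally; it only shows that a named entity such as "eacute" can be resolved only when the parser has the corresponding table loaded. The function name `resolve_entity` and the reduced table `without_latin1` are assumptions made up for this example; `html.entities.name2codepoint` is Python's standard table of HTML 4 entity names.

```python
from html.entities import name2codepoint


def resolve_entity(name, table):
    """Return the character for a named entity, or None if the table lacks it."""
    codepoint = table.get(name)
    return chr(codepoint) if codepoint is not None else None


# Python's standard table covers the HTML 4 entity sets, Latin 1 included.
with_latin1 = name2codepoint

# Hypothetical parser state: the Latin 1 entity set was never loaded,
# so only a few markup-significant entities are known (an assumption).
without_latin1 = {"lt": 60, "gt": 62, "amp": 38, "quot": 34}

print(resolve_entity("eacute", with_latin1))     # é
print(resolve_entity("eacute", without_latin1))  # None
```

With the full table the name resolves to U+00E9; with the reduced table the lookup fails, which is the situation a parser is in when it has no DTD to tell it where the Latin 1 entities are defined.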
Received on Wednesday, 4 December 2002 11:35:32 UTC