Re: pb with entities FOUND IT

Hi Riccardo,


At 18:15 4/12/2002, Riccardo Cohen wrote:

>By the way, is this a bug in Tidy, or is it normal that without a DOCTYPE,
>é can't
>be generated? (From my point of view this behavior is not normal, but I
>don't know the standards very well.)
>Thanks

In HTML 4.01, the DOCTYPE is actually required for a document to be valid. See
http://www.w3.org/TR/html401/struct/global.html#idx-document_type_declaration-3
which says:
"A valid HTML document declares what version of HTML is used in the
document. The document type declaration names the document type definition
(DTD) in use for the document (see [ISO8879])."
Each DTD (whether strict, transitional, or frameset) contains a reference to
"HTMLlat1.ent", which defines the ISO Latin 1 character entities such as
&eacute;. Without the DOCTYPE, a parser that relies on the DTD has no
definition for those entities, so you can't always expect an HTML parser
to know about them.
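
To illustrate the point, here is a minimal Python sketch of an entity lookup
that depends on a table being supplied. It is not how Tidy works internally;
the function name expand_entities and the LATIN1_ENTITIES table are made up
for this example, and the table is only a stand-in for what HTMLlat1.ent
would define (built here from Python's standard html.entities module).

```python
import re
from html.entities import name2codepoint

# Stand-in (an assumption for this sketch) for the entity set that a DTD
# pulls in through HTMLlat1.ent: the named entities for U+00A0..U+00FF.
LATIN1_ENTITIES = {
    name: cp for name, cp in name2codepoint.items() if 160 <= cp <= 255
}

# Matches named references such as &eacute;
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def expand_entities(text, entity_table):
    """Replace named references found in entity_table; leave others as-is."""
    def replace(match):
        name = match.group(1)
        if name in entity_table:
            return chr(entity_table[name])
        # No definition available: the reference cannot be resolved.
        return match.group(0)
    return ENTITY_REF.sub(replace, text)


# With the entity set loaded (as if a DOCTYPE had been declared).
with_dtd = expand_entities("caf&eacute;", LATIN1_ENTITIES)

# With no entity set at all (as if no DOCTYPE had been declared).
without_dtd = expand_entities("caf&eacute;", {})
```

With the table, &eacute; is resolved to the character é; with an empty table
the reference is left untouched, which is roughly the situation of a
DTD-driven parser that never saw a DOCTYPE.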

This may not be an explanation of the behaviour of Tidy, but it's relevant 
to parsing generally.

Regards,

Christophe

Received on Wednesday, 4 December 2002 11:35:32 UTC