Re: Invalid escapes... [Was: Re: Error in HTML Tidy Beta

* Larry W. Virden wrote:
>Does anyone know of a technical document that might discuss the appropriate
>behavior by a program parsing html that indicates appropriate alternatives
>for handling invalid escapes?  For instance, if a program hits the html
>string
><A HREF="http://www.somestory.com/story1.html">hit&run accident</a>
>
>what are the recommended (or perhaps required) behaviors in interpreting
>&run?

"&run " is canonically equivalent to "&run; " in SGML with the HTML 4
SGML declaration, thus it would be treated as entity reference.

>  Some applications seem to leave things alone, some delete the invalid
>escapes, and some replace the escape with an 'error' character...  Are all
>these 'correct' behaviors?

Section B.1 of HTML 4 recommends "If it encounters an undeclared entity,
the entity should be treated as character data.", i.e. rendered as the
string "&run ". In general it is an error in the document and HTML 4
does not define how to deal with general error conditions.
-- 
Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/

Received on Wednesday, 17 October 2001 08:56:59 UTC