- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Wed, 17 Oct 2001 14:55:52 +0200
- To: "Larry W. Virden" <lvirden@cas.org>
- Cc: <html-tidy@w3.org>
* Larry W. Virden wrote:
>Does anyone know of a technical document that might discuss the appropriate
>behavior by a program parsing html that indicates appropriate alternatives
>for handling invalid escapes? For instance, if a program hits the html
>string
><A HREF="http://www.somestory.com/story1.html">hit&run accident</a>
>
>what are the recommended (or perhaps required) behaviors in interpreting
>&run?

"&run " is canonically equivalent to "&run; " in SGML with the HTML 4 SGML
declaration, so it is treated as an entity reference.

>Some applications seem to leave things alone, some delete the invalid
>escapes, and some replace the escape with an 'error' character... Are all
>these 'correct' behaviors?

Section B.1 of HTML 4 recommends "If it encounters an undeclared entity,
the entity should be treated as character data.", i.e. rendered as the
string "&run ". In general, an undeclared entity reference is an error in
the document, and HTML 4 does not define how to deal with general error
conditions.
-- 
Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/
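
As an illustration of the Section B.1 recommendation quoted above, here is a
minimal sketch in Python. It is not taken from any particular parser; the
function name expand_entities and the simplified name pattern are assumptions
made for this example. Declared HTML 4 entities are expanded, with or without
the trailing ";", and undeclared ones such as "&run" are passed through as
character data.

```python
import re
from html.entities import name2codepoint  # names of the HTML 4.01 entities

# "&", an entity name, and an optional ";". This is a simplified pattern
# (an assumption of this sketch), not a full SGML tokenizer.
_ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9._:-]*);?")


def expand_entities(text):
    """Expand declared entities; leave undeclared ones as character data."""

    def replace(match):
        name = match.group(1)
        if name in name2codepoint:
            # Declared entity: substitute the character it stands for.
            return chr(name2codepoint[name])
        # Undeclared entity: keep the original text unchanged.
        return match.group(0)

    return _ENTITY_REF.sub(replace, text)


print(expand_entities("hit&run accident"))
print(expand_entities("AT&amp;T"))
```

With this approach "hit&run accident" comes out unchanged, which corresponds
to the "leave things alone" behavior mentioned in the question. Deleting the
reference or substituting an error character would be different policies.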
Received on Wednesday, 17 October 2001 08:56:59 UTC