
HTML parser and incomplete entities

From: Romain Vignes <rvignes@cal.fr>
Date: 13 Feb 1996 16:38:49 +0200
Message-Id: <n1387892929.70645@Roms>
To: "W3C Lib Mailing List" <www-lib@w3.org>
I am using an old version of the W3C lib for parsing HTML files, and
I have discovered that the SGML parser is unable to handle entities that
are not terminated with ';'.

For example:

	d&eacute;ja	->	déja

	d&eacuteja	->	d&eacuteja

Unfortunately, most of the HTML source I am using is written for Netscape,
and the entities are not terminated with ';'.
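
To make the behaviour I am after concrete, here is a rough sketch in
Python (not the W3C lib's C code, and not how its SGML parser is
actually structured). The names ENTITIES and decode_lenient are
hypothetical, and the table is only a small illustrative subset of the
Latin-1 entity set. The idea is to accept the longest known entity name
after '&', whether or not a ';' follows:

```python
import re

# Illustrative subset only; a real entity table is much larger.
ENTITIES = {
    "eacute": "é",
    "egrave": "è",
    "agrave": "à",
    "amp": "&",
    "lt": "<",
    "gt": ">",
}

# '&', a run of letters, then an optional terminating ';'.
_ENTITY_RE = re.compile(r"&([A-Za-z]+);?")


def decode_lenient(text):
    """Decode entity references, treating the trailing ';' as optional."""

    def replace(match):
        name = match.group(1)
        # Try the longest prefix of the letter run that is a known entity.
        for end in range(len(name), 0, -1):
            char = ENTITIES.get(name[:end])
            if char is None:
                continue
            if end == len(name):
                # Whole name matched: swallow the optional ';' as well.
                return char
            # Partial match: keep the leftover letters (and any ';') as text.
            return char + match.group(0)[1 + end:]
        # No known entity name found: leave the input untouched.
        return match.group(0)

    return _ENTITY_RE.sub(replace, text)


print(decode_lenient("d&eacute;ja"))  # déja
print(decode_lenient("d&eacuteja"))   # déja
print(decode_lenient("AT&T"))         # AT&T
```

The obvious risk with this approach is over-matching: any word that
happens to start with an entity name after a bare '&' would be decoded
too, so it probably belongs behind an option rather than being the
default.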

Is there a newer version of the W3C HTML parser that removes this
limitation, or has somebody modified the parser sources to handle it?

Thanks in advance!

--
______________________________________________________________________
Romain Vignes                                     Computer Answer Line
Macintosh Software Engineer                           92, cours Vitton
E-mail: rvignes@cal.fr                             69006 LYON - FRANCE
Tel: +33 72 83 10 18                                 http://www.cal.fr
______________________________________________________________________
Received on Tuesday, 13 February 1996 10:39:42 GMT
