HTML parser and incomplete entities from Romain Vignes on 1996-02-13 (www-lib@w3.org from January to March 1996)

From: Romain Vignes <rvignes@cal.fr>
Date: 13 Feb 1996 16:38:49 +0200
To: "W3C Lib Mailing List" <www-lib@w3.org>
Message-Id: <n1387892929.70645@Roms>

I am using an old version of the W3C lib for parsing HTML files, and
I have discovered that the SGML parser is unable to handle entities that
are not terminated with ';'.

For example:

	d&eacute;ja	->	déja

	d&eacuteja	->	d&eacuteja

Unfortunately, most of the HTML source I am working with was written for
Netscape, and its entities are not terminated with ';'.

Is there a newer version of the W3C HTML parser that removes this
limitation, or has somebody modified the parser sources to handle it?
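
The lenient behaviour being asked for can be sketched as follows. This is
only a minimal illustration in Python, not W3C lib code: the table
ENTITIES (a small subset of the entity names) and the function
decode_lenient are hypothetical names chosen for the example. The idea is
to accept an optional ';' and, when the run of letters after '&' is not
itself a known entity, fall back to its longest known prefix.

```python
import re

# Small, illustrative subset of HTML entity names (assumption: a real
# parser would carry the full ISO Latin-1 table).
ENTITIES = {
    "amp": "&",
    "lt": "<",
    "gt": ">",
    "eacute": "\u00e9",
    "egrave": "\u00e8",
    "agrave": "\u00e0",
    "ccedil": "\u00e7",
}

# '&', then a run of letters/digits, then an optional terminating ';'.
ENTITY_RE = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def decode_lenient(text):
    """Expand known entities whether or not they end with ';'."""

    def replace(match):
        name, semicolon = match.group(1), match.group(2)
        # Whole run is a known name: "&eacute;" or a bare "&eacute".
        if name in ENTITIES:
            return ENTITIES[name]
        # Otherwise try the longest known prefix of the run, so that
        # "eacuteja" is read as "eacute" followed by the text "ja".
        for end in range(len(name) - 1, 0, -1):
            if name[:end] in ENTITIES:
                return ENTITIES[name[:end]] + name[end:] + semicolon
        # Unknown name: leave the input untouched.
        return match.group(0)

    return ENTITY_RE.sub(replace, text)


if __name__ == "__main__":
    print(decode_lenient("d&eacute;ja"))
    print(decode_lenient("d&eacuteja"))
```

With this sketch both forms of the example decode to the same text, and
unknown names or a bare '&' (as in "AT&T") pass through unchanged. The
prefix rule is a trade-off: it can misread ordinary text that happens to
start with an entity name after '&', which is presumably why a strict
SGML parser insists on a terminator.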

Thanks in advance!

--
______________________________________________________________________
Romain Vignes                                     Computer Answer Line
Macintosh Software Engineer                           92, cours Vitton
E-mail: rvignes@cal.fr                             69006 LYON - FRANCE
Tel: +33 72 83 10 18                                 http://www.cal.fr
______________________________________________________________________

Received on Tuesday, 13 February 1996 10:39:42 UTC