
SGML/HTML Lexical analyzer update



First, thanks for all the great feedback on the sgml-lex report and
code. I am happy to announce this release, which incorporates much of
it. Stay tuned to

	http://www.w3.org/pub/WWW/MarkUp/SGML/#sgml-lex

for details (including the tech report and source distribution).

The relevant excerpt is attached.

Recent changes include:


revision 1.8
date: 1996/02/07 15:32:31;  author: connolly;  state: Exp;  lines: +25 -14
* SGML_lexCase -> SGML_lexNorm, which covers whitespace etc.
	as well as case conversion. This allows pass-thru filtering.

	This involved changing the way whitespace is handled in the lexer.

	Also, tag close tokens (>) are explicitly reported.

	sgml_lex -c becomes sgml_lex -n

	@@ problem remaining: erroneous markup is reported out
		of order

* added filter test

* Fixed a bug in main.c reported in:
	From: Joris Roling <joris@altair.nl>
	To: "'Connolly, Dan'" <connolly@w3.org>
	Subject: Remarks on 'A Lexical Analyzer for HTML and Basic SGML'
	Date: Fri, 19 Jan 96 14:16:00 CET
	Message-Id: <30FF9AC4@msmsmtp>

* fixed lex spec bug reported in:

	Message-Id: <v01530502ad25cc1a251b@[206.86.76.80]>
	To: www-html@w3.org
	From: chris@walkaboutsoft.com (Chris Lovett)
	Subject: Re: Daniel Connolly's SGML Lex Specification

* fixed memory leak reported in:

	Message-Id: <01BAEB69.31095AA0@cadc140.cadvision.com>
	From: Simon Watfa <simonw@quadrus.com>
	To: "'www-html@w3.org'" <www-html@w3.org>
	Subject: sgml-lex
	Date: Thu, 25 Jan 1996 21:01:28 -0700


The one remaining major bug is that malloc() returning NULL is
treated as a fatal error (i.e. abort() is called).
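
One way to avoid the abort() is to have allocation failures propagate
to the caller as an error return. The following is only a sketch of
that idea: the Buf type and buf_append() are hypothetical names made up
for illustration, not the lexer's actual data structures or API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer; NOT the lexer's actual data structure. */
typedef struct {
    char  *data;
    size_t len;   /* bytes in use; always <= cap */
    size_t cap;   /* bytes allocated */
} Buf;

/* Append n bytes from src.
 * Returns 0 on success, -1 if memory could not be obtained.
 * On failure the buffer is left unchanged, so the caller can
 * report the error and clean up instead of aborting. */
int buf_append(Buf *b, const char *src, size_t n)
{
    if (n == 0)
        return 0;

    if (n > b->cap - b->len) {
        size_t newcap = b->cap ? b->cap : 64;
        char *p;

        while (newcap - b->len < n) {
            if (newcap > (size_t)-1 / 2)
                return -1;              /* doubling would overflow */
            newcap *= 2;
        }
        p = realloc(b->data, newcap);
        if (p == NULL)
            return -1;                  /* old block is still valid */
        b->data = p;
        b->cap = newcap;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return 0;
}
```

A caller would start from a zeroed Buf, check each return value, and
free(b.data) when done.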

The python support is still spotty. In fact, I haven't really tested
the python module this time.

A number of higher-level APIs are needed, and to some extent planned.
The first is simply a function to reduce an attribute value literal such as:

	"abc&#65;&quot;def"

to its value:

	abcA"def
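
A minimal sketch of such a reduction follows. It is not part of the
distribution: attr_value_reduce() is a hypothetical name, and the
sketch makes simplifying assumptions. It handles only decimal character
references below 256 and the entities quot, amp, lt and gt, requires
the terminating semicolon, and copies anything it does not recognize
through unchanged.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the sgml-lex distribution.
 * Reduces a quoted attribute value literal to its value.
 * Returns 0 on success, -1 if the literal is not delimited by
 * matching quotes or the output buffer is too small. */
int attr_value_reduce(const char *lit, char *out, size_t outsize)
{
    /* Assumed entity set for this sketch only. */
    static const struct { const char *name; char ch; } ents[] = {
        { "quot", '"' }, { "amp", '&' }, { "lt", '<' }, { "gt", '>' }
    };
    size_t len = strlen(lit);
    size_t i, o = 0;

    if (outsize == 0)
        return -1;
    if (len < 2 || (lit[0] != '"' && lit[0] != '\'') || lit[len - 1] != lit[0])
        return -1;

    for (i = 1; i < len - 1; ) {
        char c = lit[i];
        size_t used = 1;    /* input characters consumed this step */

        if (c == '&' && lit[i + 1] == '#' && isdigit((unsigned char)lit[i + 2])) {
            /* Decimal character reference, e.g. &#65; */
            size_t j = i + 2;
            unsigned long code = 0;

            while (isdigit((unsigned char)lit[j]) && code < 256) {
                code = code * 10 + (unsigned long)(lit[j] - '0');
                j++;
            }
            if (code > 0 && code < 256 && lit[j] == ';') {
                c = (char)code;
                used = j + 1 - i;
            }
        } else if (c == '&') {
            /* Named entity reference, e.g. &quot; */
            size_t k;

            for (k = 0; k < sizeof ents / sizeof ents[0]; k++) {
                size_t n = strlen(ents[k].name);

                if (strncmp(lit + i + 1, ents[k].name, n) == 0
                    && lit[i + 1 + n] == ';') {
                    c = ents[k].ch;
                    used = n + 2;
                    break;
                }
            }
        }
        if (o + 1 >= outsize)
            return -1;      /* no room for this character plus the NUL */
        out[o++] = c;
        i += used;
    }
    out[o] = '\0';
    return 0;
}
```

The reduced value is never longer than the literal, so an output buffer
the size of the literal is always sufficient.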



SGML and the Web

A Lexical Analyzer for HTML and Basic SGML
W3C Tech Report on SGML low-level parsing details. Includes flex spec, test file, and source distribution:
-rw-rw-r--   1 connolly 69          50650 Feb  7 11:59 sgml-lex-19960207.tar.gz
-rw-rw-r--   1 connolly 69          57182 Feb  7 12:00 sgml-lex-19960207.zip
21f7b70ec7135531bc84fd4c5e3cdf3d  sgml-lex-19960207.tar.gz (pgp sig)
083e21759d223b1005402120cdbf8169  sgml-lex-19960207.zip (pgp sig)