Re: XHTML character entity support from Henri Sivonen on 2009-11-13 (public-html@w3.org from November 2009)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Fri, 13 Nov 2009 14:03:14 +0200
To: James Graham <jgraham@opera.com>
Cc: John Cowan <cowan@ccil.org>, HTML WG <public-html@w3.org>
Message-Id: <7724A3D9-4265-401A-96EC-EA6687E09B10@iki.fi>

On Nov 13, 2009, at 14:00, James Graham wrote:

> John Cowan wrote:
>> James Graham scripsit:
>>> Note that Anne did some work in this area already:
>> That's interesting, although a little crude: some people at Extreme Markup
>> some years back presented a much cleverer algorithm for schemaless tag
>> recovery, given a tree to work with.  Unfortunately, the archives seem to be
>> offline.
> 
> I would be interested in seeing that, if you can dig up some kind of reference.
> 
> Note that a requirement is that the algorithm must not need lookahead; it must be possible to implement an incremental, error-handling parser.

I had assumed that implementability as a truly streaming SAX parser was also an implicit requirement. (Hence, "given a tree to work with" would be unacceptable.)
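
To make the constraint concrete, here is a minimal sketch of tag recovery that decides using only the stack of open elements and the current event, so it never needs lookahead or a finished tree. It is purely illustrative: it is not Anne's algorithm, nor the Extreme Markup one, and the class name, method names, and recovery rules (close intervening elements, drop stray end tags) are assumptions made for the example.

```python
class StreamingTagRecovery:
    """Hypothetical helper: repairs mismatched end tags one event at a time."""

    def __init__(self, emit):
        self.emit = emit   # callback that receives ("start" | "end", name)
        self.open = []     # names of the elements that are currently open

    def start(self, name):
        # A start tag is always accepted and forwarded immediately.
        self.open.append(name)
        self.emit(("start", name))

    def end(self, name):
        # Assumed rule: an end tag with no matching open element is dropped.
        if name not in self.open:
            return
        # Assumed rule: close every element opened after the match, then
        # the match itself. Only past events are consulted.
        while True:
            top = self.open.pop()
            self.emit(("end", top))
            if top == name:
                break

    def close(self):
        # End of input: close whatever is still open, innermost first.
        while self.open:
            self.emit(("end", self.open.pop()))


events = []
recovery = StreamingTagRecovery(events.append)
recovery.start("a")
recovery.start("b")
recovery.end("a")    # "b" is closed implicitly before "a"
recovery.end("c")    # stray end tag, dropped
recovery.start("d")
recovery.close()     # "d" is closed at end of input
```

Every output event is emitted during the call that handles the corresponding input event, which is what lets this sit behind a SAX-style content handler without buffering the document.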

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/

Received on Friday, 13 November 2009 12:03:49 UTC