Software Architecture from Leonard R. Kasday on 2000-04-10 (w3c-wai-er-ig@w3.org from April 2000)

From: Leonard R. Kasday <kasday@acm.org>
Date: Mon, 10 Apr 2000 10:05:34 -0400
To: w3c-wai-er-ig@w3.org
Message-Id: <4.2.2.20000407113726.00c37250@pop3.concentric.net>

I'd like to start some discussions about the software architecture for the 
tools we're building.

For starters, here's a low-level question: how should we parse and process the HTML?

Personally, for WAVE, I've been using the Perl HTML::Parser module, which 
does a lexical parse into start tags, end tags, comments, and text.  That 
makes it easy to do a tag-by-tag analysis, but I wind up writing a homemade 
state machine to get beyond tags, and it gets a bit kludgy.
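
To make the tag-by-tag approach concrete, here is a minimal sketch of an 
event-driven check with a hand-rolled state flag.  It is written in Python 
with the standard html.parser module as a stand-in for Perl's HTML::Parser; 
it is not WAVE's code.  The class name, the function name, and the 
particular check (links with no text and no image alt text) are assumptions 
chosen only for illustration.

```python
from html.parser import HTMLParser


class LinkTextChecker(HTMLParser):
    """Hypothetical checker: flags <a> elements with no text or alt text."""

    def __init__(self):
        super().__init__()
        self.in_link = False    # state flag: are we inside an <a> element?
        self.href = ""          # href of the link currently being read
        self.link_text = []     # text fragments seen inside that link
        self.empty_links = []   # hrefs of links that turned out empty

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            self.in_link = True
            self.href = attrs.get("href") or ""
            self.link_text = []
        elif tag == "img" and self.in_link:
            # An image's alt text counts as link text.
            self.link_text.append(attrs.get("alt") or "")

    def handle_data(self, data):
        if self.in_link:
            self.link_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            if not "".join(self.link_text).strip():
                self.empty_links.append(self.href)
            self.in_link = False


def find_empty_links(html):
    """Return the href of every link that has no text content."""
    checker = LinkTextChecker()
    checker.feed(html)
    checker.close()
    return checker.empty_links
```

Even this small check needs three pieces of state (in_link, href, 
link_text) to relate an img to its enclosing link, and every rule that 
looks beyond a single tag adds more of the same.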

I'm about to recode it to use a proper tree.  One possibility is Perl's 
HTML::TreeBuilder, which makes it easy to walk the tree and find parents, 
children, attribute values, etc.
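
For comparison, here is a rough sketch of the tree-based style.  It does 
not use HTML::TreeBuilder or reproduce its API; it builds a very simple 
tree on top of Python's standard html.parser, and the Node class, 
parse_tree, and images_missing_alt are hypothetical names introduced for 
this example.  The error recovery is deliberately naive (stray end tags are 
ignored, and only a short list of void elements is known), so treat it as 
an illustration rather than a robust parser.

```python
from html.parser import HTMLParser

# Elements that never have content or an end tag (partial list, assumed).
VOID = {"area", "base", "br", "col", "hr", "img", "input", "link", "meta"}


class Node:
    """One element in the tree, with links to its parent and children."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []          # child Nodes and plain text strings
        if parent is not None:
            parent.children.append(self)

    def walk(self):
        """Yield this node and all descendant elements, depth first."""
        yield self
        for child in self.children:
            if isinstance(child, Node):
                yield from child.walk()


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root    # element currently being filled

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        if tag not in VOID:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray tags.
        node = self.current
        while node is not None and node.tag != tag:
            node = node.parent
        if node is not None and node.parent is not None:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def parse_tree(html):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def images_missing_alt(root):
    """Return (parent tag, src) for each img element with no alt attribute."""
    return [(node.parent.tag, node.attrs.get("src"))
            for node in root.walk()
            if node.tag == "img" and "alt" not in node.attrs]
```

With the tree in hand, the question "which element contains this image?" 
is a single node.parent lookup instead of extra parser state, for example 
images_missing_alt(parse_tree(page_source)), where page_source is whatever 
HTML string is being checked.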

Is there a better way to do this?  What would we gain by switching to Java 
or C++?  How much can we do with XSL?

Len


--
Leonard R. Kasday, Ph.D.
Institute on Disabilities/UAP, and
Department of Electrical Engineering
Temple University
423 Ritter Annex, Philadelphia, PA 19122

kasday@acm.org
http://astro.temple.edu/~kasday

(215) 204-2247 (voice)
(800) 750-7428 (TTY)

Received on Monday, 10 April 2000 10:05:09 UTC