Software Architecture

I'd like to start some discussions about the software architecture for the 
tools we're building.

For starters, here's a low-level question: how should we parse and process the HTML?

Personally, for WAVE, I've been using the Perl HTML::Parser module, which 
does a lexical parse into start tags, end tags, comments, and text.  That 
makes it easy to do a tag-by-tag analysis, but I wind up writing a homemade 
state machine to get beyond individual tags, and it gets a bit kludgy.
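
To make the tag-by-tag approach concrete, here is a minimal sketch.  It is 
written in Python, using the standard library's html.parser as a stand-in 
for HTML::Parser, since both deliver start-tag, end-tag, and text events. 
It is not WAVE's code: the class name, the check() helper, and the two 
checks are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class TagChecker(HTMLParser):
    """Event-based checks: one callback per start tag, end tag, or text run."""

    def __init__(self):
        super().__init__()
        self.problems = []
        # Hand-rolled state, needed as soon as a check spans more than one tag.
        self.in_link = False
        self.link_text = ""

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "alt" not in attrs:
            # A single-tag check: everything we need is in this one event.
            self.problems.append(f"img without alt: {attrs.get('src', '?')}")
        elif tag == "a":
            self.in_link = True
            self.link_text = ""

    def handle_data(self, data):
        if self.in_link:
            self.link_text += data

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            if not self.link_text.strip():
                self.problems.append("link with no text")
            self.in_link = False


def check(html):
    """Hypothetical helper: run the checker and return the list of problems."""
    checker = TagChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

The image check needs only one event, but the empty-link check already 
needs two state variables to remember what happened between <a> and </a>. 
Each additional multi-tag check adds more of that bookkeeping.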

I'm about to recode it to use a proper tree.  One possibility is Perl's 
HTML::TreeBuilder, which makes it easy to walk the tree and to find parents, 
children, attribute values, etc.
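
As a rough illustration of what a tree gives you, here is a sketch in 
Python that builds a small tree from html.parser events and then walks it. 
It is a simplified stand-in, not HTML::TreeBuilder's API: Node, TreeBuilder, 
build_tree(), and the VOID set are all hypothetical names, and the sketch 
does none of the repair of missing or implied tags that a real tree builder 
would have to do.

```python
from html.parser import HTMLParser

# Assumed (incomplete) set of elements that never have an end tag.
VOID = {"br", "hr", "img", "input", "meta", "link"}


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []  # child Nodes and text strings, in document order

    def walk(self):
        """Yield this node and every descendant element, depth first."""
        yield self
        for child in self.children:
            if isinstance(child, Node):
                yield from child.walk()

    def ancestors(self):
        """Yield the parent, grandparent, and so on up to the root."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


class TreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, self.current)
        self.current.children.append(node)
        if tag not in VOID:
            self.current = node  # later content becomes this node's children

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        self.current.children.append(data)


def build_tree(html):
    """Hypothetical helper: parse a string and return the root Node."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root
```

With the tree in hand, a question like "is this image inside a table cell?" 
becomes a lookup rather than a state machine:

```python
root = build_tree('<table><tr><td><img src="a.gif"></td></tr></table>')
for node in root.walk():
    if node.tag == "img" and any(a.tag == "td" for a in node.ancestors()):
        print("image in table cell:", node.attrs.get("src"))
```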

Is there a better way to do this?  What would we gain by switching to Java or 
C++?  How much can we do with XSL?

Len


--
Leonard R. Kasday, Ph.D.
Institute on Disabilities/UAP, and
Department of Electrical Engineering
Temple University
423 Ritter Annex, Philadelphia, PA 19122

kasday@acm.org
http://astro.temple.edu/~kasday

(215) 204-2247 (voice)
(800) 750-7428 (TTY)

Received on Monday, 10 April 2000 10:05:09 UTC