
Software Architecture

From: Leonard R. Kasday <kasday@acm.org>
Date: Mon, 10 Apr 2000 10:05:34 -0400
Message-Id: <4.2.2.20000407113726.00c37250@pop3.concentric.net>
To: w3c-wai-er-ig@w3.org
I'd like to start some discussions about the software architecture for the 
tools we're building.

For starters, here's a low level question: how to parse and process the HTML.

Personally, for WAVE, I've been using the Perl HTML::Parser module, which 
does a lexical parse into start tags, end tags, comments, and text.  That 
makes tag-by-tag analysis easy, but I wind up writing a homemade state 
machine to handle anything that spans more than one tag, and it gets a bit 
kludgy.
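
To make the "tag by tag plus state machine" style concrete, here is a minimal sketch.  It uses Python's standard html.parser module as a stand-in for an event-style lexical parser; it is not WAVE's code and not the Perl HTML::Parser API.  The class name LinkTextChecker, the function check_html, and the two checks are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class LinkTextChecker(HTMLParser):
    """Event-style checker: the parser only reports tags and text,
    so anything spanning several tags needs hand-kept state."""

    def __init__(self):
        super().__init__()
        self.in_link = False    # state: are we inside an <a> element?
        self.link_href = None   # href of the link currently open
        self.link_text = []     # text (and img alt text) seen inside it
        self.problems = []      # findings, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img":
            if "alt" not in attrs:
                self.problems.append("img without alt: %s" % attrs.get("src"))
            elif self.in_link and attrs["alt"]:
                # Non-empty alt text counts as link text.
                self.link_text.append(attrs["alt"])
        elif tag == "a":
            self.in_link = True
            self.link_href = attrs.get("href")
            self.link_text = []

    def handle_data(self, data):
        if self.in_link:
            self.link_text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.in_link:
            if not "".join(self.link_text).strip():
                self.problems.append("link with no text: %s" % self.link_href)
            self.in_link = False


def check_html(html):
    """Run the checker over a string of HTML and return its findings."""
    checker = LinkTextChecker()
    checker.feed(html)
    checker.close()
    return checker.problems
```

Even for a single cross-tag rule (does a link contain any text?) the checker needs three pieces of state, and every additional rule of that kind adds more.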

I'm about to recode it to use a proper tree.  One possibility is Perl's 
HTML::TreeBuilder, which makes it easy to walk the tree and find parents, 
children, attribute values, etc.
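
For comparison, here is a rough sketch of the tree style, again in Python rather than Perl.  It is a toy tree builder written for illustration, not HTML::TreeBuilder or its API: the names Node, TinyTreeBuilder, parse_tree, and images_missing_alt are all hypothetical, and it does not handle implied end tags or other repairs for malformed HTML.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they cannot contain children.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class Node:
    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = dict(attrs or [])
        self.parent = parent
        self.children = []      # Node objects and plain text strings

    def walk(self):
        """Yield this node and all element descendants, depth first."""
        yield self
        for child in self.children:
            if isinstance(child, Node):
                yield from child.walk()

    def ancestors(self):
        """Yield the parent, grandparent, and so on up to the root."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent

    def text(self):
        """Concatenate all text found beneath this node."""
        return "".join(c if isinstance(c, str) else c.text()
                       for c in self.children)


class TinyTreeBuilder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs, parent=self.current)
        self.current.children.append(node)
        if tag not in VOID:
            self.current = node

    def handle_data(self, data):
        self.current.children.append(data)

    def handle_endtag(self, tag):
        # Close the nearest open element of this name; ignore stray end tags.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent


def parse_tree(html):
    builder = TinyTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def images_missing_alt(root):
    """Return (src, inside_a_link) for every img that has no alt attribute."""
    return [(n.attrs.get("src"), any(a.tag == "a" for a in n.ancestors()))
            for n in root.walk()
            if n.tag == "img" and "alt" not in n.attrs]
```

Once the tree exists, a check like "is this image inside a link?" becomes a lookup over ancestors instead of a flag that has to be set and cleared while tags stream past.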

Is there a better way to do this?  What would we gain by switching to Java 
or C++?  How much can we do with XSL?

Len


--
Leonard R. Kasday, Ph.D.
Institute on Disabilities/UAP, and
Department of Electrical Engineering
Temple University
423 Ritter Annex, Philadelphia, PA 19122

kasday@acm.org
http://astro.temple.edu/~kasday

(215) 204-2247 (voice)
(800) 750-7428 (TTY)
Received on Monday, 10 April 2000 10:05:09 GMT
