Structure Analysis [was minutes from today's call]

Re Al's suggestions on automatic recognition of structure in a page:

>"best fit" reasoning in
>its decisions, and it has a balance of bottom-up (depth first) and top-down
>(breadth first) methods for deciding what roles things play in the page.


>actually we could go a long way
>with just a library of structure templates and heuristics about which
>template-matching operations to try first.  Most pages have a head section
>and a foot section, and a heuristic recognizer would rarely be wrong in
>identifying their scope.  The inward-looking navigation bar would be harder
>to isolate if it weren't always down the left margin...

This could be encapsulated as a page "grammar", against which the page
would be parsed to recognize its structure.  The parse might be
ambiguous, in which case a heuristic, e.g. a weighting function, would be
needed to select the best fit.  Time to dust off the old AI textbooks (or
buy new ones...).  This is a _very_ interesting research problem.
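
To make the grammar-plus-weighting idea concrete, here is a rough sketch,
not a worked-out design.  Everything in it is an assumption for
illustration: the toy grammar (head, then nav, then body, then foot, each a
contiguous run of blocks), the block features (words, link_ratio,
left_margin) and all of the weights.  It simply enumerates every parse the
grammar allows and keeps the one with the highest total weight.

```python
# Toy grammar (an assumption): Page -> head* nav* body+ foot*
# Each role covers a contiguous run of blocks, in that order.

def weight(role, block):
    """Heuristic fit of one block to one role; every number here is a guess."""
    short = block["words"] < 20
    if role in ("head", "foot"):
        return 0.6 if short else 0.0
    if role == "nav":
        return block["link_ratio"] + (0.3 if block["left_margin"] else 0.0)
    # Remaining role is "body": long blocks with few links fit best.
    return 0.1 if short else 1.0 - block["link_ratio"]

def parses(n):
    """Yield every role assignment the toy grammar allows for n blocks."""
    for a in range(n + 1):                 # head = blocks[0:a]
        for b in range(a, n + 1):          # nav  = blocks[a:b]
            for c in range(b + 1, n + 1):  # body = blocks[b:c], never empty
                yield (["head"] * a + ["nav"] * (b - a)
                       + ["body"] * (c - b) + ["foot"] * (n - c))

def best_parse(blocks):
    """Return (roles, score) for the highest-weighted parse."""
    def total(roles):
        return sum(weight(r, blk) for r, blk in zip(roles, blocks))
    best = max(parses(len(blocks)), key=total)
    return best, total(best)

# Hypothetical page: title, left-margin link list, two text blocks, footer.
page = [
    {"words": 2,   "link_ratio": 0.10, "left_margin": False},
    {"words": 10,  "link_ratio": 0.90, "left_margin": True},
    {"words": 300, "link_ratio": 0.05, "left_margin": False},
    {"words": 200, "link_ratio": 0.10, "left_margin": False},
    {"words": 8,   "link_ratio": 0.50, "left_margin": False},
]

roles, score = best_parse(page)
print(roles)
```

On this made-up page the short first and last blocks are each ambiguous
between head and foot; the ordering imposed by the grammar, together with
the weights, is what settles them.  Brute-force enumeration is only
workable for a handful of blocks; a real recognizer would need a proper
parser and much better features.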

Len





--
Leonard R. Kasday, Ph.D.
Institute on Disabilities/UAP, and
Department of Electrical Engineering
Temple University
423 Ritter Annex, Philadelphia, PA 19122

kasday@acm.org
http://astro.temple.edu/~kasday

(215) 204-2247 (voice)
(800) 750-7428 (TTY)

Received on Friday, 21 April 2000 09:14:02 UTC