Re: Structure Analysis from Al Gilman on 2000-04-21 (w3c-wai-er-ig@w3.org from April 2000)

From: Al Gilman <asgilman@iamdigex.net>
Date: Fri, 21 Apr 2000 11:28:56 -0500
To: w3c-wai-er-ig@w3.org
Message-Id: <200004211526.LAA461051@smtp1.mail.iamworld.net>

At 09:14 AM 2000-04-21 -0400, Leonard R. Kasday wrote:
>Re Al's suggestions on automatic recognition of structure in a page:
>
>>"best fit" reasoning in
>>its decisions, and it has a balance of bottom-up (depth first) and top-down
>>(breadth first) methods for deciding what roles things play in the page.
>
>
>>actually we could go a long way
>>with just a library of structure templates and heuristics about which
>>template-matching operations to try first.  Most pages have a head section
>>and a foot section, and a heuristic recognizer would rarely be wrong in
>>identifying their scope.  The inward-looking navigation bar would be harder
>>to isolate if it weren't always down the left margin...
>
>This could be encapsulated as a page "grammar" which would be parsed to 
>recognize the structure.  The parse might be ambiguous... in which case 
>there would need to be a heuristic, e.g. a weighting function, to select 
>the best fit.  Time to dust off the old AI textbooks (or buy new 
>ones...).  This is a _very_ interesting research problem.

1) I believe that something very much like this is the missing link in the
dialog between GL and UA which is going to happen on the 27th.

The technical refinement on the "grammar" idea is that it is an
object-oriented model.  It is a template for in-core information graphs,
not a grammar that starts from linear text.  But that is a minor point.
The general idea is right.  It is a set of rules and rating functions that
articulate what is good about a good navigation tree.  The navigation-tree
generation method then scans the parse tree in the DOM and seeks to fill
the slots in the "good navigation tree" model.  Here a "model" or schema is
a generalization of a template.  It creates roles for (references to)
instances from the parse tree (and a few interpolated collector instances)
to populate.

I am desperate to clean up my story for that meeting so that people will
realize the shape of the hypothesized answer, and that we know enough that
building a prototype [the DAISY consortium is already paying contractors
to do this] and experimenting with real pages is a plausible plan of
attack.  We need the testbed up and running so we can do some evaluation on
various rough ideas and not do all our research on paper.
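
The slot-filling idea described above (a model whose roles are filled by
the best-rated nodes of the parse tree) might look roughly like the
following.  This is only a minimal sketch under stated assumptions: the
Node class is a much-simplified stand-in for a DOM node, and the three
rating functions, their weights, and the 0.5 threshold are hypothetical
illustrations, not anything taken from FLORA or from the DAISY prototype.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterator, List, Optional


@dataclass
class Node:
    """Simplified stand-in for a DOM parse-tree node (hypothetical)."""
    tag: str
    text: str = ""
    position: float = 0.0  # assumed layout hint: 0.0 = top of page, 1.0 = bottom
    children: List["Node"] = field(default_factory=list)


def walk(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


# Hypothetical rating functions, one per role; each returns a score in [0, 1].
def rate_head(n: Node) -> float:
    tag_score = 0.6 if n.tag in ("h1", "header") else 0.0
    return tag_score + 0.4 * (1.0 - n.position)


def rate_foot(n: Node) -> float:
    looks_like_footer = (n.tag in ("address", "footer")
                         or "copyright" in n.text.lower())
    return (0.6 if looks_like_footer else 0.0) + 0.4 * n.position


def rate_navbar(n: Node) -> float:
    # Fraction of direct children that are links.
    if not n.children:
        return 0.0
    links = sum(1 for c in n.children if c.tag == "a")
    return links / len(n.children)


# The "model": named roles (slots), each paired with its rating function.
MODEL: Dict[str, Callable[[Node], float]] = {
    "head": rate_head,
    "foot": rate_foot,
    "navbar": rate_navbar,
}


def fill_slots(root: Node,
               model: Dict[str, Callable[[Node], float]],
               threshold: float = 0.5) -> Dict[str, Optional[Node]]:
    """For each role, pick the best-rated node; leave the slot empty
    (None) if even the best candidate scores below the threshold."""
    filled: Dict[str, Optional[Node]] = {}
    for role, rate in model.items():
        best = max(walk(root), key=rate)
        filled[role] = best if rate(best) >= threshold else None
    return filled
```

Each slot is filled independently here, so one node could in principle
win two roles; a fuller version would presumably rate whole assignments
rather than single slots.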

The idea is that the "navigation tree" is a virtual table of contents like
the navbar generated by the Microsoft PowerToy.  It is a dynamic structure
under user control, both on a construction-rule basis and on an
instance-by-instance basis.  The conventional wisdom housed in heuristics
or AI generates a rough estimate of what the navigation tree should be.
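
One very simple heuristic for that rough estimate is to nest the headings
by level, with a construction rule the user can adjust.  The sketch below
assumes the headings have already been pulled out of the parse tree as
(level, title) pairs; NavEntry, draft_nav_tree and the max_depth rule are
hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class NavEntry:
    """One entry in the draft navigation tree (hypothetical)."""
    title: str
    level: int
    children: List["NavEntry"] = field(default_factory=list)


def draft_nav_tree(headings: List[Tuple[int, str]],
                   max_depth: int = 3) -> NavEntry:
    """Nest (level, title) pairs into a tree, in document order.

    Levels are assumed to start at 1.  max_depth is the user-adjustable
    construction rule: headings deeper than it are left out of the tree.
    """
    root = NavEntry("(document)", 0)
    stack = [root]  # path from the root to the most recent entry
    for level, title in headings:
        if level > max_depth:
            continue
        # Climb until the top of the stack is a shallower heading.
        while stack[-1].level >= level:
            stack.pop()
        entry = NavEntry(title, level)
        stack[-1].children.append(entry)
        stack.append(entry)
    return root
```

Changing max_depth is control on a construction-rule basis; control on an
instance-by-instance basis would be edits to individual entries of the
tree this returns.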

In the talking-book scenario, the editor of the adapted editions can then
hand-edit the navigation tree.  The author tool nevertheless maintains
the cross-relation between the edited navigation tree and the parse tree
for the full contents.  The result is an overlaid index: a kind of index
to contents, and more specifically one hand-tailored to optimize a mix of
machine-computable and human-assessment quality factors.
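
One way to keep that cross-relation through hand edits is for each
navigation entry to hold references (here, plain id strings) to the
parse-tree nodes it covers, so that an edit rearranges references without
touching the contents.  This is a sketch under assumptions: NavNode,
merge_siblings, covered_ids and dangling are hypothetical names, and the
parse tree is reduced to a dictionary keyed by node id.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NavNode:
    """Navigation entry overlaid on the parse tree (hypothetical)."""
    label: str
    target_ids: List[str]  # ids of the parse-tree nodes this entry covers
    children: List["NavNode"] = field(default_factory=list)


def merge_siblings(parent: NavNode, i: int, j: int, label: str) -> NavNode:
    """Hand edit: merge child j into child i under a new label,
    keeping every reference into the parse tree."""
    if i == j:
        raise ValueError("cannot merge an entry with itself")
    a, b = parent.children[i], parent.children[j]
    merged = NavNode(label, a.target_ids + b.target_ids,
                     a.children + b.children)
    parent.children[i] = merged
    del parent.children[j]
    return merged


def covered_ids(node: NavNode) -> List[str]:
    """All parse-tree ids referenced by this entry and its descendants."""
    ids = list(node.target_ids)
    for child in node.children:
        ids.extend(covered_ids(child))
    return ids


def dangling(node: NavNode, parse_index: Dict[str, object]) -> List[str]:
    """References that no longer resolve to a parse-tree node."""
    return [t for t in covered_ids(node) if t not in parse_index]
```

After any edit, checking that dangling() is empty and that covered_ids()
still lists every section is a cheap test that the overlay and the full
contents have not drifted apart.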

I am convinced that we already know enough to build one of these which
would be a big step forward from the raw parse tree.

The central issue is the duration of document parts in nominal play time.
How fine or coarse to divide things has to be informed by this metric, and
the visual author will need help in dealing with this performance axis.
But a pie chart or related graphical depiction of how nominal play time is
distributed through the "current draft navigation tree" is easy to compute,
and it is easy for a visual author to respond to appropriately.  So we
need to get on with it.  In FLORA the method that
nominates a draft navigation tree is a _very_ small program.  You would
spend most of your time debugging FLORA, not writing the tree transformer.
[In XSLT you could do a lot of rewriting and get tired and want to freeze
the tool before you had really learned the right rules...]
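
To show how little computation the play-time picture needs, here is a
minimal sketch in Python (not FLORA or XSLT).  It assumes nominal play
time can be estimated from word count at a fixed speaking rate; the figure
of 150 words per minute, the Section class, and both function names are
assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List

WORDS_PER_MINUTE = 150.0  # assumed nominal speaking rate


@dataclass
class Section:
    """One entry in the draft navigation tree (hypothetical)."""
    title: str
    text: str = ""
    children: List["Section"] = field(default_factory=list)


def play_seconds(s: Section, wpm: float = WORDS_PER_MINUTE) -> float:
    """Nominal play time of a section, including its subsections."""
    own = len(s.text.split()) / wpm * 60.0
    return own + sum(play_seconds(c, wpm) for c in s.children)


def play_time_shares(root: Section,
                     wpm: float = WORDS_PER_MINUTE) -> Dict[str, float]:
    """Fraction of total play time under each top-level entry.

    These are the numbers a pie chart would plot.  They sum to 1 only
    if the root has no text of its own, and titles are assumed unique.
    """
    total = play_seconds(root, wpm)
    if total == 0:
        return {}
    return {c.title: play_seconds(c, wpm) / total for c in root.children}
```

An entry whose share is far larger than its siblings' is a candidate for
finer division, and a run of very small ones is a candidate for merging.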

Please join me in preparing homework for the Apr 27 telecon on structural
navigation, jointly hosted by UA and GL.

Al

>
>Len
>
>--
>Leonard R. Kasday, Ph.D.
>Institute on Disabilities/UAP, and
>Department of Electrical Engineering
>Temple University
>423 Ritter Annex, Philadelphia, PA 19122
>
>kasday@acm.org
>http://astro.temple.edu/~kasday
>
>(215) 204-2247 (voice)
>(800) 750-7428 (TTY)
>
Received on Friday, 21 April 2000 11:24:40 UTC