Re: lexical discussions: from Al Gilman on 2001-09-15 (w3c-wai-ig@w3.org from July to September 2001)

From: Al Gilman <asgilman@iamdigex.net>
Date: Sat, 15 Sep 2001 09:12:59 -0400
To: "David Poehlman" <poehlman1@home.com>, "wai-ig list" <w3c-wai-ig@w3.org>
Message-Id: <200109151310.JAA10557158@smtp2.mail.iamworld.net>
At 09:39 PM 2001-09-14 , David Poehlman wrote:
>A site has come to my attention that bears on the discussion we've had
>from time to time concerning developing a lexical pictorial language or
>representation.  It takes a different approach though but provides some
>interesting information.
>the @sign article found here may be of particular interest.
>http://www.herodios.com/
>Hands-on Technolog(eye)s
>Touching The Internet
>http://members.home.com/poehlman1/
>mailto:poehlman1@home.com
>voice: 301.949.7599
>

AG::

OK, let me give this away.  This is worth an NSF grant, in my opinion.  I had
been a dog in the manger, hoarding it on that basis.

What you can do with a small matter of programming is a cross between Google
and Atomica.  The joy of this is that it runs on energy from an Internet Game,
and it is symmetrical in delivering words to explain pictures and pictures to
explain words.

The basic resource is a thesaurus which relates words and pictures: what words
relate to what pictures and vice versa.  Data mining done with an internet
computing [think SETI@Home] compute resource creates this, with a small elite
upper crust of picture/word ligatures that have been reviewed and endorsed by
a) volunteer and/or b) expert analysts.  The ultra-clever step is that the
volunteers are playing an Internet hosted version of Pictionary, and people
contribute their computer time to the Internet computing pool in order to play
the game.  And then we launch the game on the market with a celebrity charity
game that people can watch on TV on the Web.  But I get ahead of myself.

The Google part is how you refine this raw relation into "what pages, or
neighborhoods in the words on the pages, are likeliest, on the basis of what
others have done about them, to be helpful in understanding this picture" and
"what pictures are most likely, in terms of what others have done about them
(in ways observable in the web content and clickstream experience), to be
helpful in understanding this word."

The Atomica part is what you get as a "whazzat" explanation of a word when you
ALT-click on it.  It's that simple.  But you have a preference set that
Atomica understands that says "please explain in pictures, if you can."

The document you get back is a Google-like prioritized top of the hit set in
the heap of word-picture associations.  For those of you who have experienced
Sesame Street, there is a segment format that they use: "one of these things is
not like the others."  In that shtick, there are multiple graphic panels, all
but one of which contain a common element or theme; the theme is broken in one
instance.  The challenge is for the user to formulate a metapattern hypothesis:
to determine what the pattern is that is present in all but one.
This is harder than the Pictionary display that comes from "Picture it for me,
Atomica" the way I can imagine it.  The principle behind the "Picture it for
me, Atomica" form of display is "none of these things is not like the others."
The sense of the word explained is _the_ common theme in _all_ the images
portrayed.  The Google/Atomica logic could be enriched by attempting to ensure
that the word being explained is the _only_ common theme of the four to eight
images shown.  This is algorithmically determinable from the strength of the
picture-word associations if we keep but one real-number-valued [in (0 .. 1)]
strength weight per arc.
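
To make the "only common theme" test concrete, here is a minimal sketch in
Python.  Everything in it is an assumption for illustration: the image names,
the weights, the function names, the `floor` cutoff, and the choice to score a
word's "commonness" across a panel by its weakest arc.  It brute-forces small
panels and prefers the one where the target word beats every rival word by the
widest margin.

```python
from itertools import combinations

# Hypothetical arc weights: STRENGTH[image][word] is a value in (0, 1).
# A missing entry is treated as "no association" (0.0).
STRENGTH = {
    "img_apple": {"red": 0.9, "fruit": 0.8},
    "img_firetruck": {"red": 0.9, "vehicle": 0.8},
    "img_rose": {"red": 0.8, "flower": 0.9},
    "img_cherry": {"red": 0.85, "fruit": 0.9},
    "img_banana": {"yellow": 0.9, "fruit": 0.9},
}


def common_strength(images, word, strength):
    """How strongly `word` is shared by ALL images: the weakest arc wins."""
    return min(strength[img].get(word, 0.0) for img in images)


def pick_panel(word, strength, size=4, floor=0.5):
    """Pick `size` images whose only strong common theme is `word`.

    Returns (panel, margin), where margin is the target word's common
    strength minus the strongest rival word's common strength.
    Returns (None, -inf) if fewer than `size` images pass the floor.
    """
    # Only consider images reasonably associated with the target word.
    candidates = sorted(
        img for img, arcs in strength.items() if arcs.get(word, 0.0) >= floor
    )
    best, best_margin = None, float("-inf")
    for panel in combinations(candidates, size):
        target = common_strength(panel, word, strength)
        # Every other word attached to any image in the panel is a rival.
        rivals = {w for img in panel for w in strength[img]} - {word}
        rival = max(
            (common_strength(panel, w, strength) for w in rivals), default=0.0
        )
        if target - rival > best_margin:
            best, best_margin = panel, target - rival
    return best, best_margin


if __name__ == "__main__":
    # A panel of two keeps the toy data small; the real display would
    # use four to eight images.
    print(pick_panel("red", STRENGTH, size=2))
```

With this toy data, the apple and the cherry are both strongly "red", but they
also share "fruit", so that pair scores poorly; the apple and the fire truck
share nothing but "red" and win.  Exhaustive search is only workable for a
short candidate list; a real hit set would need a greedy or sampled search.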

There it is.  It takes mobilization of an organized team to pull it off and
maintain the hub resources.  Who wants to make it happen?

Al
Received on Saturday, 15 September 2001 09:10:52 UTC