FW: position paper

FYI, David Karger's position paper 

-----Original Message-----
From: David R. Karger [mailto:karger@theory.lcs.mit.edu] 
Sent: Thursday, November 13, 2003 12:41 AM
To: mick.bass@hp.com
Subject: position paper



Here's a position paper.  It's a bit rushed, due to much pressure from the
WWW conference deadline this Friday.

Haystack's potential contributions to the demo follow a natural sequence;
they are also somewhat orthogonal to the main ideas presented in the demo.
I'll begin by outlining the contribution stages; then discuss their relation
to the demo.

Stage 1: Display of Objects from any schema

Haystack aims to present information to the user.  Given any information in
RDF, Haystack creates a best-effort presentation of that information.  In
the absence of any display "hints", Haystack can do no better than
presenting a list of properties and values (i.e., predicates and objects)
hanging off the subject to be displayed.  But there are many opportunities
to do better.  
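
To make the fallback concrete, here is a minimal Python sketch of a
hint-free display.  It is not Haystack code: the triples, the "ex:" names,
and the function fallback_view are all made up for illustration, and RDF is
modelled simply as (subject, predicate, object) tuples.

```python
# Hypothetical data: RDF modelled as plain (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:fallingwater", "rdf:type", "ex:Building"),
    ("ex:fallingwater", "dc:title", "Fallingwater"),
    ("ex:fallingwater", "dc:creator", "ex:frank_lloyd_wright"),
    ("ex:frank_lloyd_wright", "dc:title", "Frank Lloyd Wright"),
]


def fallback_view(subject, triples):
    """Return one 'predicate: object' line per triple hanging off subject."""
    return [f"{p}: {o}" for s, p, o in triples if s == subject]


if __name__ == "__main__":
    for line in fallback_view("ex:fallingwater", TRIPLES):
        print(line)
```

With no hints available, the user sees raw predicate names and raw object
identifiers, which is exactly the baseline that display hints improve on.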

On the one hand, Haystack knows better ways to display information from
certain ontologies.  There are many useful presentations of collections, for
example.  Haystack also knows how to present information fitting the Dublin
Core schema, as well as photographs, calendars, and various other "standard"
classes.

More generally, Haystack aims to be easily customizable.  RDF can be created
that explains how certain classes of object can best be displayed;
this information can be consumed by Haystack and used to tune the display.
As a trivial example, a human-readable title of a property can be specified
and Haystack will use that title instead of the property's URI when showing
it to a user.  A more sophisticated specification is of an appropriate
"aspect" for some class: a particular set of properties that can be shown
together in order to give a user a good understanding of an object being
displayed.  More generally, a developer can, using a remarkably small amount
of RDF, essentially whip up an effective schema-specific information
display.  
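
The two kinds of hint mentioned above (a readable property title and a
per-class "aspect") can be sketched as follows.  This is an illustration
under assumptions, not Haystack's actual vocabulary: rdfs:label is the
standard RDF Schema labelling property, but ex:aspectProperty and every
other "ex:" name are invented here, and the hints are stored as ordinary
triples next to the data.

```python
# Hypothetical data and hints, all stored as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:fallingwater", "rdf:type", "ex:Building"),
    ("ex:fallingwater", "dc:title", "Fallingwater"),
    ("ex:fallingwater", "dc:creator", "ex:frank_lloyd_wright"),
    ("ex:fallingwater", "ex:internalId", "b-0042"),
    # Display hints: readable titles for two properties.
    ("dc:title", "rdfs:label", "Title"),
    ("dc:creator", "rdfs:label", "Designed by"),
    # Display hints: the aspect (set of properties) to show for a Building.
    # ex:aspectProperty is an assumed predicate, not a Haystack term.
    ("ex:Building", "ex:aspectProperty", "dc:title"),
    ("ex:Building", "ex:aspectProperty", "dc:creator"),
]


def objects(subject, predicate, triples):
    """Return every object o such that (subject, predicate, o) is present."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def render(subject, triples):
    """Render subject, using label and aspect hints wherever they exist."""
    # Collect the aspect properties declared for any of the subject's classes.
    aspect = []
    for cls in objects(subject, "rdf:type", triples):
        aspect.extend(objects(cls, "ex:aspectProperty", triples))

    lines = []
    for s, p, o in triples:
        if s != subject:
            continue
        if aspect and p not in aspect:
            continue  # an aspect exists and this property is not part of it
        labels = objects(p, "rdfs:label", triples)
        lines.append(f"{labels[0] if labels else p}: {o}")
    return lines


if __name__ == "__main__":
    print("\n".join(render("ex:fallingwater", TRIPLES)))
```

When no aspect or label is declared, render degrades to the plain
property/value listing, so hints can be added incrementally.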


Stage 2: Associative navigation across schemas.

The demo currently focuses on search, but the success of the web has
demonstrated that humans love finding information by following a chain of
links---associations from one information object to the next. Haystack's UI
supports such associative navigation.  The UI is recursive, meaning that
objects related to the current object can be shown with the object.  Those
visible related objects can then be "clicked to" as in the standard web
browser paradigm.  Crucially, Haystack's recursive UI allows displays of
objects in different schemas to coexist.  Thus, relationships that are
established between objects in different schemata (for example, through
inference) can be shown and traversed by a user.
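
A text-only sketch of the recursive idea, assuming nothing about Haystack's
real UI code: related objects are rendered inside the object that refers to
them, whatever schema their properties come from.  The data, the prefixes,
and the depth limit are all illustrative assumptions.

```python
# Hypothetical data mixing three vocabularies (arch:, foaf:, art:).
TRIPLES = [
    ("arch:fallingwater", "dc:title", "Fallingwater"),
    ("arch:fallingwater", "arch:architect", "ex:frank_lloyd_wright"),
    ("ex:frank_lloyd_wright", "foaf:name", "Frank Lloyd Wright"),
    ("art:window_design_12", "art:artist", "ex:frank_lloyd_wright"),
]


def render(subject, triples, depth=1, indent=0):
    """Render subject; embed related objects up to depth levels deep."""
    subjects = {s for s, _, _ in triples}  # things we know how to expand
    pad = "  " * indent
    lines = [f"{pad}[{subject}]"]
    for s, p, o in triples:
        if s != subject:
            continue
        if o in subjects and depth > 0:
            # The value is itself a described object: show it in place.
            lines.append(f"{pad}  {p}:")
            lines.extend(render(o, triples, depth - 1, indent + 2))
        else:
            lines.append(f"{pad}  {p}: {o}")
    return lines


if __name__ == "__main__":
    print("\n".join(render("arch:fallingwater", TRIPLES)))
```

The depth limit keeps the recursion finite even when the data contains
cycles; in an interactive UI the user would instead click through to the
embedded object.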

Stage 3: Search.

Haystack provides some basic search capabilities.  For example, an interface
exists in which users specify that they want an object of a certain type with
given values for certain properties.  Haystack's associative navigation tools
can help a user
to, e.g., browse through the collection of object types in order to decide
what type of object they want to look for, and to fill in desired values by
finding exemplary objects and "cloning" values from them.  Vineet Sinha has
been developing an interface that lets users examine and refine the
collection of results to a given query---without any human annotation of the
schemas involved, Vineet's tool is able to suggest refinements such as
restricting a certain property to a certain value, or finding more objects
"similar" to other objects in the current collection.

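The two search ideas above (query by type and property values, then
refinement suggestions that need no schema annotation) can be sketched as
follows.  This is not Vineet's tool: the refinement step here is an assumed,
deliberately simple heuristic that proposes any property/value pair shared
by some but not all results, and all data and names are invented.

```python
from collections import Counter

# Hypothetical corpus as (subject, predicate, object) tuples.
TRIPLES = [
    ("ex:fallingwater", "rdf:type", "ex:Building"),
    ("ex:fallingwater", "dc:creator", "ex:frank_lloyd_wright"),
    ("ex:fallingwater", "ex:state", "Pennsylvania"),
    ("ex:robie_house", "rdf:type", "ex:Building"),
    ("ex:robie_house", "dc:creator", "ex:frank_lloyd_wright"),
    ("ex:robie_house", "ex:state", "Illinois"),
    ("ex:sears_tower", "rdf:type", "ex:Building"),
    ("ex:sears_tower", "dc:creator", "ex:som"),
    ("ex:sears_tower", "ex:state", "Illinois"),
]


def search(triples, rdf_type, constraints):
    """Return subjects of rdf_type having every property/value in constraints."""
    facts = set(triples)
    candidates = sorted({s for s, p, o in facts if p == "rdf:type" and o == rdf_type})
    return [
        s for s in candidates
        if all((s, p, v) in facts for p, v in constraints.items())
    ]


def suggest_refinements(triples, results):
    """Propose (property, value) pairs that would narrow the result set."""
    counts = Counter(
        (p, o) for s, p, o in set(triples) if s in results and p != "rdf:type"
    )
    # A pair held by every result cannot narrow anything, so skip it.
    return sorted(pair for pair, n in counts.items() if n < len(results))


if __name__ == "__main__":
    hits = search(TRIPLES, "ex:Building", {})
    print(hits)
    print(suggest_refinements(TRIPLES, hits))
```

Applying a suggested pair as an extra constraint in search yields the
refined result set, so the two functions can be alternated interactively.
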
Relation to current demo.

The current demo focuses heavily on the use of inference, particularly to
support cross-domain queries.  In doing so, I think that even Demonstrator
Stage 1 jumps right past a set of "Stage 0" demonstrations that are
meaningful.  While inference-supported automation of cross-domain searching
is clearly where we want to end up, the Haystack stages listed above
highlight a number of useful points that can be reached without any
inference at all.  Consider, for example, the relatively simple goal of
letting any community create a community corpus.  Even if we don't require
this corpus be linked to other communities' corpora, important challenges
arise in making it easy to present this corpus for search and navigation.
As a next step, even without inference, interlinkages will naturally arise
between different corpora as soon as they start referring to the same object
(e.g., in the demo example, "Frank Lloyd Wright" becomes a connecting
element between items in the two schemata).  Simply being able to browse out
of one schema and into the other via such transit points, while preserving
some degree of user interface consistency, will be useful.  Providing
support for a user to get some information about a schema that they haven't
seen before, so they can understand objects presented with that schema, will
also be important.  A tool such as Karun Bakshi's ontology browser may play
a role here.
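
The "transit point" idea needs no inference at all, which a small sketch can
show.  Two made-up corpora with different vocabularies happen to mention the
same identifier; intersecting the identifiers they use finds the crossing
points, and following links in both directions lets a user browse from one
corpus into the other.  All names and data below are assumptions for
illustration, and literals are treated like any other node for simplicity.

```python
# Two hypothetical community corpora using different schemas.
ARCHITECTURE = [
    ("arch:fallingwater", "arch:architect", "ex:frank_lloyd_wright"),
    ("arch:fallingwater", "arch:location", "ex:mill_run"),
]
ART = [
    ("art:window_design_12", "art:artist", "ex:frank_lloyd_wright"),
    ("art:window_design_12", "art:medium", "stained glass"),
]


def nodes(triples):
    """Return every identifier used as a subject or object."""
    return {s for s, _, _ in triples} | {o for _, _, o in triples}


def transit_points(corpus_a, corpus_b):
    """Return identifiers mentioned in both corpora."""
    return sorted(nodes(corpus_a) & nodes(corpus_b))


def neighbours(node, triples):
    """Return (predicate, other end) pairs for links into or out of node."""
    outgoing = [(p, o) for s, p, o in triples if s == node]
    incoming = [(p, s) for s, p, o in triples if o == node]
    return outgoing + incoming


if __name__ == "__main__":
    for point in transit_points(ARCHITECTURE, ART):
        print(point, neighbours(point, ARCHITECTURE + ART))
```

Starting from arch:fallingwater, a user can step to ex:frank_lloyd_wright
and from there into art:window_design_12, crossing schemas purely because
both corpora refer to the same object.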

In summary, I'd say that (i) Haystack provides tools for display of,
navigation through, and search for items in arbitrary schemas, and that
(ii) these tools may be useful independent of the amount of automated schema
translation and other inference supported by the system.  

Received on Thursday, 13 November 2003 12:14:36 UTC