FW: notes on use cases

FYI

-----Original Message-----
From: David R. Karger [mailto:karger@theory.lcs.mit.edu]
Sent: 04 April 2003 06:09
To: mick.bass@hp.com; Mark_Butler@hplb.hpl.hp.com
Subject: notes on use cases



Under "2.3 Simile Challenges":

We would like to support effective human retrieval over our rich
metadata environment.  No matter how much information we capture in
the repository, it can only be useful if users can find it.  Given our
rich metadata environment, we will need interfaces that let users
navigate the repository, browse the metadata, formulate queries, and
evaluate the relevance of retrieved information.

Under "2.4 Simile Opportunities":

Provide a capture point for content-bearing community discussions.
Through annotations and other tools, assertions about information (such
as what is useful, accurate, and/or easy to understand) that are
presently lost in a sea of emails and other conversational tools can
be permanently bound to the information they discuss.


3.1

"Metadata Augmentation" seems odd to restrict to human beings.  Surely
agents can also augment metadata.  

We should add the topic of "Metadata presentation".

"It is essential that instance validation be performed to guard
against errors and inconsistencies".  I question essentialness, and
have more of a "best effort" attitude.  It will be useful to validate
instances, but at the scale we intend to work I doubt we can get it
completely right.

3.2.6 

I am made nervous by the discussion of inferencing search.  This
formulation takes a tractable database problem and turns it into an
NP-hard or even undecidable computational nightmare.  I feel much more
secure factoring the problem into "search what's currently explicit in
the database" and "use tools to infer (best effort) new information
that can be added to the database".

Equivalence does feel like a "safe" specialized form of inference
worth giving attention to.
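
A minimal sketch of that factoring, assuming the repository can be
modelled as a set of (subject, predicate, object) tuples. The names
search, materialize_equivalence and the "sameAs" predicate are
hypothetical, chosen only for illustration. Search looks at explicit
triples only; a separate best-effort pass copies statements across
equivalent subjects and writes them back as ordinary triples.

```python
SAME_AS = "sameAs"  # assumed name for the equivalence predicate


def search(store, predicate, obj):
    """Return subjects that have an explicit (s, predicate, obj) triple.

    No inference happens here; this is a plain lookup over the store.
    """
    return {s for (s, p, o) in store if p == predicate and o == obj}


def materialize_equivalence(store):
    """Best-effort offline step: copy triples across equivalent subjects.

    Equivalence classes are built with union-find from the sameAs
    triples. Only subjects are rewritten in this sketch, not objects.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, p, o in store:
        if p == SAME_AS:
            parent[find(s)] = find(o)

    # Group every node that appeared in a sameAs triple by its root.
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)

    inferred = set()
    for s, p, o in store:
        if p == SAME_AS or s not in parent:
            continue
        for alias in groups[find(s)]:
            inferred.add((alias, p, o))
    return store | inferred


store = {("a", SAME_AS, "b"), ("a", "title", "X")}
explicit_hits = search(store, "title", "X")
expanded_hits = search(materialize_equivalence(store), "title", "X")
```

Here explicit_hits contains only "a", while expanded_hits also contains
"b", and the query itself never had to reason about equivalence.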

3.2.7

Two issues seem scrambled here.  One is "how are things named" and the
obvious answer is "URIs".  A separate one is "Should there be
canonical URIs that can be deduced from what you are looking for"?  I
believe the answer to the second is no.  URIs should be opaque (for
example, random to avoid collisions).  The process of "figuring out
the right URI for something" is a type of search/retrieval problem.
Instead of squeezing this search/retrieval into a specialized "figure
out the URI" task, incorporate it in the standard search framework.
Any information used to define a canonical URI can instead be used as
metadata on the object, and any knowledge of how to construct the URI
can then be turned into a specification of its metadata.
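
As a rough illustration, assuming a catalog that is just a dictionary
from URI to metadata (mint_uri, register and find are hypothetical
helpers, not part of any existing system): the identifier is random and
opaque, and whatever would have gone into a canonical name is stored as
metadata and found through ordinary search.

```python
import uuid


def mint_uri():
    """Return an opaque identifier: random, so it carries no meaning."""
    return "urn:uuid:" + str(uuid.uuid4())


def register(catalog, **metadata):
    """Store an object's metadata under a freshly minted opaque URI."""
    uri = mint_uri()
    catalog[uri] = dict(metadata)
    return uri


def find(catalog, **wanted):
    """Return the URIs whose metadata matches every requested field."""
    return [
        uri
        for uri, md in catalog.items()
        if all(md.get(key) == value for key, value in wanted.items())
    ]


catalog = {}
map_uri = register(catalog, collection="maps", year=1850)
# What might have been baked into a canonical name becomes a query.
hits = find(catalog, collection="maps", year=1850)
```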

3.2.8

I don't understand planned distinctions between type and class.

3.3.1 Dissemination of information to humans.

Current thinking is to have an ontology describing how metadata is
supposed to be viewed (e.g., which is worth people seeing, which is only
interesting to agents, which uniquely defines an object, etc.)

I can write text, but perhaps for now will settle for advertising a
2-page document that expresses the main ideas:
http://haystack.lcs.mit.edu/papers/www2003-ui.pdf
More details in
http://haystack.lcs.mit.edu/papers/swfat2003.pdf

Some of this is already touched upon in section 3.5.  Perhaps 3.3.1
should move there?


3.4

Distributed resources add a whole different scope.  They open up a host
of nasty problems, of course.  We could avoid them by limiting our
dealings with distributed metadata to devising a simple
block-transfer protocol, getting all the metadata to a single
location, and dealing with it there.  The metadata might not be fully
up to date, but it avoids a lot of trouble.  Since even in the
centralized scenario everything is hard, perhaps we defer distributed
search?
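
A minimal sketch of the block-transfer idea, assuming each remote
source can hand over its full metadata as one block of triples. The
names pull_block and search_all, and the fetch callables, are
hypothetical stand-ins; no real protocol is implied. Each pull replaces
the previous copy from that source, and all search happens locally.

```python
import time


def pull_block(central, source_name, fetch):
    """Replace everything previously copied from source_name.

    fetch is any callable returning an iterable of (s, p, o) triples;
    here it stands in for the actual transfer.
    """
    central[source_name] = {
        "triples": set(fetch()),
        "fetched_at": time.time(),  # lets callers see how stale a copy is
    }


def search_all(central, predicate, obj):
    """Search only the local copies; no remote calls at query time."""
    return {
        s
        for block in central.values()
        for (s, p, o) in block["triples"]
        if p == predicate and o == obj
    }


central = {}
pull_block(central, "siteA", lambda: [("x", "type", "Image")])
pull_block(central, "siteB", lambda: [("y", "type", "Image")])
images = search_all(central, "type", "Image")
```

The copies can lag behind the sources between pulls, which is the
staleness trade-off mentioned above.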


4.6

It is definitely important and interesting, but web site ingest feels
less within Simile scope than all the other use cases, perhaps
because of its lack of metadata issues.  Seems more like a straight
preservation task?

4.7 

While human-opinion metadata is mentioned, the bulk of this item
talks about "usage based" metadata like query history.  These are
quite different: e.g., usage-based metadata is collected without the
user noticing, while opinion metadata needs an interface that lets the
user record it.

Received on Friday, 4 April 2003 05:42:36 UTC