- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Fri, 4 Apr 2003 11:42:23 +0100
- To: SIMILE public list <www-rdf-dspace@w3.org>
FYI

-----Original Message-----
From: David R. Karger [mailto:karger@theory.lcs.mit.edu]
Sent: 04 April 2003 06:09
To: mick.bass@hp.com; Mark_Butler@hplb.hpl.hp.com
Subject: notes on use cases

Under "2.3 Simile Challenges":

We would like to support effective human retrieval over our rich metadata environment. No matter how much information we capture in the repository, it can only be useful if users can find it. Given our rich metadata environment, we will need interfaces that let users navigate the repository, browse the metadata, formulate queries, and evaluate the relevance of retrieved information.

Under "2.4 Simile Opportunities":

Provide a capture point for content-bearing community discussions. Through annotations and other tools, assertions about information (such as what is useful, accurate, and/or easy to understand) that are presently lost in a sea of emails and other conversational tools can be permanently bound to the information they discuss.

3.1 "Metadata Augmentation" seems odd to restrict to human beings. Surely agents can also augment metadata.

We should add the topic of "Metadata presentation".

"It is essential that instance validation be performed to guard against errors and inconsistencies". I question essentialness, and have more of a "best effort" attitude. It will be useful to validate instances, but at the scale we intend to work I doubt we can get it completely right.

3.2.6 I am made nervous by the discussion of inferencing search. This formulation takes a tractable database problem and turns it into an NP-hard or even undecidable computational nightmare. I feel much more secure factoring the problem into "search what's currently explicit in the database" and "use tools to infer (best effort) new information that can be added to the database". Equivalence does feel like a "safe" specialized form of inference worth giving attention to.

3.2.7 Two issues seem scrambled here. One is "how are things named" and the obvious answer is "URIs".
A separate one is "Should there be canonical URIs that can be deduced from what you are looking for?" I believe the answer to the second is no. URIs should be opaque (for example, random to avoid collisions). The process of "figuring out the right URI for something" is a type of search/retrieval problem. Instead of squeezing this search/retrieval into a specialized "figure out the URL" task, incorporate it in the standard search framework. Any information used to define a canonical URL can instead be used as metadata on the object, and any knowledge of how to construct the URL can then be turned into a specification of its metadata.

3.2.8 I don't understand the planned distinctions between type and class.

3.3.1 Dissemination of information to humans. Current thinking is to have an ontology describing how metadata is supposed to be viewed (e.g., which is worth people seeing, which is only interesting to agents, which uniquely defines an object, etc.). I can write text, but perhaps for now will settle for advertising a 2-page document that expresses the main ideas:
http://haystack.lcs.mit.edu/papers/www2003-ui.pdf
More details in
http://haystack.lcs.mit.edu/papers/swfat2003.pdf
Some of this is already touched upon in section 3.5. Perhaps 3.3.1 should move there?

3.4 Distributed resources adds a whole different scope. It opens up a host of nasty problems, of course. We could avoid them by limiting our dealing with distributed metadata to devising a simple block-transfer protocol, getting all the metadata to a single location, and dealing with it there. The metadata might not be fully up to date, but it avoids a lot of trouble. Since even in the centralized scenario everything is hard, perhaps we defer distributed search?

4.6 It is definitely important and interesting, but web site ingest feels less within Simile scope than all the other use cases, perhaps because of its lack of metadata issues. Seems more like a straight preservation task?

4.7 While human-opinion metadata is mentioned, the bulk of this item talks about "usage-based" metadata like query history. These are quite different: e.g., usage-based metadata is collected without the user noticing, while opinion metadata needs an interface that lets the user record it.
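To make the 3.2.7 point concrete, here is a minimal Python sketch of opaque URIs combined with ordinary metadata search. It is only an illustration under stated assumptions, not part of any Simile design: the Repository class, its add and search methods, and the metadata fields are all hypothetical names invented for this example.

```python
import uuid


class Repository:
    """Hypothetical in-memory store mapping opaque URIs to metadata."""

    def __init__(self):
        self._items = {}

    def add(self, metadata):
        # The URI is random and carries no meaning, so nothing about the
        # object can (or needs to) be deduced from it.
        uri = "urn:uuid:" + str(uuid.uuid4())
        self._items[uri] = dict(metadata)
        return uri

    def search(self, **criteria):
        # "Figuring out the right URI" becomes a plain metadata query:
        # return every URI whose metadata matches all given fields.
        return [
            uri
            for uri, md in self._items.items()
            if all(md.get(key) == value for key, value in criteria.items())
        ]


repo = Repository()
# Information that might have gone into a canonical URL is stored as
# metadata instead (illustrative values only).
uri = repo.add({"title": "Lecture 1", "year": 2003})
found = repo.search(title="Lecture 1", year=2003)
```

Whatever rule one would have used to construct a canonical URL is expressed here as the search criteria, so the lookup goes through the same query path as any other retrieval.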
Received on Friday, 4 April 2003 05:42:36 UTC