- From: Diane I. Hillmann <metadata.maven@gmail.com>
- Date: Tue, 29 Mar 2011 08:28:39 -0600
- To: Jodi Schneider <jodi.schneider@deri.org>
- CC: Karen Coyle <kcoyle@kcoyle.net>, public-lld <public-lld@w3.org>
On 3/29/11 8:02 AM, Jodi Schneider wrote:
> Two sharing issues--about audience and about deduplication--occurred
> to me as I was reading Richard Light's post. We need:
>
> (1) Mechanisms to record the audience of descriptions and to deliver
> the appropriate description. Customization of records is likely to be
> needed for a long time into the future. Audience considerations are
> always important, and say, the description may depend on whether the
> collection is a children's book library for teachers, or a collection
> of children's novels for early readers. Or whether the collection is
> aimed at specialist astronomers or in a university general science
> collection. (Besides audience this could also depend on other factors,
> i.e. on my mobile I'll want a brief description, yet if I have more
> screenspace I might want as much info as will fit.)
>
> (2) Mechanisms for ensuring records are not overwritten or destroyed
> nefariously (when I think of "one catalog to rule them all" I worry
> about censorship becoming easier when there's only one hub for
> records, and how to ensure that "lots of copies keep stuff safe"). At
> the same time we need to avoid the many downsides which currently
> accompany multiplicity and duplication!
>
> -Jodi

I think these are important issues, but perhaps we should turn these ideas around and come at them not from the 'top' (e.g., the intention of the data creators) but from the provenance of the creators themselves, which should be part of the statements they provide. For instance, you should be able to determine intended audience from who provided the description: whether (in your example of children's books) the description comes from a publisher or from an academic department training teachers.
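To make that concrete, here is a rough sketch of what carrying provenance on each statement might look like. Everything in it is hypothetical and only for illustration: the `Statement` shape, the `provider_type` labels, and the sample data do not come from any existing vocabulary or system.

```python
from collections import namedtuple

# Hypothetical statement shape: every assertion records who made it.
Statement = namedtuple(
    "Statement", ["resource", "prop", "value", "provider", "provider_type"]
)

# Made-up sample data for the children's book example.
statements = [
    Statement("book:1", "description",
              "A lively adventure for readers aged 7 to 9.",
              "Example Press", "publisher"),
    Statement("book:1", "description",
              "Useful for teaching narrative structure in early grades.",
              "Example University, Dept. of Education", "teacher_training"),
]


def descriptions_from(stmts, provider_type):
    """Return the descriptions contributed by one kind of provider."""
    return [
        s.value
        for s in stmts
        if s.prop == "description" and s.provider_type == provider_type
    ]


# Nobody declared an "audience" here; it is inferred from the source.
for text in descriptions_from(statements, "teacher_training"):
    print(text)
```

The point of the sketch is that no statement is dropped or merged: both descriptions stay in the pool, and the choice of which one to show is made at delivery time from the provenance.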
We know from past experience that the data creator's notion of who they're aiming at in terms of audience is necessarily incomplete--they have little idea about the needs of anyone outside their limited context, so depending on them to define target audiences is probably an exercise in futility.

As for (2), we should be talking about how unnecessary the whole idea of de-duplication becomes in a world where statement-level data is the norm. The number and diversity of statements is important information when evaluating the usefulness of data, particularly in a machine environment. If you have, for instance, 10 statements about the format of an item and 9 of them agree, is that not useful? The duplication here supports the validity of those 9 statements that agree. And, particularly when we're talking about a world with numerous points of view, accepting all of the available statements as part of an overall description of a resource gives us far more to work with, and if we know where those statements came from, we can provide either a targeted description to a particular set of users or something broader for others.

When we look ahead to a world where we are using machines to assist us in interpreting and improving data, surely we should be thinking that more is better? Why, in the current environment where storage is cheap, would we seek to delete or overwrite information? Old habits die hard, certainly, but we should be challenging them where we can.

Diane
Received on Tuesday, 29 March 2011 14:29:24 UTC