Re: Recommendations: specificity from Diane I. Hillmann on 2011-03-29 (public-lld@w3.org from March 2011)

From: Diane I. Hillmann <metadata.maven@gmail.com>
Date: Tue, 29 Mar 2011 08:28:39 -0600
To: Jodi Schneider <jodi.schneider@deri.org>
CC: Karen Coyle <kcoyle@kcoyle.net>, public-lld <public-lld@w3.org>
Message-ID: <4D91EC97.8010809@gmail.com>

On 3/29/11 8:02 AM, Jodi Schneider wrote:
>  Two sharing issues--about audience and about deduplication--occurred 
> to me as I was reading Richard Light's post. We need: (1) Mechanisms 
> to record the audience of descriptions and to deliver the appropriate 
> description. Customization of records is likely to be needed for a 
> long time into the future. Audience considerations are always 
> important, and say, the description may depend on whether the 
> collection is a children's book library for teachers, or a collection 
> of children's novels for early readers. Or whether the collection is 
> aimed at specialist astronomers or in a university general science 
> collection. (Besides audience this could also depend on other factors, 
> i.e. on my mobile I'll want a brief description, yet if I have more 
> screenspace I might want as much info as will fit.) (2) Mechanisms for 
> ensuring records are not overwritten or destroyed nefariously (when I 
> think of "one catalog to rule them all" I worry about censorship 
> becoming easier when there's only one hub for records, and how to 
> ensure that "lots of copies keep stuff safe"). At the same time we 
> need to avoid the many downsides which currently accompany 
> multiplicity and duplication! -Jodi
I think these are important issues, but perhaps we should turn these 
ideas around and come at them not from the 'top' (e.g., the intention 
of the data creators) but from the provenance of the creators 
themselves, which should be part of the statements they provide. For 
instance, you should be able to determine intended audience from who 
provided the description, whether (in your example of children's books) 
the description comes from a publisher or an academic department 
training teachers. We know from past experience that the data creator's 
notion of who they're aiming at in terms of audience is necessarily 
incomplete--they have little idea about the needs of anyone outside 
their limited context, so depending on them to define target audiences 
is probably an exercise in futility.

As for (2), we should be talking about how unnecessary the whole idea of 
de-duplication becomes in a world where statement-level data is the 
norm.  The number and diversity of statements is important information 
when evaluating the usefulness of data, particularly in a machine 
environment.  If you have, for instance, 10 statements about the format 
of an item and 9 of them agree, is that not useful? The duplication here 
supports the validity of those 9 statements that agree.  And, 
particularly when we're talking about a world with numerous points of 
view, accepting all of the available statements as part of an overall 
description of a resource gives us far more to work with, and if we know 
where those statements came from, we can provide either a targeted 
description to a particular set of users or something broader for 
others. When we look ahead to a world where we are using machines to 
assist us in interpreting and improving data, surely we should be 
thinking that more is better?  Why, in the current environment where 
storage is cheap, would we seek to delete or overwrite information?  Old 
habits die hard, certainly, but we should be challenging them where we can.
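
To make the "10 statements, 9 agree" point concrete, here is a minimal sketch of statement-level data that carries its provenance. Everything in it is an assumption made for illustration: the Statement record, its field names, the helper functions agreement and from_sources, and the sample sources are hypothetical and do not come from any existing vocabulary or library.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One assertion about a resource, kept together with who made it."""
    subject: str   # the resource being described (hypothetical identifier)
    prop: str      # the property being asserted, e.g. "format"
    value: str     # the asserted value
    source: str    # provenance: who provided this statement


def agreement(statements, subject, prop):
    """Return the most common value for a property and the share of
    statements asserting it. Nothing is deleted or overwritten."""
    values = [s.value for s in statements
              if s.subject == subject and s.prop == prop]
    if not values:
        return None, 0.0
    value, count = Counter(values).most_common(1)[0]
    return value, count / len(values)


def from_sources(statements, sources):
    """Select only the statements made by the given providers, e.g. to
    build a description targeted at one set of users."""
    return [s for s in statements if s.source in sources]


# Ten statements about the format of one item: nine agree, one does not.
statements = [Statement("item:1", "format", "print", f"library-{i}")
              for i in range(9)]
statements.append(Statement("item:1", "format", "microfilm", "publisher-x"))

value, share = agreement(statements, "item:1", "format")
print(value, share)
```

Under these assumptions the duplicates are the evidence: the agreement share is computed from them, and the dissenting statement stays available to anyone who selects by source instead of by majority.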

Diane

Received on Tuesday, 29 March 2011 14:29:24 UTC