Re: Recommendations: specificity from Richard Light on 2011-03-29 (public-lld@w3.org from March 2011)

From: Richard Light <richard@light.demon.co.uk>
Date: Tue, 29 Mar 2011 16:37:06 +0100
To: Jodi Schneider <jodi.schneider@deri.org>
Cc: "Diane I. Hillmann" <metadata.maven@GMAIL.COM>, Karen Coyle <kcoyle@kcoyle.net>, public-lld <public-lld@w3.org>
Message-ID: <Pqfxfe6iyfkNFwd8@light.demon.co.uk>
In message <924EBFCD-7E0F-41E6-A596-D8C4C7CF8091@deri.org>, Jodi 
Schneider <jodi.schneider@deri.org> writes
>Useful thoughts, Diane!
>
>On 29 Mar 2011, at 15:28, Diane I. Hillmann wrote:
>
>> On 3/29/11 8:02 AM, Jodi Schneider wrote:
>>> Two sharing issues--about audience and about deduplication--occurred 
>>>to me as I was reading Richard Light's post. We need: (1) Mechanisms 
>>>to record the audience of descriptions and to deliver the appropriate 
>>>description. Customization of records is likely to be needed for a 
>>>long time into the future. Audience considerations are always 
>>>important, and, say, the description may depend on whether the 
>>>collection is a children's book library for teachers, or a collection 
>>>of children's novels for early readers. Or whether the collection is 
>>>aimed at specialist astronomers or in a university general science 
>>>collection. (Besides audience this could also depend on other 
>>>factors, e.g. on my mobile I'll want a brief description, yet if I 
>>>have more screenspace I might want as much info as will fit.) (2) 
>>>Mechanisms for ensuring records are not overwritten or destroyed 
>>>nefariously (when I think of "one catalog to rule them all" I worry 
>>>about censorship becoming easier when there's only one hub for 
>>>records, and how to ensure that "lots of copies keep stuff safe"). At 
>>>the same time we need to avoid the many downsides which currently 
>>>accompany multiplicity and duplication! -Jodi
>> I think these are important issues, but perhaps we should turn these 
>>ideas around, and come at them not from the 'top' (e.g., the intention 
>>of the data creators), but from the provenance of the creators 
>>themselves, which should be part of the statements they provide. For 
>>instance, you should be able to determine intended audience from who 
>>provided the description, whether (in your example of children's 
>>books) the description comes from a publisher or an academic 
>>department training teachers. We know from past experience that the 
>>data creator's notion of who they're aiming at in terms of audience is 
>>necessarily incomplete--they have little idea about the needs of 
>>anyone outside their limited context, so depending on them to define 
>>target audiences is probably an exercise in futility.
>
>So then this becomes:
>- need ways of determining the appropriate description (ideally without 
>the viewer specifying it directly)

But then, what do we mean by "the description" "in a world where 
statement-level data is the norm"?

Do we invent a higher-level structure within which the statements 
comprising a particular style of description fit, or are we simply 
talking about having multiple rdfs:comment or dbpedia-owl:abstract 
assertions?

Richard
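
One way to picture the second option (many statement-level assertions and 
no higher-level record structure) is the rough Python sketch below. It is 
an illustration only, not anything proposed in this thread: the Statement 
class, the source labels, the "book:1" identifier and the sample values are 
all hypothetical, and the predicate names are plain strings rather than 
real RDF terms. A "description" here is just a view over the statements, 
filtered by who made them, and agreement between sources can be counted 
rather than de-duplicated away.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One assertion about a resource, plus who made it (hypothetical model)."""
    subject: str
    predicate: str
    value: str
    source: str  # provenance of the statement


# Hypothetical sample data: nothing here comes from a real dataset.
STATEMENTS = [
    Statement("book:1", "rdfs:comment",
              "A picture book for early readers.", "publisher"),
    Statement("book:1", "rdfs:comment",
              "Useful for teaching narrative structure.", "teacher-training-dept"),
    Statement("book:1", "dc:format", "hardback", "publisher"),
    Statement("book:1", "dc:format", "hardback", "library-a"),
    Statement("book:1", "dc:format", "paperback", "library-b"),
]


def description_for(statements, subject, predicate, sources):
    """Return the values asserted for subject/predicate by the given sources."""
    return [
        s.value
        for s in statements
        if s.subject == subject and s.predicate == predicate and s.source in sources
    ]


def agreement(statements, subject, predicate):
    """Count how many statements assert each value for subject/predicate."""
    return Counter(
        s.value
        for s in statements
        if s.subject == subject and s.predicate == predicate
    )


if __name__ == "__main__":
    # A description "for early readers" is whatever the publisher asserted.
    print(description_for(STATEMENTS, "book:1", "rdfs:comment", {"publisher"}))
    # All format statements are kept; the tally shows which value has support.
    print(agreement(STATEMENTS, "book:1", "dc:format"))
```

Nothing is overwritten in this sketch: the minority "paperback" statement 
stays available alongside the others, and choosing between them is left to 
whatever consumes the tally.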

>- need ways of mapping characteristics of describers to characteristics 
>of the appropriate viewers
>
>
>> As for (2), we should be talking about how unnecessary the whole idea 
>>of de-duplication becomes in a world where statement-level data is the 
>>norm.
>
>Ok, then this becomes an issue of detecting when two statements are the same.
>
>> The number and diversity of statements is important information when 
>>evaluating the usefulness of data, particularly in a machine 
>>environment. If you have, for instance, 10 statements about the format 
>>of an item and 9 of them agree, is that not useful?
>
>Sure, but also "the majority is always wrong" (i.e. we need more 
>sophisticated ways to track authenticity, provenance, likely 
>correctness)
>
>> The duplication here supports the validity of those 9 statements that 
>>agree.  And, particularly when we're talking about a world with 
>>numerous points of view, accepting all of the available statements as 
>>part of an overall description of a resource gives us far more to work 
>>with, and if we know where those statements came from, we can provide 
>>either a targeted description to a particular set of users or 
>>something broader for others. When we look ahead to a world where we 
>>are using machines to assist us in interpreting and improving data, 
>>surely we should be thinking that more is better?  Why, in the current 
>>environment where storage is cheap would we seek to delete or 
>>overwrite information?  Old habits die hard, certainly, but we should 
>>be challenging them where we can.
>
>Then tracking what the "best" data is (and having algorithms for 
>detecting it) will be important -- because "most recent" != "best"
>
>-Jodi
>
>>
>> Diane
>

-- 
Richard Light
Received on Tuesday, 29 March 2011 15:39:37 UTC