- From: Richard Light <richard@light.demon.co.uk>
- Date: Tue, 29 Mar 2011 16:37:06 +0100
- To: Jodi Schneider <jodi.schneider@deri.org>
- Cc: "Diane I. Hillmann" <metadata.maven@GMAIL.COM>, Karen Coyle <kcoyle@kcoyle.net>, public-lld <public-lld@w3.org>
In message <924EBFCD-7E0F-41E6-A596-D8C4C7CF8091@deri.org>, Jodi Schneider <jodi.schneider@deri.org> writes
>Useful thoughts, Diane!
>
>On 29 Mar 2011, at 15:28, Diane I. Hillmann wrote:
>
>> On 3/29/11 8:02 AM, Jodi Schneider wrote:
>>> Two sharing issues--about audience and about deduplication--occurred
>>>to me as I was reading Richard Light's post. We need: (1) Mechanisms
>>>to record the audience of descriptions and to deliver the appropriate
>>>description. Customization of records is likely to be needed for a
>>>long time into the future. Audience considerations are always
>>>important, and say, the description may depend on whether the
>>>collection is a children's book library for teachers, or a collection
>>>of children's novels for early readers. Or whether the collection is
>>>aimed at specialist astronomers or in a university general science
>>>collection. (Besides audience this could also depend on other
>>>factors, i.e. on my mobile I'll want a brief description, yet if I
>>>have more screenspace I might want as much info as will fit.) (2)
>>>Mechanisms for ensuring records are not overwritten or destroyed
>>>nefariously (when I think of "one catalog to rule them all" I worry
>>>about censorship becoming easier when there's only one hub for
>>>records, and how to ensure that "lots of copies keep stuff safe"). At
>>>the same time we need to avoid the many downsides which currently
>>>accompany multiplicity and duplication! -Jodi
>> I think these are important issues, but perhaps we should turn these
>>ideas around, and come at them not from the 'top' (e.g., the intention
>>the data creators), but from the provenance of the creators
>>themselves, that should be part of the statements they provide. For
>>instance, you should be able to determine intended audience from who
>>provided the description, whether (in your example of children's
>>books) the description comes from a publisher or an academic
>>department training teachers. We know from past experience that the
>>data creator's notion of who they're aiming at in terms of audience is
>>necessarily incomplete--they have little idea about the needs of
>>anyone outside their limited context, so depending on them to define
>>target audiences is probably an exercise in futility.
>
>So then this becomes:
>- need ways of determining the appropriate description (ideally without
>the viewer specifying it directly)

But then, what do we mean by "the description"

>>in a world where statement-level data is the norm.

??  Do we invent a higher-level structure within which the statements
comprising a particular style of description fit, or are we simply
talking about having multiple rdfs:comment or dbpedia-owl:abstract
assertions?

Richard

>- need ways of mapping characteristics of describers to characteristics
>of the appropriate viewers
>
>
>> As for (2), we should be talking about how unnecessary the whole idea
>>of de-duplication becomes in a world where statement-level data is the
>>norm.
>
>Ok, then this becomes an issue of detecting when two statements are the same.
>
>> The number and diversity of statements is important information when
>>evaluating the usefulness of data, particularly in a machine
>>environment. If you have, for instance, 10 statements about the format
>>of an item and 9 of them agree, is that not useful?
>
>Sure, but also "the majority is always wrong" (i.e. we need more
>sophisticated ways to track authenticity, provenance, likely
>correctness)
>
>> The duplication here supports the validity of those 9 statements that
>>agree. And, particularly when we're talking about a world with
>>numerous points of view, accepting all of the available statements as
>>part of an overall description of a resource gives us far more to work
>>with, and if we know where those statements came from, we can provide
>>either a targeted description to a particular set of users or
>>something broader for others. When we look ahead to a world where we
>>are using machines to assist us in interpreting and improving data,
>>surely we should be thinking that more is better? Why, in the current
>>environment where storage is cheap would we seek to delete or
>>overwrite information? Old habits die hard, certainly, but we should
>>be challenging them where we can.
>
>Then tracking what the "best" data is (and having algorithms for
>detecting it) will be important -- because "most recent" != "best"
>
>-Jodi
>
>>
>> Diane

-- 
Richard Light
Received on Tuesday, 29 March 2011 15:39:37 UTC