RE: Proposed text for provenance section

I've found the discussion really interesting... it is one of those issues 
that is at the heart of lots of usability/requirements questions. Namely, 
that knowing/predicting whether something will be useful in practice, when 
it has never previously been available and thus does not match 
existing/traditional practice, is simply impossible - discuss ;-) [is that 
a joke that translates?]

And my experience, for one, is that any predictions are as likely as not 
to be wrong - [e.g. 'who needs to send text messages on a stupid, small, 
difficult-to-use mobile phone keyboard when you can speak to the person or 
leave a message?', or 'teachers really want to be able to customise the 
behaviour of the computer program for their students' or ...]

I can think of a few potential uses for the provenance metadata for the 
descriptive metadata, e.g. 1) primary research about library 
management/resource processes... possibly leading to a change in practices, 
and 2) if external members of communities of users are actually inputting, 
editing or annotating descriptive metadata at any stage. There are possibly 
many others as yet un-thought of - although I make no claims about how 
useful or widely used they might be.

I guess the balance point is that of effort for likely benefit(s). In this 
case it seems that there *is* a need to capture provenance metadata for 
preservation metadata, and perhaps the same mechanisms and systems can be 
used for the descriptive metadata too? If so, the potential value(s) can be 
explored at low cost?

As to a suggestion for the text, how about something like:

At present we have no evidence that those who actually manage archives 
require the ability to track changes to the *descriptive* metadata over 
time.  In traditional library/information management systems, logs are 
kept around temporarily to track metadata changes, but tracking them long 
term is just not considered important to the core mission of managing the 
\emph{content} over time. 
Schemas change, contexts change, resources get described in myriad ways 
(all at the same time), people make mistakes, fix them, we add stuff, we 
remove stuff, and libraries do not track all this. It is therefore 
difficult to predict the potential value(s) and uses of such data and 
functionality in practice.

However, there are some newer kinds of metadata for which the community 
does seem to want to track provenance: namely, preservation metadata, where 
provenance metadata related to preservation activities is captured and 
preserved -- i.e. what was done to the digital object over time in order to 
preserve it.

Paul



--On 07 July 2003 21:43 -0400 MacKenzie Smith <kenzie@MIT.EDU> wrote:

>
> On the metadata provenance question:
>
> -- I stick by my statement that libraries have not traditionally
> tracked provenance for *descriptive* metadata since it's seen
> as very context-sensitive, subject to change, and not of great
> interest for content management over time (the provenance,
> that is, not the metadata itself).
>
> -- However there are some newer kinds of metadata for which
> the community seems to want to track provenance: namely,
> preservation metadata. The new schema from the National
> Library of New Zealand makes a big point of capturing and
> preserving provenance metadata related to preservation
> activities -- i.e. what was done to the digital object over time
> in order to preserve it.
>
> -- Someone (maybe Eric?) made the point that in a world
> where metadata is coming from god knows where and being
> merged together to describe an item it might be nice to know
> where that metadata *came from* (i.e. the metadata source,
> in the strictest sense of the word provenance). I agree with
> that point of view, but question whether it extends to
> *subsequent* changes to the metadata once it's in our
> environment.
>
>> The reason I used these comments (apart from the fact that MacKenzie is
>> the domain expert here) is that they back up a concern of mine:  I'm
>> getting quite concerned about the complexity this "metadata provenance"
>> issue is bringing. The libraries domain is relatively closed compared
>> to the Web as a whole. I just don't think the open Semantic Web
>> scenario of trawling a tonne of triples from lots of sources and
>> sifting through them to see which ones you believe is one we have to
>> deal with on this project. If a source of complexity can be avoided I
>> think it should be.
>
> Good point. But let's think about the distinctions I'm trying to
> make between *types* of metadata (i.e. descriptive vs. long-term
> management) and *sources* of metadata vs. tracking every
> change over time... different problems, different priorities for
> the data curators...
>
> MacKenzie/
>
>

Received on Tuesday, 8 July 2003 07:05:27 UTC