Help with Data Preservation BP

Hi all,

We have had many discussions about the Data Preservation Best Practices,
but some open comments remain that we should try to resolve before the
next publication. For this, we need some help :)

I am copying Christophe Gueret (he was in charge of the Data Preservation
section) on this email and I hope he can help us to resolve these comments
;)


--> Best Practice 28: Assess dataset coverage

Bernadette's comment:

The test of this BP is not a real test.
I think it should be something like this:
"Check if all resources used in the dataset are either already preserved
somewhere or provided along with the dataset."
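
To make the proposed test concrete, here is a minimal sketch of how such a
check could be automated. It assumes we can list the URIs a dataset refers
to, the URIs already preserved elsewhere, and the URIs shipped with the
dump. The function name and all example.org URIs are hypothetical; none of
this comes from the BP text.

```python
def missing_resources(referenced, preserved, bundled):
    """Return referenced URIs that are neither preserved elsewhere nor bundled."""
    covered = set(preserved) | set(bundled)
    return sorted(set(referenced) - covered)


# Hypothetical inputs: resources the dataset depends on.
referenced = {
    "http://example.org/vocab/stations",
    "http://example.org/vocab/units",
    "http://example.org/codes/regions",
}
# Hypothetical: resources known to be preserved somewhere already.
preserved = {"http://example.org/vocab/units"}
# Hypothetical: resources provided along with the dataset dump.
bundled = {"http://example.org/vocab/stations"}

missing = missing_resources(referenced, preserved, bundled)
print(missing)  # an empty list would mean the coverage test passes
```

With these inputs the check reports one uncovered resource, so the test
would fail until that resource is preserved or bundled.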


--> Best Practice 29: Use a trusted serialization format for preserved data
dumps

Annette's comment:

"If we keep this, it should at least offer JSON as an acceptable example.
JSON is the current overwhelming standard for APIs. This talks about
"sending data dumps for long-term preservation" and "data depositors".
Where are the data being sent? Is it on the Web? The bad example would pass
the How to Test."

Bernadette's comment:

I am not sure we need this BP. If we are talking about the preservation of
Data on the Web, then the data is probably already in a standard
machine-readable format (BP13). In that case, why (or when) do we need this
BP?

--> Best Practice 30: Update the status of identifiers

Annette's comment:

"It's not quite clear what we are suggesting get linked to what. The Why
talks about linking preserved datasets with the original URI. Are we saying
the original URI should continue to point to the preserved dataset? If
that's the case, then what does preservation mean? There is also discussion
of saving snapshots as versions, which seems to me is covered better under
versioning.

We say "A link is maintained between the URI of a resource, the most
up-to-date description available for it, and preserved descriptions." One
link can only join two resources. Should people preserve old descriptions?
Maybe descriptions of older versions are what was meant?

A 410 status only makes sense if there's nothing served at the URI, which
isn't the case if the advice here is followed. 303 seems like a good
option."
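
One possible reading of the last point, as a minimal sketch: answer 303
with a pointer to the preserved copy when one exists, and answer 410 only
when nothing at all is served for the URI. The function name, its
parameters, and the archive URI are hypothetical and only illustrate the
argument; the BP does not prescribe this logic.

```python
from typing import Dict, Optional, Tuple


def response_for(uri_is_live: bool,
                 preserved_uri: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Pick an HTTP status and headers for a request to the original URI."""
    if uri_is_live:
        # The resource is still served at its original URI.
        return 200, {}
    if preserved_uri is not None:
        # Something is still available: redirect to the preserved copy.
        return 303, {"Location": preserved_uri}
    # Nothing is served any more, so "Gone" is accurate.
    return 410, {}


# Hypothetical archive location for a preserved dataset.
archived = "http://archive.example.org/datasets/1"
print(response_for(False, archived))
print(response_for(False, None))
```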

Kind regards,
Bernadette




-- 
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------

Received on Tuesday, 26 April 2016 20:35:01 UTC