
RE: [dpub-arch] ideas for PWP use cases related to archival services

From: Nicholas Taylor <ntay@stanford.edu>
Date: Tue, 24 May 2016 23:32:50 +0000
To: "public-digipub-ig@w3.org" <public-digipub-ig@w3.org>
Message-ID: <CY1PR02MB1723E0E2B463FC5C5F31ABD8B54F0@CY1PR02MB1723.namprd02.prod.outlook.com>
Hi Tim, Ayla (et al):

I made a number of updates to the archiving use cases. The relevant pull request can be found here: https://github.com/w3c/dpub-pwp-arch/pull/1.

A few notes:

·         I didn’t address the question of permissions, or otherwise call out the case of an individual PWP whose components are served from multiple hosts, because I reconsidered whether these are in scope.

·         I added comments to the retraction use case. I think it’s a specific example of (and is therefore already accounted for by) a combination of other use cases.

·         I think the format migration use case could be eliminated or, at least, I’m not sure what role PWP might have here beyond, e.g., providing mime-type information in the manifest.

Of interest to the group, I’d recommend a recent pre-print from Herbert Van de Sompel, David S.H. Rosenthal, and Michael L. Nelson: Web Infrastructure to Support e-Journal Preservation (and More) (http://arxiv.org/abs/1605.06154). It elaborates on the previously shared “Signposting the Scholarly Web” idea and essentially proposes leveraging ResourceSync and Signposting to support the manifest and the range of possible change operations.

Apologies; I will miss the virtual F2F tomorrow.

~Nicholas

From: Heather Flanagan (RFC Series Editor) [mailto:rse@rfc-editor.org]
Sent: Thursday, May 05, 2016 9:47 AM
To: public-digipub-ig@w3.org
Subject: Re: [dpub-arch] ideas for PWP use cases related to archival services

On 5/3/16 3:13 PM, Timothy Cole wrote:
>
> The last few Archival Task Force calls have generated a few ideas for
> PWP use cases related to archival services. Several of these ideas
> are listed below. Please keep in mind that these are only preliminary
> (i.e., not fully baked – in various stages of development), and in
> some cases what we have now can only be thought of as a placeholder.
> In some instances the use case ideas listed below may overlap; in
> other instances the initial idea may conflate multiple use cases.
>
>
>
> Please respond to this email with suggestions for additional
> archival-related use case ideas, as well as with feedback on the
> ideas listed here that will help get us going on Thursday.
>

I am noting after each use case the requirement(s) I think it's suggesting. My questions are whether or not I've captured the requirements correctly, and what might be missing?

>
>
> We will spend most of Thursday's Archival Task Force call refining
> and making these ideas more granular as needed, and developing
> additional use case ideas.  Some of the following have already been
> added preliminarily to our Archival Use Cases page,
> http://w3c.github.io/dpub-pwp-arch/Archival-UCR.html. Here's an
> initial list of archival-related PWP ideas to get us going Thursday
> (more ideas welcome):
>
>
>
> ·         Initial Capture of a PWP by an Archiving Service: An
> archival service wants to harvest (spider) and save a PWP, and
> expects to find in the manifest the enumeration of what it will need
> to capture to make sure it has all the pieces of the PWP that need to
> be archived, even if these pieces reside on separate servers. (What
> does this mean for the design of the PWP manifest?)

In terms of requirements, this provides input into what's required in a manifest. Also, that a PWP does need to be made available as a single, discrete file.
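
As a rough illustration of the manifest requirement, here is a minimal Python sketch of how a harvester might use the enumeration to confirm it has every piece, including pieces on other servers. The manifest shape (a `resources` list with `url` entries), the example URLs, and the function names are all assumptions made for illustration; none of them come from a PWP draft.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical manifest: field names and URLs are illustrative only.
manifest = {
    "identifier": "https://publisher.example/pwp/article-1",
    "resources": [
        {"url": "https://publisher.example/pwp/article-1/index.html"},
        {"url": "https://publisher.example/pwp/article-1/style.css"},
        {"url": "https://data.example/datasets/42.csv"},
    ],
}

def group_by_host(manifest):
    """Group the enumerated resource URLs by the host that serves them."""
    hosts = defaultdict(list)
    for resource in manifest["resources"]:
        hosts[urlsplit(resource["url"]).netloc].append(resource["url"])
    return dict(hosts)

def missing_resources(manifest, captured_urls):
    """Return the manifest URLs that have not been captured yet."""
    captured = set(captured_urls)
    return [r["url"] for r in manifest["resources"] if r["url"] not in captured]
```

With a manifest like this, the harvester can tell that the capture is incomplete without having to guess at what else the publication links to.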

>
> ·         A new Version of a PWP Component is Published, requiring
> partial re-harvesting: An archival service needs to update an
> Archival Information Package (i.e., a previously harvested PWP)
> because a new version of a component of the PWP has been published.
> (This may in fact be multiple use cases, see below.)
>

I think this is multiple use cases, split at least in part the way you have it below.

>
>
> ·         A Revision of a PWP (or PWP Component?) is Published,
> requiring re-harvesting: An archival service needs to update an
> Archival Information Package (i.e., a previously harvested PWP)
> because it or one of its components has been revised, e.g., a
> spelling error corrected.
>

Requirement: some kind of signaling service that will let the archive know that the files must be updated?
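
One way to picture such a signal is a list of changes that the archive compares against the time of its last harvest. The Python sketch below assumes a hypothetical change list whose entries carry a `url`, a `change` type, and a `lastmod` timestamp; these names and the data are invented for illustration and are not taken from any PWP draft.

```python
from datetime import datetime, timezone

# Hypothetical change list published alongside a PWP (illustrative data).
change_list = [
    {"url": "https://publisher.example/pwp/a1/index.html", "change": "updated",
     "lastmod": datetime(2016, 5, 10, tzinfo=timezone.utc)},
    {"url": "https://publisher.example/pwp/a1/fig1.png", "change": "updated",
     "lastmod": datetime(2016, 4, 1, tzinfo=timezone.utc)},
    {"url": "https://publisher.example/pwp/a1/extra.html", "change": "deleted",
     "lastmod": datetime(2016, 5, 12, tzinfo=timezone.utc)},
]

def plan_updates(change_list, last_harvest):
    """Map each component URL to its most recent change since last_harvest."""
    plan = {}
    # Process entries oldest first so that the latest change per URL wins.
    for entry in sorted(change_list, key=lambda e: e["lastmod"]):
        if entry["lastmod"] > last_harvest:
            plan[entry["url"]] = entry["change"]
    return plan
```

An archive that last harvested on 1 May 2016 would, under these assumptions, re-fetch `index.html`, record the removal of `extra.html`, and leave `fig1.png` alone.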

>
>
> ·         A Retraction Notice of a PWP or PWP Component is Issued: An
> archival service needs to harvest the retraction notice and replace /
> update /  add to the Archival Information Package for the PWP as
> originally harvested to reflect the Retraction Notice issuance.
>

Requirement: some kind of signaling service that will let the archive know that the files must be changed, plus metadata that indicates the retraction date/cause?

>
>
> ·         A PWP or PWP Component is Taken Down: An archival service
> needs to update an Archival Information Package (i.e., a previously
> harvested PWP) because it or one of its components has been taken
> down by the publisher.
>

What happens if the takedown notice comes to the archive and not the publisher?

>
>
> ·         Determining when format migration of a PWP is required. An
> archival service needs to validate that a previously harvested PWP
> and all of its components are still viable in order to determine when
> format migration is required.
>

Definitely needed, but not sure how this translates into requirements for the PWP. It's more an operational item for the archive.
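
If the manifest did carry media-type information for each component, the archive-side check could be as simple as the Python sketch below. The `mediaType` field name, the sample resources, and the archive's list of at-risk formats are all assumptions for illustration.

```python
# Hypothetical manifest entries; "mediaType" is an assumed field name.
resources = [
    {"url": "https://publisher.example/pwp/a1/index.html",
     "mediaType": "text/html"},
    {"url": "https://publisher.example/pwp/a1/movie.swf",
     "mediaType": "application/x-shockwave-flash"},
    {"url": "https://publisher.example/pwp/a1/fig1.png",
     "mediaType": "image/png"},
]

# Media types this imaginary archive has decided are at risk of obsolescence.
AT_RISK = {"application/x-shockwave-flash"}

def migration_candidates(resources, at_risk=AT_RISK):
    """Return URLs of components whose declared media type is at risk."""
    return [r["url"] for r in resources if r.get("mediaType") in at_risk]
```

The decision about which formats are at risk stays with the archive; the PWP would only need to declare the types.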

>
>
> ·         Adding metadata to a PWP to support archiving: A service
> wishes to augment the metadata of a PWP being harvested for archiving
> with additional metadata deemed essential for long-term archiving.
>

Requirement: the PWP needs to allow additional metadata to be added by a third party
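
A minimal Python sketch of what this could look like follows, assuming the PWP metadata can be treated as a dictionary and that third-party additions live in a separate `supplemental` section alongside a note of who added them. Both the section name and the `addedBy` field are hypothetical.

```python
import copy

def add_archival_metadata(pwp_metadata, additions, added_by):
    """Return a copy with third-party metadata appended.

    The publisher's own fields are left untouched; additions are kept in a
    separate, attributed section (a hypothetical layout).
    """
    result = copy.deepcopy(pwp_metadata)
    result.setdefault("supplemental", []).append(
        {"addedBy": added_by, "metadata": dict(additions)}
    )
    return result

# Illustrative usage with invented values.
original = {"title": "An Example Article", "creator": "A. Author"}
augmented = add_archival_metadata(
    original, {"fixityAlgorithm": "sha256"}, added_by="archive.example"
)
```

Keeping the additions separate and attributed means the archive never overwrites what the publisher supplied.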

>
>
> ·         Migrating metadata format: An archiving service needs to
> migrate the metadata associated with a PWP to a scheme that will
> better make sure the metadata  (as distinct from the content of the
> PWP in this case) can be read and understood in the future.
>

Is this another purely operational item that doesn't touch on requirements for the PWP?
-Heather

Received on Tuesday, 24 May 2016 23:33:24 UTC
