Re: Table of Contents DWBP

From: Christophe Guéret <christophe.gueret@dans.knaw.nl>
Date: Wed, 5 Nov 2014 15:42:53 +0100
Message-ID: <CABP9CAG_V3dCA-20TpVPmPW5Mrefja3-Fu_yKzAHUtyNARr5OA@mail.gmail.com>
To: "Manuel.CARRASCO-BENITEZ@ec.europa.eu" <Manuel.CARRASCO-BENITEZ@ec.europa.eu>
CC: "bfl@cin.ufpe.br" <bfl@cin.ufpe.br>, Christophe Gueret <christophe.gueret@dans.knaw.nl>, "aisaac@few.vu.nl" <aisaac@few.vu.nl>, "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Hi Tomas,

Yes, we already discussed that. This IETF document rightly describes the
challenges archives face, but this is, I agree, out of our scope. For
instance, the fact that preferred serialisation formats evolve over time,
and that what is preserved now as JSON-LD could be requested in another
format 20 years from now, is none of our business (IMHO). Still, we could
issue some recommendations on how best to ship data to an archive so that we
make their job easier. We could also have some recommendations about what
an archive should do to make the data easier to find and re-use. Regarding
that last point, indexing all the subjects in a given data set could be a
good thing to do, as future data consumers are likely to look for a
preserved description of some particular URI.
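
To make the indexing idea concrete, here is a minimal sketch, assuming the
preserved dump is serialised as N-Triples with one triple per line. The
function name, the sample data and the example.org URIs are all made up
for illustration; a real archive would more likely rely on a proper RDF
parser than on splitting lines.

```python
from collections import defaultdict


def index_subjects(ntriples_text):
    """Map each subject URI to the 1-based line numbers of its triples.

    Assumption: the input is N-Triples, one triple per line. Blank-node
    subjects (starting with "_:") are skipped because they are not URIs.
    """
    index = defaultdict(list)
    for line_no, line in enumerate(ntriples_text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # ignore empty lines and comments
        subject = line.split(None, 1)[0]  # first whitespace-separated term
        if subject.startswith("<") and subject.endswith(">"):
            index[subject[1:-1]].append(line_no)
    return dict(index)


# Hypothetical three-triple dump used only to exercise the function.
SAMPLE = """\
<http://example.org/a> <http://purl.org/dc/terms/title> "A" .
<http://example.org/b> <http://purl.org/dc/terms/title> "B" .
<http://example.org/a> <http://purl.org/dc/terms/creator> <http://example.org/b> .
"""

subject_index = index_subjects(SAMPLE)
```

With such an index stored next to the dump, a consumer asking for the
preserved description of one URI would not need to scan the whole file.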

The URI patterns in 7.5 are OK, but this section could be revised to allow
for status codes other than 301, and to include examples that use the
persistent identifier (preferably a URI) assigned to a given data dump at
ingest time.
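
As a rough illustration of what such examples might look like, the sketch
below resolves a persistent identifier to its current location with a
redirect code chosen per entry. The identifiers, the archive.example.org
URLs and the choice of code for each entry are assumptions made up for
this example; they are not taken from COMURI.

```python
from http import HTTPStatus

# Hypothetical table: persistent identifier assigned at ingest time
# -> (redirect status to answer with, current location of the dump).
REDIRECTS = {
    "/pid/dump-0001": (HTTPStatus.SEE_OTHER,
                       "https://archive.example.org/dumps/0001.nt.gz"),
    "/pid/dump-0002": (HTTPStatus.MOVED_PERMANENTLY,
                       "https://archive.example.org/dumps/0002.nt.gz"),
    "/pid/dump-0003": (HTTPStatus.TEMPORARY_REDIRECT,
                       "https://mirror.example.org/dumps/0003.nt.gz"),
}


def resolve(path):
    """Return (status code, headers) for a request on a persistent identifier."""
    status, location = REDIRECTS.get(path, (HTTPStatus.NOT_FOUND, None))
    headers = {"Location": location} if location else {}
    return int(status), headers
```

For instance, resolve("/pid/dump-0001") answers with a 303 and a Location
header, while an unknown identifier yields a 404 with no headers.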

Regards,
Christophe




BTW there is a typo in "Redirectio services", section 7.4 of
http://dragoman.org/comuri.html#nature-of-the-resources

On 4 November 2014 15:29, Manuel.CARRASCO-BENITEZ@ec.europa.eu <
Manuel.CARRASCO-BENITEZ@ec.europa.eu> wrote:

> Dear all,
>
> As I previously commented, data preservation is a complex issue and an
> IETF WG worked on it for a few years:
>   Long-Term Archive Service Requirements -
> http://www.ietf.org/rfc/rfc4810.txt
>
> It is discussed in COMURI, though data preservation itself is out of
> scope: perhaps I am too aware of the complexities :-)
>
> A few extracts -   http://dragoman.org/comuri.html#nature-of-the-resources
>
>      7. Nature of the resources
>
>      7.1 Ultrapersistent URI
>      URI creation has to take into account all identification scenarios:
> original site, archival sites, and offline data;
>
>      7.5 Data archival
>     The following data archiving techniques are considered:
>          Online archival sites
>          Offline archival
>         Pack
>
> Regards
> Tomas
>
> From: Bernadette Farias Lóscio [mailto:bfl@cin.ufpe.br]
> Sent: Friday, October 24, 2014 4:32 PM
> To: Christophe Guéret
> Cc: Antoine Isaac; public-dwbp-wg@w3.org
> Subject: Re: Table of Contents DWBP
>
> Hello Christophe,
>
> Thanks for your feedback!
>
> Just to check whether I understood your point... Are you proposing to add
> Data Archival as a new phase in the life cycle?
>
> I think it is also important to discuss the difference between Data
> Preservation and Data Archival. Could you please let me know what your
> understanding of these two concepts is?
>
> Thank you!
> Bernadette
>
> 2014-10-24 11:13 GMT-03:00 Christophe Guéret <
> christophe.gueret@dans.knaw.nl>:
> Dear Caroline, all,
> Looking at the documents, I'd like to suggest adding a section "Data
> archival" to the "Best Practices Themes (challenges)" of [1]. Then the two
> points
> • Preservation
> • Data versioning
> currently found in data publication could be moved there.
>
> The idea is that publication and archival are two different
> phases of the work-flow.
> We can add versioning to the latter, arguing that one would like to
> preserve previous versions and only serve the latest.
>
> Christophe
>
> [1] https://www.w3.org/2013/dwbp/wiki/Proposed_structure
>
> On 24 October 2014 16:04, Antoine Isaac <aisaac@few.vu.nl> wrote:
> Dear Caroline, all,
>
> As requested in today's call, I had a brief look at the "Draft of the
> content structure of the Best Practices Themes" and "Description of each
> theme on the Table of Contents" at [1].
> I see that the section on controlled vocabularies, which is being worked
> on by Mark and me and was mentioned in the previous content list at [2],
> is missing. Was it intentionally left out?
>
> If not, I think it could be in the "data organization" section of the
> proposed structure at [1].
>
> Kind regards
>
> Antoine
>
> [1]https://www.w3.org/2013/dwbp/wiki/Proposed_structure
> [2]https://www.w3.org/2013/dwbp/wiki/Main_Page#Best_Practices
>
>
> On 10/21/14 10:13 PM, Caroline Burle wrote:
> > Ghislain,
> >
> > thank you for your comments, the suggestion to add different
> > questions/issues is very welcome! In fact, Bernadette, Newton and I had
> > a 2h call yesterday and put some of the questions you suggested on the
> > Proposed Structure[1].
> >
> > Phil Archer also suggested adding “Feedback” as an item, so that the
> > Data on the Web Lifecycle would actually be a cycle. This is also on the
> > Wiki.
> >
> > Furthermore, we added to the TPAC Goals[2]:
> > Description of each theme on the Table of Contents
> > Draft of the content structure of the Best Practices Themes
> >
> > Kind regards,
> > Caroline
> >
> > [1] https://www.w3.org/2013/dwbp/wiki/Proposed_structure#Mapping_of_Themes.5B1.5D
> > [2] https://www.w3.org/2013/dwbp/wiki/TPAC_2014
> >
> >
> > On 17/10/14 10:00, Ghislain Atemezing wrote:
> >> Hi Caroline,
> >> Thanks for this starting document for BP document structure.
> >>
> >>> Bernadette and I edited the TPAC Deliverable Goals.
> >>>
> >>> We have also edited the Proposed Structure of the BP document [2]. We
> >>> have only started discussing the Table of Contents, but it would be
> >>> great if you could take a look and make comments.
> >>
> >> I would suggest adding, for each item, different questions/issues that
> >> we might address, to be sure that we capture all the requirements.
> >> Find below a first attempt at what I mean.
> >>
> >> ###################################
> >> 1- Data Publisher
> >>     Metadata: What are the minimum metadata to describe a dataset?
> >>         Licenses: How to identify licenses suitable for a dataset?
> >>         Data quality: How can publishers monitor the quality of their datasets?
> >>         Provenance: What type of provenance information to attach at metadata level? Discuss the granularity of PROV data: either meta or fine-grained level.
> >>     Interoperability
> >>         What makes a dataset interoperable?
> >>     Data access: How to decide whether to publish a dump versus API options (SPARQL, etc.)? What requirements to take into account (reliability, time to query dataset, etc.)?
> >>         Data formats: Advice on which formats to use when publishing a dataset (at least 1/2-star compliance?)
> >>         Data granularity: How to publish a catalog versus other types of data?
> >>     Sensitive data (privacy): How to identify them? Are they worth publishing? What are the security mechanisms? Licenses?
> >>     Data identification: How could a publisher identify related datasets for interconnection or reuse?
> >>     Persistence (data identification?): What rules to take into account when releasing a dataset:
> >>         URI?
> >>         Status of the dataset?
> >>         Disclaimer?
> >>         etc.
> >>     Data versioning: How to describe and track the versions of a dataset?
> >>
> >> 2- Data Consumer
> >>     Data usage: Models to annotate different applications of datasets (e.g., data visualizations, data summarization, data republishing)
> >>
> >> ######################
> >>
> >> WDYT ?
> >>
> >> Cheers,
> >>
> >> Ghislain
> >
>
>
>
> --
> Researcher
> +31(0)6 14576494
> christophe.gueret@dans.knaw.nl
>
> Data Archiving and Networked Services (DANS)
> DANS promotes sustainable access to digital research data. See
> www.dans.knaw.nl for more information. DANS is an institute of KNAW and
> NWO.
>
> Please note: as of 1 January we have a new address:
> DANS | Anna van Saksenlaan 51 | 2593 HW Den Haag | Postbus 93067 | 2509 AB
> Den Haag | +31 70 349 44 50 | info@dans.knaw.nl | www.dans.knaw.nl
>
> Let's build a World Wide Semantic Web!
> http://worldwidesemanticweb.org/
>
> e-Humanities Group (KNAW)
>
>
>
>
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
>
> ----------------------------------------------------------------------------
>



Received on Wednesday, 5 November 2014 14:43:41 UTC
