Re: Updated version of the BP document

From: Annette Greiner <amgreiner@lbl.gov>
Date: Thu, 12 Feb 2015 14:34:27 -0800
Cc: "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Message-Id: <F1BE2EF7-5889-4C0E-8352-AEADED4BD6F3@lbl.gov>
To: Bernadette Farias Lóscio <bfl@cin.ufpe.br>

I have a few notes about the versioning and formats sections.

The first data versioning BP links to a brief schema.org discussion about how a schema will be versioned. I think this is intended as evidence that versioning is the subject of debate, but it doesn’t seem to me a very relevant example. It’s not about versioning data, and it’s not much of a debate. Even if we feel that versioning data is somehow highly debatable, I don’t think it helps the reader to be told that, at least not in a vague way that suggests versioning may not be a good idea. I suggest we remove that sentence. There is also a reference in that same section to the Vocabulary Versioning BP for more on assigning stable URIs, but that BP doesn’t say anything about that topic that isn’t already said in the data versioning BP, so I would suggest removing that reference as well.

In the version history BP, we say “It should be possible for data consumers to understand how the data typically changes from version to version.” I would like to add “and how any two specific versions differ.”

In the introduction to Data Access, we say “For all data on the Web, APIs should be available…” I definitely want to encourage use of APIs, but I don’t think we can say that all data should be made available that way. Many datasets are too small or of interest to too few people to make setting up an API worthwhile.

The first Data Formats BP still has old text in the Intended Outcome that should have been removed. “A machine must be able to :” and the three numbered items below it should be deleted. 

For the BP about providing data in multiple formats, I’d like to add the word “consumer” in the Why section, so that it reads “Providing data in more than one format reduces consumer costs incurred in data transformation.”

The introduction to the Data Formats section doesn’t match the BPs in that section very well anymore. I think what is there now is just leftover from placeholder text. How about if we replace it with something like this?

“The formats in which data is made available to consumers are a key aspect of making that data usable. The best, most flexible access mechanism in the world is pointless unless it serves data in formats that enable use and reuse. Below we detail best practices in selecting formats for your data, both at the level of files and that of individual fields. W3C encourages use of formats that can be used by the widest possible audience and processed most readily by computing systems. Source formats, such as database dumps or spreadsheets, used to generate the final published format, are out of scope. This document is concerned with what is actually published rather than internal systems used to generate the published data.”

-Annette

--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
510-495-2935

On Feb 12, 2015, at 9:45 AM, Bernadette Farias Lóscio <bfl@cin.ufpe.br> wrote:

> Hi all,
> 
> Over the last few weeks we've been working on the BP document, and an updated version is available at [1].
> 
> We made a lot of changes in the metadata section to try to solve some of the problems identified during the review process: some sections (Data Provenance, Data Quality, Data License and Data Versioning) were merged with the metadata section, and some metadata best practices were removed.
> 
> We also removed the Data lifecycle section and made changes to the Provide Unique Identifiers BP, but more improvements are needed. Other minor changes were made throughout the document.
> 
> Looking forward to your feedback!
> 
> cheers,
> Bernadette, Caroline and Newton
> 
> [1] https://github.com/bernafarias/dwbp/blob/gh-pages/bp.html
> 
> -- 
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
> ----------------------------------------------------------------------------
Received on Thursday, 12 February 2015 22:36:08 UTC