Fwd: Re: [coders] Last call working drafts for data on the web best practices

Forwarding for documentation of commenter approval.

-Annette



-------- Forwarded Message --------
Subject:  Re: [coders] Last call working drafts for data on the web best practices
Date:  Thu, 16 Jun 2016 16:10:29 -0700
From:  David Skinner <deskinner@lbl.gov>
To:  Annette Greiner <amgreiner@lbl.gov>



I like it! Thanks.

On Thursday, June 16, 2016, Annette Greiner <amgreiner@lbl.gov> wrote:

    Hi David,

    I took a stab at reworking some of the enrichment BP. Take a look at
    the diffs here and let me know if they address your concerns.

    https://github.com/agreiner/dwbp/commit/ce1b1a8c03cd1b6017f029ad77f41c86f8f9c86e
    (above line 3898)
    https://github.com/agreiner/dwbp/commit/540ed3b236068858936d9a03d7c8218945f609d7

    -Annette
    P.S., I'm hoping to issue a pull request today.


    On 6/3/16 1:37 PM, David Skinner wrote:
>     Hi Annette,
>
>     Most requested is one metric that people easily get, but more
>     broadly it's a value proposition between the data stakeholders.
>     There is not room in the best practices to spell out
>     quantitatively what enrichment is, what are the units, etc. but
>     making data demonstrably better (more valuable) is indeed what I
>     am driving at.
>
>     Since this is a web best practices document, it's probably fine to
>     stop there. This issue is especially important for the web, however,
>     as a foothold for collaboration. Stakeholders will want to know how
>     valuable a data set is for logistical and resourcing decisions.
>     Which data is ok on tape? Which data is worth cross-indexing? Etc.
>     Cost-share is also important. If data is valuable to multiple
>     stakeholders they may be able to split the resourcing costs.
>
>     -David
>
>     On Thursday, June 2, 2016, Annette Greiner <amgreiner@lbl.gov> wrote:
>
>         Hi David,
>
>         Thanks again for doing this! I just wanted to follow up on
>         your question at the end, about enrichment being demonstrable.
>         Am I right in thinking you mean to suggest that the
>         prioritization of enrichments should be driven by what
>         demonstrably adds value to the dataset, e.g., what is most
>         commonly requested by users?
>
>         -Annette
>
>
>         On 6/2/16 12:30 PM, David Skinner wrote:
>>         Hi Annette,
>>
>>         First, I'm really impressed. There is some great stuff there.
>>         Are you going to NUFO next month?
>>
>>         I didn't read it all (mostly in 4,6,16,20,29+), but...
>>
>>         1) The best practice topics cover a lot of the areas I think
>>         are important. I did not find much missing. Good coverage.
>>
>>         2) Reading more closely in a couple of sections I have more
>>         interest in, I have some suggestions below.
>>
>>         -David
>>
>>
>>         IMO Topic 8.13 is a little too focused on automated methods
>>         for "filling in missing values". I like the summary:
>>
>>         /Enrich your data by generating new data from the raw data
>>         when doing so will enhance its value./
>>         but the text does not really address the "enhancement of
>>         value" part. It also seems weighted toward interpolation of
>>         data values as opposed to "generating new data". One way to
>>         get that across would be to add
>>
>>         /Other examples include visual inspection to identify
>>         features in spatial data and cross-reference to external
>>         databases for demographic information./ [*Lastly, generation
>>         of new data may be demand-driven, where missing values are
>>         calculated or otherwise determined by direct means. Measured
>>         application of these techniques informs the degree and
>>         direction of data enrichment.*]
>>
>>         Do you think it's worth emphasizing that enrichment should be
>>         demonstrable? I see this as a QA issue.
>>
>>
>>
>>
>>         -David
>>
>>         On Fri, May 27, 2016 at 7:02 AM, Annette Greiner
>>         <amgreiner@lbl.gov> wrote:
>>
>>             Hi, folks,
>>
>>             I’ve been heavily involved with the W3C working group for
>>             Data on the Web Best Practices, and we’re at a phase
>>             where it’s important for us to get comments from the
>>             community. These documents should be of interest to
>>             anyone who posts data to the web. We have just published
>>             a last call working draft of our Data on the Web Best
>>             Practices document, the Dataset Usage Vocabulary, and the
>>             Data Quality Vocabulary.
>>
>>             These deliverables are the outcome of two and a half
>>             years of collaborative effort from the Working Group. We
>>             believe the Best Practices document and vocabularies are
>>             complete, and would love to hear your final comments
>>             before they become a W3C Candidate Recommendation (BP
>>             doc) and Working Group Notes (vocabs). We are also eager
>>             to hear how you are implementing, or plan to implement,
>>             the Data on the Web Best Practices.
>>
>>                     • The Data on the Web Best Practices document
>>             offers advice on how data of all kinds – government,
>>             research, commercial – can be shared on the Web, whether
>>             openly or not. The underlying aim is to make data
>>             intelligently available, maximizing the likelihood of its
>>             discovery and reuse. The provision of a variety of
>>             metadata, the use of URIs as identifiers and multiple
>>             access options are key to this.
>>                     • The Dataset Usage Vocabulary offers a framework
>>             in which citations, comments, and uses of data within
>>             applications can be structured. The aim is to benefit
>>             data publishers by enabling assessment of the impact of
>>             their efforts to share data, and to benefit data users by
>>             encouraging the continued availability of data and the
>>             visibility of their own work that uses it.
>>                     • The Data Quality Vocabulary offers a framework
>>             in which the quality of a dataset can be described,
>>             whether by the dataset publisher or by a broader
>>             community of users. It does not provide a formal,
>>             complete definition of quality; rather, it sets out a
>>             consistent means by which information can be provided
>>             such that a potential user of a dataset can make his/her
>>             own judgment about its fitness for purpose.
>>
>>             Please send any comments or examples of how you are using
>>             the Best Practices to public-dwbp-comments@w3.org
>>             until June 12th. All feedback is welcome and will be
>>             responded to.
>>
>>             We look forward to hearing from you!
>>             -Annette, for the W3C Data on the Web Best Practices
>>             Working Group
>>
>>             https://www.w3.org/2013/dwbp/
>>
>
>         -- 
>         Annette Greiner
>         NERSC Data and Analytics Services
>         Lawrence Berkeley National Laboratory
>

    -- 
    Annette Greiner
    NERSC Data and Analytics Services
    Lawrence Berkeley National Laboratory



-- 
-David (from my phone)

Received on Thursday, 16 June 2016 23:56:26 UTC