- From: Annette Greiner <amgreiner@lbl.gov>
- Date: Thu, 16 Jun 2016 16:56:03 -0700
- To: DWBP Public List <public-dwbp-wg@w3.org>
- Message-ID: <dc9c5b49-9133-50f2-cb8e-fb241c8c5da8@lbl.gov>
Forwarding for documentation of commenter approval.
-Annette
-------- Forwarded Message --------
Subject: Re: [coders] Last call working drafts for data on the web best practices
Date: Thu, 16 Jun 2016 16:10:29 -0700
From: David Skinner <deskinner@lbl.gov>
To: Annette Greiner <amgreiner@lbl.gov>
I like it! Thanks.
On Thursday, June 16, 2016, Annette Greiner <amgreiner@lbl.gov> wrote:
Hi David,
I took a stab at reworking some of the enrichment BP. Take a look at
the diffs here and let me know if they address your concerns.
https://github.com/agreiner/dwbp/commit/ce1b1a8c03cd1b6017f029ad77f41c86f8f9c86e
(above line 3898)
https://github.com/agreiner/dwbp/commit/540ed3b236068858936d9a03d7c8218945f609d7
-Annette
P.S., I'm hoping to issue a pull request today.
On 6/3/16 1:37 PM, David Skinner wrote:
> Hi Annette,
>
> Most requested is one metric that people easily get, but more
> broadly it's a value proposition between the data stakeholders.
> There is not room in the best practices to spell out
> quantitatively what enrichment is, what are the units, etc. but
> making data demonstrably better (more valuable) is indeed what I
> am driving at.
>
> Since this is a web best practices document it's probably fine to
> stop there. This issue is especially important for the web, however,
> as a foothold for collaboration. Stakeholders will want to know how
> valuable a data set is for logistical and resourcing decisions.
> Which data is ok on tape? Which data is worth cross-indexing? Etc.
> Cost-share is also important. If data is valuable to multiple
> stakeholders they may be able to split the resourcing costs.
>
> -David
>
> On Thursday, June 2, 2016, Annette Greiner <amgreiner@lbl.gov> wrote:
>
> Hi David,
>
> Thanks again for doing this! I just wanted to follow up on
> your question at the end, about enrichment being demonstrable.
> Am I right in thinking you mean to suggest that the
> prioritization of enrichments should be driven by what
> demonstrably adds value to the dataset, e.g., what is most
> commonly requested by users?
>
> -Annette
>
>
> On 6/2/16 12:30 PM, David Skinner wrote:
>> Hi Annette,
>>
>> First, I'm really impressed. There is some great stuff there.
>> Are you going to NUFO next month?
>>
>> I didn't read it all (mostly in 4,6,16,20,29+), but...
>>
>> 1) The best practice topics cover a lot of the areas I think
>> are important. I did not find much missing. Good coverage.
>>
>> 2) Reading more closely in a couple of sections that interest me
>> more, I have some suggestions below.
>>
>> -David
>>
>>
>> IMO Topic 8.13 is a little too focused on automated methods
>> for "filling in missing values". I like the summary:
>>
>> /Enrich your data by generating new data from the raw data
>> when doing so will enhance its value.
>> /
>> but the text does not really address the "enhancement of
>> value" part. It also seems weighted toward interpolation of
>> data values as opposed to "generating new data". One way to
>> get that across would be to add
>>
>> /Other examples include visual inspection to identify
>> features in spatial data and cross-reference to external
>> databases for demographic information. /[ *Lastly, generation
>> of new data may be demand-driven, where missing values are
>> calculated or otherwise determined by direct means. Measured
>> application of these techniques informs the degree and
>> direction of data enrichment*]
>>
>> Do you think it's worth emphasizing that enrichment should be
>> demonstrable? I see this as a QA issue.
>>
>>
>>
>>
>> -David
>>
>> On Fri, May 27, 2016 at 7:02 AM, Annette Greiner
>> <amgreiner@lbl.gov> wrote:
>>
>> Hi, folks,
>>
>> I’ve been heavily involved with the W3C working group for
>> Data on the Web Best Practices, and we’re at a phase
>> where it’s important for us to get comments from the
>> community. These documents should be of interest to
>> anyone who posts data to the web. We have just published
>> a last call working draft of our Data on the Web Best
>> Practices document, the Dataset Usage Vocabulary, and the
>> Data Quality Vocabulary.
>>
>> These deliverables are the outcome of two and a half
>> years of collaborative effort from the Working Group. We
>> believe the Best Practices document and vocabularies are
>> complete, and would love to hear your final comments
>> before they become a W3C Candidate Recommendation (BP
>> doc) and Working Group Notes (vocabs). We are also eager
>> to hear how you are implementing, or plan to implement,
>> the Data on the Web Best Practices.
>>
>> • The Data on the Web Best Practices document
>> offers advice on how data of all kinds – government,
>> research, commercial – can be shared on the Web, whether
>> openly or not. The underlying aim is to make data
>> intelligently available, maximizing the likelihood of its
>> discovery and reuse. The provision of a variety of
>> metadata, the use of URIs as identifiers and multiple
>> access options are key to this.
>> • The Dataset Usage Vocabulary offers a framework
>> in which citations, comments, and uses of data within
>> applications can be structured. The aim is to benefit
>> data publishers by enabling assessment of the impact of
>> their efforts to share data, and to benefit data users by
>> encouraging the continued availability of data and the
>> visibility of their own work that uses it.
>> • The Data Quality Vocabulary offers a framework
>> in which the quality of a dataset can be described,
>> whether by the dataset publisher or by a broader
>> community of users. It does not provide a formal,
>> complete definition of quality; rather, it sets out a
>> consistent means by which information can be provided
>> such that a potential user of a dataset can make his/her
>> own judgment about its fitness for purpose.
>>
>> Please send any comments or examples of how you are using
>> the Best Practices to public-dwbp-comments@w3.org
>> until June 12th. All feedback is welcome and will be
>> responded to.
>>
>> We look forward to hearing from you!
>> -Annette, for the W3C Data on the Web Best Practices
>> Working Group
>>
>> https://www.w3.org/2013/dwbp/
>>
>> --
>> --
>> You received this message because you are subscribed to
>> the Berkeley Lab Coders Group.
>> To post to this group, send email to coders@lbl.gov
>> To unsubscribe from this group, send email to
>> coders+unsubscribe@lbl.gov
>> For more options, visit this group at
>> http://groups.google.com/a/lbl.gov/group/coders?hl=en
>>
>>
>>
>
> --
> Annette Greiner
> NERSC Data and Analytics Services
> Lawrence Berkeley National Laboratory
>
--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
--
-David (from my phone)
Received on Thursday, 16 June 2016 23:56:26 UTC