Re: Machine-Readable Performance

I certainly don't mean to argue that the labs or other contractors of 
government agencies shouldn't be attempting to publish information in 
the most open way possible. The issue for this group is whether we can 
depend on that for assessing credibility. At this point, I think 
the answer is no.

On 2/20/20 3:19 PM, Owen Ambur wrote:
> Annette, OMB Circular A-11 
> <> 
> provides the following definition:
>     Machine Readable Format. Format in a standard computer language
>     (not English text) that can be read automatically by a web browser
>     or computer system. (e.g., xml). Traditional word processing
>     documents, hypertext markup language (HTML) and portable document
>     format (PDF) files are easily read by humans but typically are
>     difficult for machines to interpret. Other formats such as
>     extensible markup language (XML), (JSON), or spreadsheets with
>     header columns that can be exported as comma separated values
>     (CSV) are machine readable formats. It is possible to make
>     traditional word processing documents and other formats machine
>     readable but the documents must include enhanced structural elements.
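The distinction the circular draws can be sketched in a few lines of Python: a spreadsheet exported as CSV with header columns parses directly into structured records and re-serializes as JSON, with no human interpretation needed. The sample data below is hypothetical, purely for illustration.

```python
import csv
import io
import json

# Hypothetical performance data as it might appear in a spreadsheet
# exported to CSV with a header row -- one of the machine-readable
# formats the A-11 definition names.
csv_text = """indicator,target,actual
Uptime (%),99.9,99.95
Reports published,12,10
"""

# Because the header row labels each column, a program can recover
# structured records automatically.
records = list(csv.DictReader(io.StringIO(csv_text)))

# The same data re-serializes as JSON, another of the formats listed.
print(json.dumps(records, indent=2))
```

A PDF rendering of the same table would, by contrast, require layout analysis or manual re-keying before any of this is possible.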
> Unfortunately, Circular A-11 has been published in PDF so I cannot 
> directly reference that provision. However, I have converted part of 
> it to StratML format so I can point directly to this key section 
> <>, 
> in which OMB has finally committed to piloting implementation of 
> section 10 of GPRAMA with a few agencies this fiscal year.
> Yes, technically speaking, GPRAMA only applies to Federal agencies.  
> However, that is a poor and bureaucratic excuse for failing to apply a 
> good practice, particularly when public funding is involved.  The OPEN 
> Government Data Act (OGDA 
> <>) 
> extends that good practice to all government records.  Moreover, the 
> Grant Reporting Efficiency and Agreements Transparency Act 
> <> of 2019 
> requires the establishment and use of data standards for information 
> reported by recipients of federal grants.  Why should the national 
> labs be excluded?  Should they not instead be helping to lead the way?
> A case can also be made that contracts should be written as 
> performance plans. 
> StratML 
> Part 2 <> is an XML vocabulary and schema for 
> performance plans and reports.
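For illustration, here is a minimal sketch of how a performance-plan fragment might be consumed programmatically. The element names below are approximations in the spirit of StratML Part 2, not the published schema itself.

```python
import xml.etree.ElementTree as ET

# A minimal, hypothetical fragment loosely modeled on StratML Part 2;
# consult the actual schema for the real vocabulary.
xml_text = """<StrategicPlan>
  <Goal>
    <Name>Open Performance Reporting</Name>
    <Objective>
      <Name>Publish reports in machine-readable format</Name>
      <PerformanceIndicator>
        <MeasurementDimension>Reports published as XML</MeasurementDimension>
        <TargetResult>12</TargetResult>
        <ActualResult>10</ActualResult>
      </PerformanceIndicator>
    </Objective>
  </Goal>
</StrategicPlan>"""

root = ET.fromstring(xml_text)

# Because the document is schema-like, a program can pull each
# indicator's target and actual values directly, with no scraping.
for pi in root.iter("PerformanceIndicator"):
    target = pi.findtext("TargetResult")
    actual = pi.findtext("ActualResult")
    print(target, actual)
```

The point being made above is exactly this: a contract written as a performance plan in such a format would let anyone compare targets to actuals mechanically.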
> What is credibility if not a measure of the quality of performance?  
> Without such data, isn't anyone's opinion (belief) as good as anyone 
> else's? Shortcuts may lead 
> to predictably bad results.
> See also versus 
> As the saying goes, 
> the road to hell 
> <> 
> ... (and some politicians seem hell-bent on taking us there).
> As unlikely as it may be, it appears Congress is far ahead of "many 
> contractors" and perhaps most laureates in recognizing the importance 
> of open, standard, machine-readable documents (records).
> Owen
> On 2/20/2020 4:55 PM, Annette Greiner wrote:
>> Unfortunately, the GPRAMA is unlikely to lead government contractors 
>> like the national labs to make their performance reports available in 
>> truly machine-readable formats, as (a) it applies to agencies, not 
>> contractors, and (b) it does not specify a definition of 
>> machine-readable. For many people, PDF counts as "searchable and 
>> machine readable", and indeed many of the contractors already meet 
>> that bar. From what I can see, the GPRAMA really doesn't do more than 
>> require that information about the planning of the government itself 
>> be made available to humans via computers. It's a good step, but to 
>> my mind at least, it doesn't exemplify particularly tech-savvy 
>> legislation. I don't see it as a means to glean reliable credibility 
>> signals.
>> -Annette
>> On 2/19/20 4:41 PM, Owen Ambur wrote:
>>> Point well taken, Annette.  Beyond peer recognition however, it 
>>> would be good to make salient the underlying performance indicators 
>>> specifying what excellence truly means.
>>> In the case of U.S. federal agencies, section 10 
>>> <> 
>>> of the GPRA Modernization Act (GPRAMA) requires them to publish 
>>> their performance reports in machine-readable format.  It would be 
>>> good if some of the laureates associated with DOE and LBNL could 
>>> help lead the way.
>>> In the meantime, on their behalf, I have published their strategic 
>>> plans in open, standard, machine-readable StratML format at 
>>> Perhaps someday news organizations will be held accountable not only 
>>> for doing likewise but also for paying greater deference to reliable 
>>> data than to storytelling based so heavily on personal 
>>> perspectives.  If not, more of what we already see is what we are 
>>> likely to get, both literally as well as figuratively.
>>> BTW, here's OKF's data journalism guide in StratML format: 
>>> Unfortunately, it 
>>> says nothing about the Foundations for Evidence-Based Policymaking 
>>> Act (FEBPA <>), 
>>> including Title II, the OPEN Government Data Act (OGDA 
>>> <>).
>>> It is ironic that Congress, which is held in such low regard, seems 
>>> to be so far ahead of the news media, the "knowledge" community, and 
>>> the W3C in recognizing the importance of open, standard, 
>>> schema-compliant, machine-readable public records. 
>>> Owen
>>> On 2/19/2020 6:59 PM, Annette Greiner wrote:
>>>> One of the things that the awards idea makes me think about is 
>>>> evaluating not just a site but the organization that publishes it. 
>>>> Scientific organizations don't get journalism awards, but their 
>>>> researchers may well get prestigious scientific awards, like Nobel 
>>>> Prizes and Fields Medals. I work at a lab that's pretty conspicuous 
>>>> for its Nobels, so I don't want to emphasize that more than it 
>>>> deserves, but in general I want to make sure this list doesn't end 
>>>> up only making sense for journalistic sites.
>>>> -Annette
>>>> On 2/19/20 9:21 AM, Sandro Hawke wrote:
>>>>> On 2/19/20 11:48 AM, Sastry, Nishanth wrote:
>>>>>> Hello Sandro, all,
>>>>>> This is just a quick email to introduce myself as a new member 
>>>>>> of the group, from King’s College London. I applied to the 
>>>>>> Credible Web CG several months back, but was only approved by our 
>>>>>> university contact a few days ago, and have since been added to 
>>>>>> this email list.
>>>>>> We have done a bunch of work looking at
>>>>>>  1. hyper partisan websites, in the context of the US
>>>>>>     Presidential elections:
>>>>>>   *
>>>>>>       o This provided inputs for a major exposé by BuzzFeed News:
>>>>>>   *
>>>>>>       o Showing that right leaning sites track more intensely
>>>>>>         than left leaning sites (Covered by WIRED:
>>>>>>  2. bias in news and social media during political crises
>>>>>>   *
>>>>>>  3. And finally, on transferring trust across domains (which is
>>>>>>     very aligned with what I see in the signals draft. We also
>>>>>>     use age as an “ungameable” signal to transfer trust across
>>>>>>     domains. We do this for IDs of individuals rather than
>>>>>>     domains, but the paper develops ways to calibrate trust,
>>>>>>     answering questions such as – is a 10 year-old Facebook ID
>>>>>>     more trustworthy than a 15 year old Gmail ID, for example):
>>>>>>   *
>>>>> Very nice.  I'd love to get into signals about individuals, but 
>>>>> it looked like websites would be a little simpler, and we wanted 
>>>>> to start in the simplest possible place.  Hopefully we can get 
>>>>> into such things fairly soon.
>>>>>>  *
>>>>>> I will join the Zoom at 7pm GMT, and can add any further details 
>>>>>> that may be interesting to the group. Looking forward.
>>>>> Great, looking forward to meeting you.  This meeting will be 
>>>>> mostly about wrapping up this little sprint, but then hopefully we 
>>>>> can expand a bit for the next phase.
>>>>>      -- Sandro
>>>>>> Best wishes
>>>>>> nishanth
>>>>>> *From: *Sandro Hawke <>
>>>>>> *Date: *Wednesday, 19 February 2020 at 15:51
>>>>>> *To: *Credible Web CG <>
>>>>>> *Subject: *journalism award signals
>>>>>> *Resent from: *<>
>>>>>> *Resent date: *Wednesday, 19 February 2020 at 15:51
>>>>>> I did a bit more work on the Journalism Awards, framing it as a 
>>>>>> general signal and one more specific signal.
>>>>>> I put them into the "reviewed signals" draft, marked as "pending".
>>>>>> Here's a dated version of that draft: 
>>>>>> <> 
>>>>>> (The undated version presumably won't show them as pending after 
>>>>>> today, which could confuse someone reading this later.)
>>>>>> Meeting in about 3 hours, as usual. Agenda 
>>>>>> <>.
>>>>>>        -- Sandro
>>>> -- 
>>>> Annette Greiner (she)
>>>> NERSC Data and Analytics Services
>>>> Lawrence Berkeley National Laboratory
>> -- 
>> Annette Greiner (she)
>> NERSC Data and Analytics Services
>> Lawrence Berkeley National Laboratory
Annette Greiner (she)
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory

Received on Friday, 21 February 2020 01:24:45 UTC