Re: Machine-Readable Performance from Leonard Rosenthol on 2020-02-21 (public-credibility@w3.org from February 2020)

From: Leonard Rosenthol <lrosenth@adobe.com>
Date: Fri, 21 Feb 2020 00:42:21 +0000
To: Owen Ambur <Owen.Ambur@verizon.net>, "public-credibility@w3.org" <public-credibility@w3.org>
Message-ID: <5EC0CD0C-FF84-4696-8C29-76D42ED274A9@adobe.com>
That's an unfortunate and limited definition of machine readable format.

From: Owen Ambur <Owen.Ambur@verizon.net>
Date: Thursday, February 20, 2020 at 6:20 PM
To: "public-credibility@w3.org" <public-credibility@w3.org>
Subject: Machine-Readable Performance
Resent-From: <public-credibility@w3.org>
Resent-Date: Thursday, February 20, 2020 at 6:20 PM


Annette, OMB Circular A-11<https://www.whitehouse.gov/wp-content/uploads/2018/06/a11.pdf> provides the following definition:

Machine Readable Format. Format in a standard computer language (not English text) that can be read automatically by a web browser or computer system. (e.g., xml). Traditional word processing documents, hypertext markup language (HTML) and portable document format (PDF) files are easily read by humans but typically are difficult for machines to interpret. Other formats such as extensible markup language (XML), (JSON), or spreadsheets with header columns that can be exported as comma separated values (CSV) are machine readable formats. It is possible to make traditional word processing documents and other formats machine readable but the documents must include enhanced structural elements.
Unfortunately, Circular A-11 has been published in PDF so I cannot directly reference that provision.  However, I have converted part of it to StratML format so I can point directly to this key section<https://stratml.us/carmel/iso/A11-240wStyle.xml#_aada4d70-a90a-11e9-a082-d69e4d1a686a>, in which OMB has finally committed to piloting implementation of section 10 of GPRAMA with a few agencies this fiscal year.

Yes, technically speaking, GPRAMA only applies to Federal agencies.  However, that is a poor and bureaucratic excuse for failing to apply a good practice, particularly when public funding is involved.  The OPEN Government Data Act (OGDA<https://www.linkedin.com/pulse/open-gov-data-act-machine-readable-records-owen-ambur/>) extends that good practice to all government records.  Moreover, the Grant Reporting Efficiency and Agreements Transparency Act<https://www.congress.gov/bill/116th-congress/house-bill/150> of 2019 requires the establishment and use of data standards for information reported by recipients of federal grants.  Why should the national labs be excluded?  Should they not instead be helping to lead the way?

A case can also be made that contracts should be written as performance plans.  https://en.wikipedia.org/wiki/Performance-based_contracting  StratML Part 2<https://stratml.us/#Part2> is an XML vocabulary and schema for performance plans and reports.

What is credibility if not a measure of the quality of performance?  Without such data, isn't anyone's opinion (belief) as good as anyone else's?  https://en.wikipedia.org/wiki/Credibility  Shortcuts may lead to predictably bad results.

See also https://en.wikipedia.org/wiki/Consequentialism versus https://en.wikipedia.org/wiki/Deontological_ethics  As the saying goes, the road to hell<https://en.wikipedia.org/wiki/The_road_to_hell_is_paved_with_good_intentions> ... (and some politicians seem hell-bent on taking us there).

As unlikely as it may be, it appears Congress is far ahead of "many contractors" and perhaps most laureates in recognizing the importance of open, standard, machine-readable documents (records).

Owen

On 2/20/2020 4:55 PM, Annette Greiner wrote:

Unfortunately, the GPRAMA is unlikely to lead government contractors like the national labs to make their performance reports available in truly machine-readable formats, as (a) it applies to agencies, not contractors, and (b) it does not specify a definition of machine-readable. For many people, PDF counts as "searchable and machine readable", and indeed many of the contractors already meet that bar. From what I can see, the GPRAMA really doesn't do more than require that information about the planning of the government itself be made available to humans via computers. It's a good step, but to my mind at least, it doesn't exemplify particularly tech-savvy legislation. I don't see it as a means to glean reliable credibility signals.

-Annette
On 2/19/20 4:41 PM, Owen Ambur wrote:

Point well taken, Annette.  Beyond peer recognition, however, it would be good to make salient the underlying performance indicators specifying what excellence truly means.

In the case of U.S. federal agencies, section 10<https://www.linkedin.com/pulse/open-machine-readable-government-owen-ambur/> of the GPRA Modernization Act (GPRAMA) requires them to publish their performance reports in machine-readable format.  It would be good if some of the laureates associated with DOE and LBNL could help lead the way.

In the meantime, on their behalf, I have published their strategic plans in open, standard, machine-readable StratML format at https://stratml.us/drybridge/index.htm#DOE

Perhaps someday news organizations will be held accountable not only for doing likewise but also for paying greater deference to reliable data than to storytelling based so heavily on personal perspectives.  If not, more of what we already see is what we are likely to get, both literally and figuratively.

BTW, here's OKF's data journalism guide in StratML format:  https://stratml.us/carmel/iso/DJH5MFGwStyle.xml  Unfortunately, it says nothing about the Foundations for Evidence-Based Policymaking Act (FEBPA<https://stratml.us/drybridge/index.htm#FEBPA>), including Title II, the OPEN Government Data Act (OGDA<https://www.linkedin.com/pulse/open-gov-data-act-machine-readable-records-owen-ambur/>).

It is ironic that Congress, which is held in such low regard, seems to be so far ahead of the news media, the "knowledge" community, and the W3C in recognizing the importance of open, standard, schema-compliant, machine-readable public records.  https://en.wikipedia.org/wiki/Machine-readable_document

Owen
On 2/19/2020 6:59 PM, Annette Greiner wrote:

One of the things that the awards idea makes me think about is evaluating not just a site but the organization that publishes it. Scientific organizations don't get journalism awards, but their researchers may well get prestigious scientific awards, like Nobel Prizes and Fields Medals. I work at a lab that's pretty conspicuous for its Nobels, so I don't want to emphasize that more than it deserves, but in general I want to make sure this list doesn't end up only making sense for journalistic sites.

-Annette
On 2/19/20 9:21 AM, Sandro Hawke wrote:

On 2/19/20 11:48 AM, Sastry, Nishanth wrote:
Hello Sandro, all,

This is just a quick email to introduce myself as a new member of the group, from King’s College London. I had applied to the credible web WG several months back, but was approved by our University contact only a few days ago, and have since been added to this email list.

We have done a bunch of work looking at

  1.  hyperpartisan websites, in the context of the US Presidential elections:

  *   https://nms.kcl.ac.uk/nishanth.sastry/publication/nrswww-2018-b/

     *   This provided inputs for a major expose by Buzzfeed News: https://www.buzzfeednews.com/article/craigsilverman/inside-the-partisan-fight-for-your-news-feed

  *   https://nms.kcl.ac.uk/nishanth.sastry/publication/nrswww-2020/

     *   Showing that right-leaning sites track more intensely than left-leaning sites (covered by WIRED: https://www.wired.com/story/right-left-news-site-ad-tracking/)

  2.  bias in news and social media during political crises

  *   https://nms.kcl.ac.uk/nishanth.sastry/publication/karamshuk-16-slant/

  3.  And finally, on transferring trust across domains (which is very aligned with what I see in the signals draft. We also use age as an “ungameable” signal to transfer trust across domains. We do this for IDs of individuals rather than domains, but the paper develops ways to calibrate trust, answering questions such as: is a 10-year-old Facebook ID more trustworthy than a 15-year-old Gmail ID, for example):

  *   https://nms.kcl.ac.uk/nishanth.sastry/publication/nr-swww-16/

Very nice.  I'd love to get into signals about individuals, but it looked like websites would be a little simpler, and we wanted to start in the simplest possible place.  Hopefully we can get into such things fairly soon.




I will join the Zoom at 7pm GMT, and can add any further details that may be interesting to the group. Looking forward.

Great, looking forward to meeting you.  This meeting will be mostly about wrapping up this little sprint, but then hopefully we can expand a bit for the next phase.

     -- Sandro


Best wishes
nishanth


From: Sandro Hawke <sandro@w3.org>
Date: Wednesday, 19 February 2020 at 15:51
To: Credible Web CG <public-credibility@w3.org>
Subject: journalism award signals
Resent from: <public-credibility@w3.org>
Resent date: Wednesday, 19 February 2020 at 15:51

I did a bit more work on the Journalism Awards, framing it as a general signal and one more specific signal.

I put them into the "reviewed signals" draft, marked as "pending".

Here's a dated version of that draft: https://credweb.org/reviewed-signals-20200219/  (The undated version presumably won't show them as pending after today, which could confuse someone reading this later.)

Meeting in about 3 hours, as usual.   Agenda<https://docs.google.com/document/d/1-KcB121I6D6J2ZdQET-qatqCaqv3ttlZkfhgyWEk7nM/edit>.

       -- Sandro





--

Annette Greiner (she)

NERSC Data and Analytics Services

Lawrence Berkeley National Laboratory



Received on Friday, 21 February 2020 00:42:39 UTC