Re: Trust.txt: Why another random .txt when we've got WebFinger and well-known URIs? from Bob Wyman on 2021-08-02 (public-credibility@w3.org from August 2021)

From: Bob Wyman <bob@wyman.us>
Date: Sun, 1 Aug 2021 20:54:01 -0400
To: Scott Yates <scott@journallist.net>
Cc: Credible Web CG <public-credibility@w3.org>
Message-ID: <CAA1s49V_cto8+rYdw06Umiogd1_cLwhCZrHuMF4_PUt76acG+g@mail.gmail.com>

You wrote:

> "If you think trust.txt should be run differently, that's fine, but it's
> not at all related to the CredWeb group."

I believe that CredWeb can and should learn from both the good and bad
design decisions that have been made by all others who are attempting or
have attempted to address the credibility problem. While I may have framed
my comments as criticisms or observations concerning a specific effort,
trust.txt, my intent is not actually to speak to the developers or
advocates of trust.txt itself, but rather to discuss these things in the
context of what CredWeb seeks to do.

A question I've raised before on this list is "Who should publish signals?"
Well, the trust.txt spec gives one answer to that in that it has found it
useful for publishers to make claims about their own services. Also,
trust.txt provides at least one answer to the question: "Where should the
signals be published?" Trust.txt says: "In a web-accessible file, having a
specific format, in a well-known location." These are useful answers to
important questions that I believe should be considered further. Trust.txt
has also answered the question: "What signals are useful?" and it has
produced answers that are very different from those in the CredWeb
documents. CredWeb should consider those additional signals and also wonder
why trust.txt didn't independently identify the same signals that CredWeb
did. (What was different in the approach or the problem definition
that motivated the differences in the identified signals?) Nonetheless,
trust.txt's specification includes some details that I think are
non-optimal. There is almost always good with bad, but we can, and should,
learn from both.
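
As an illustration only (not taken from the trust.txt spec or the CredWeb
documents), here is a minimal Python sketch of the "well-known location"
and "specific format" answers, and of how the same self-asserted signals
might be re-expressed as links in a JSON Resource Descriptor (RFC 7033).
The "name=value" line syntax, the "#" comment rule, the REL_PREFIX URI and
all function names are assumptions made for the example; no such link
relations are registered.

```python
import json

# Hypothetical relation-type prefix; these relations are NOT registered anywhere.
REL_PREFIX = "https://example.org/trust/rel/"


def well_known_url(origin: str, name: str = "trust.txt") -> str:
    """Build the "/.well-known/" URL for a site origin (the RFC 5785 path prefix)."""
    return origin.rstrip("/") + "/.well-known/" + name


def parse_signals(text: str) -> list[tuple[str, str]]:
    """Parse 'name=value' lines; blank lines and '#' comment lines are skipped.

    The line syntax is an assumption for this sketch, not the official format.
    """
    pairs = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # no '=' on the line, so it is not a signal
        pairs.append((name.strip().lower(), value.strip()))
    return pairs


def to_jrd(subject: str, pairs: list[tuple[str, str]]) -> dict:
    """Express each signal as a JRD link object with 'rel' and 'href' members."""
    return {
        "subject": subject,
        "links": [{"rel": REL_PREFIX + name, "href": value} for name, value in pairs],
    }


sample = """# self-asserted signals
member=https://association.example/
control=https://social.example/@paper
"""

if __name__ == "__main__":
    print(well_known_url("https://publisher.example/"))
    print(json.dumps(to_jrd("https://publisher.example/", parse_signals(sample)), indent=2))
```

The point of the sketch is only that the meaning of a signal and its
encoding can be separated from the file that happens to carry it.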

I believe it appropriate for this group to analyze a wide variety of
implementations, existing standards, proposals, etc.

bob wyman


On Sun, Aug 1, 2021 at 8:08 PM Scott Yates <scott@journallist.net> wrote:

> Bob,
>
> All due respect, but I think your comments are best addressed to me as the
> founder of JournalList, and not to the entire CredWeb group.
>
> CredWeb has no authority over the trust.txt spec. Similarly, I am not
> running in hopes that trust.txt will take over the CredWeb.
>
> I am running for the chair of the CredWeb in the same way that I might run
> for the school board or a local museum board. I hope to do good work in a
> field that I am interested in, and I have some capacity to contribute. I
> also have a vision of what I think could be a very useful tool for the
> entire spectrum of those fighting disinformation.
>
> If you think trust.txt should be run differently, that's fine, but it's
> not at all related to the CredWeb group.
>
> -Scott Yates
> Founder
> JournalList.net, caretaker of the trust.txt framework
> 202-742-6842
> Short Video Explanation of trust.txt <https://youtu.be/lunOBapQxpU>
>
>
> On Sun, Aug 1, 2021 at 5:20 PM Bob Wyman <bob@wyman.us> wrote:
>
>> You wrote:
>>
>>> "On page 8 of the spec., we encourage the use of "/well-known" so we are
>>> clearly not against that." (link added)
>>
>> The trust.txt spec
>> <https://journallist.net/reference-document-for-trust-txt-specifications>
>> says, on page 8:
>>
>>> "In addition to the access method noted above, use of the “Well Known
>>> Uniform Resource Identifiers” is recommended."
>>
>> So, the spec says that providing the file with a "/.well-known/" prefix
>> is optional and should only be done if the file has also been provided
>> without a prefix. As a result, there is absolutely no utility in having a
>> copy of the file prefixed by "/.well-known/." Any smart coder would simply
>> ignore that there might be a second copy of the file. In fact, one might
>> argue that if a "well-known" file is found, but an unprefixed one is not
>> found, the prefixed copy should be ignored since it may be that the site's
>> intent was to delete the file, and they simply forgot to delete its copy.
>> In any case, it is generally not a good idea, when defining protocols, to
>> require or even recommend that data be provided in more than one place. The
>> typical statement is something like: "If data is found in more than one
>> place, it is probably wrong in all of them..."
>>
>> It would be very useful if the spec could be updated to *require* that
>> only one copy of the file be provided and that it be provided
>> with the "/.well-known/" prefix.
>>
>> Also, you wrote:
>>
>>> "The short answer to why we went with a text file is that we are working
>>> with some extremely unsophisticated publishers."
>>
>> I sympathize with your concern for the unsophisticated publisher.
>> However, any difficulty that might exist in the production of a more
>> complex file would be easily overcome by providing a trivial web form that
>> allowed "fill-in-the-blank" simplicity. The produced file could then be
>> simply copied to the appropriate location. After all, we've moved beyond
>> the time when everyone was expected to be able to edit files manually.
>> Publishers deal daily with xml, html, css, js, pdf, etc. files that only a
>> masochist would seek to edit by hand. Anyone with enough capacity to
>> maintain the Hays Free Press <https://haysfreepress.com/> site is savvy
>> enough to either produce a WebFinger file on their own or to copy
>> the output from a simple web form.
>>
>> Allowing protocols to be limited to the low bar of "unsophisticated"
>> users means that we're not able to provide "sophisticated" solutions when
>> they are needed. Over decades of experience with protocol and data format
>> design, we've learned that simple approaches inevitably lose their charm
>> after they have been in the field for some time. Users inevitably discover
>> new capabilities that they want to support. Requirements that were once
>> quite simple and well understood tend to become more complex and subtle as
>> time passes. Rather than waiting to discover the inadequacies of simple
>> formats, it makes a great deal of sense to initially rely on well-known
>> standard formats that allow extension, versioning, etc. Most "protocol
>> definers" should be focused on how to extend or exploit existing formats
>> while leaving the job of format definition to others who specialize in such
>> problems.
>>
>> For instance, the W3C Credible Web Community Group
>> <https://www.w3.org/community/credibility/> has defined a number of
>> signals that, I assume, a site might wish to self-assert in a discoverable,
>> well-known location. However, none of these signals are supported by the
>> trust.txt format. It seems to me that these signals could be usefully
>> included in a WebFinger file. These signals include:
>>
>>    - Date Website First Archived
>>    - Corrections Policy
>>    - Any Award
>>    - Pulitzer Prize Recognition
>>    - RNG Awards
>>
>> The choice here is: Should we call for trust.txt to be updated to include
>> these signals, and any others that might be defined in the future, or,
>> should we simply provide definitions of the JSON Resource Descriptors
>> (JRDs) or other encodings and thus, by implication, enable those signals to
>> be supported in any format that supports those encodings? I suggest that
>> the more useful approach is to define what the signals mean and how they
>> should be encoded and then rely on others to find the various places where
>> those encodings would be most useful. If this approach had been used in
>> defining trust.txt, then all the various signals supported there, which are
>> not defined by CredWeb, would be easily used by anyone who is also using
>> CredWeb signals. (Being able to say: "I control the website xxx.xxx." is
>> useful in more contexts than just that defined by trust.txt.)
>>
>> bob wyman
>>
>>
>> On Sun, Aug 1, 2021 at 2:58 PM Scott Yates <scott@journallist.net> wrote:
>>
>>> Bob, and the group...
>>>
>>> Just to be clear, I am not running on a platform of trust.txt.
>>>
>>> On page 8 of the spec., we encourage the use of "/well-known" so we are
>>> clearly not against that.
>>>
>>> The short answer to why we went with a text file is that we are working
>>> with some extremely unsophisticated publishers. Take, for instance, the
>>> publisher of the Hays Free Press, whom I met recently in Texas. She prints
>>> news from her town on paper once a week, and maintains a website. As we all
>>> know, when local news dies, news consumers fill in that vacuum with crap.
>>> If she stops publishing, well, it would be bad, so we want to make things
>>> as easy as possible for her and those like her doing the estimable work of
>>> keeping local journalism alive.
>>>
>>> In my conversation with her, she was willing to post a file
>>> <https://haysfreepress.com/trust.txt> in part because she already knew
>>> about ads.txt, and so this was familiar to her. If I tried to start telling
>>> her about RFC 7033, I would have lost her for sure. You are certainly right
>>> that JRDs would be technically superior, but robots.txt has been around for
>>> 20+ years and the most entry-level web publisher knows about how it works.
>>>
>>>
>>> Thank you for looking into trust.txt, and while I don't want people to
>>> vote for me based on what they think of trust.txt, I think your question
>>> serves as a useful model of why I am running. If you think that any new
>>> proposal that is working to fix disinformation should follow, for example,
>>> the most current standardized systems, you should voice that to the group.
>>> If the group agrees, then that will be a part of how trust.txt -- and every
>>> other effort out there -- will be evaluated.
>>>
>>> -Scott Yates
>>> Founder
>>> JournalList.net, caretaker of the trust.txt framework
>>> 202-742-6842
>>> Short Video Explanation of trust.txt <https://youtu.be/lunOBapQxpU>
>>>
>>>
>>> On Sun, Aug 1, 2021 at 11:53 AM Bob Wyman <bob@wyman.us> wrote:
>>>
>>>> Scott Yates, in his statement of candidacy
>>>> <https://lists.w3.org/Archives/Public/public-credibility/2021Aug/0000.html>,
>>>> includes a description of the trust.txt file
>>>> <https://journallist.net/reference-document-for-trust-txt-specifications>.
>>>>
>>>> Please explain why it makes sense to introduce yet-another .txt file
>>>> (in addition to robots.txt and ads.txt) when we have established procedures
>>>> to allow those who control URIs to make statements supported by that
>>>> control. For instance, RFC 5785
>>>> <https://datatracker.ietf.org/doc/html/rfc5785> defines the
>>>> "/.well-known/" path prefix for "well-known locations" which are accessed
>>>> via URIs. It seems to me that if one were to publish a trust.txt file, then
>>>> it should be at the location "/.well-known/trust.txt" That does not seem to
>>>> be the current proposal. Why are existing standards not being followed?
>>>>
>>>> It also seems to me that the proposed file format is an unnecessary
>>>> departure from existing standards such as RFC 7033
>>>> <https://datatracker.ietf.org/doc/html/rfc7033>, which defined
>>>> WebFinger, a mechanism that could be easily used to carry the data which
>>>> the proponents of trust.txt seek to make available. To make WebFinger do
>>>> what trust.txt intends, it would be only necessary to register a few new
>>>> JSON Resource Descriptors (JRDs), properties, or link-relations (e.g.
>>>> belong-to, control, social, member, etc.). This sort of extension is
>>>> provided for in the definition of RFC 7033 and in RFC 5988
>>>> <https://datatracker.ietf.org/doc/html/rfc5988>, which defines "Web
>>>> Linking" mechanisms. Note: The existing set of defined link-relations can
>>>> be found in the IANA maintained link-relations registry
>>>> <https://www.iana.org/assignments/link-relations/link-relations.xhtml>.
>>>>
>>>> While there will be a never-ending need to add support for new kinds of
>>>> standardized statements, discoverable in well-known locations, I think we
>>>> should be careful to ensure that new kinds of statements make use of
>>>> existing standards rather than define entirely new mechanisms. I can't see
>>>> anything in the trust.txt specification that actually requires a unique,
>>>> non-standard approach that is not already supported by the various
>>>> standards referenced above.
>>>>
>>>> bob wyman
>>>>
>>>>
Received on Monday, 2 August 2021 00:54:27 UTC