Re: Trust.txt: Why another random .txt when we've got WebFinger and well-known URIs?

You wrote:

I believe it appropriate for this group to analyze a wide variety of
implementations, existing standards, proposals, etc.


Yes!!!

That is my platform, and that is what I think we should do for trust.txt, *and
for dozens of other initiatives.*

I'm just saying that we should do that in a systematic way, and not in an
email thread prompted by a new candidate for the chair. Let's do it in a
way that the people behind each initiative and the broader community can
all learn from the analysis.


-Scott Yates
Founder
JournalList.net, caretaker of the trust.txt framework
202-742-6842
Short Video Explanation of trust.txt <https://youtu.be/lunOBapQxpU>


On Sun, Aug 1, 2021 at 6:54 PM Bob Wyman <bob@wyman.us> wrote:

> You wrote:
>
>> "If you think trust.txt should be run differently, that's fine, but it's
>> not at all related to the CredWeb group."
>
> I believe that CredWeb can and should learn from both the good and bad
> design decisions that have been made by all others who are attempting or
> have attempted to address the credibility problem. While I may have framed
> my comments as criticisms or observations concerning a specific effort,
> trust.txt, my intent is not actually to speak to the developers or
> advocates of trust.txt itself, but rather to discuss these things in the
> context of what CredWeb seeks to do.
>
> A question I've raised before on this list is "Who should publish
> signals?" Well, the trust.txt spec gives one answer to that in that it has
> found it useful for publishers to make claims about their own services.
> Also, trust.txt provides at least one answer to the question: "Where should
> the signals be published?" Trust.txt says: "In a web-accessible file,
> having a specific format, in a well-known location." These are
> useful answers to important questions that I believe should be considered
> further. Trust.txt has also answered the question: "What signals are
> useful?" and it has produced answers that are very different from those in
> the CredWeb documents. CredWeb should consider those additional signals and
> also wonder why trust.txt didn't independently identify the same signals
> that CredWeb did. (What was different in the approach of problem definition
> that motivated the differences in the identified signals?) Nonetheless,
> trust.txt's specification includes some details that I think are
> non-optimal. There is almost always good with bad, but we can, and should,
> learn from both.
>
> I believe it appropriate for this group to analyze a wide variety of
> implementations, existing standards, proposals, etc.
>
> bob wyman
>
>
> On Sun, Aug 1, 2021 at 8:08 PM Scott Yates <scott@journallist.net> wrote:
>
>> Bob,
>>
>> All due respect, but I think your comments are best addressed to me as
>> the founder of JournalList, and not to the entire CredWeb group.
>>
>> CredWeb has no authority over the trust.txt spec. Similarly, I am not
>> running in hopes that trust.txt will take over the CredWeb.
>>
>> I am running for the chair of the CredWeb in the same way that I might
>> run for the school board or a local museum board. I hope to do good work in
>> a field that I am interested in, and I have some capacity to contribute. I
>> also have a vision of what I think could be a very useful tool for the
>> entire spectrum of those fighting disinformation.
>>
>> If you think trust.txt should be run differently, that's fine, but it's
>> not at all related to the CredWeb group.
>>
>> -Scott Yates
>> Founder
>> JournalList.net, caretaker of the trust.txt framework
>> 202-742-6842
>> Short Video Explanation of trust.txt <https://youtu.be/lunOBapQxpU>
>>
>>
>> On Sun, Aug 1, 2021 at 5:20 PM Bob Wyman <bob@wyman.us> wrote:
>>
>>> You wrote:
>>>
>>>> "On page 8 of the spec., we encourage the use of "/well-known" so we
>>>> are clearly not against that." (link added)
>>>
>>> The trust.txt spec
>>> <https://journallist.net/reference-document-for-trust-txt-specifications>
>>> says, on page 8:
>>>
>>>> "In addition to the access method noted above, use of the “Well Known
>>>> Uniform Resource Identifiers” is recommended."
>>>
>>> So, the spec says that providing the file with a "/.well-known/" prefix
>>> is optional and should only be done if the file has also been provided
>>> without a prefix. As a result, there is absolutely no utility in having a
>>> copy of the file prefixed by "/.well-known/." Any smart coder would simply
>>> ignore that there might be a second copy of the file. In fact, one might
>>> argue that if a "well-known" file is found, but an unprefixed one is not
>>> found, the prefixed copy should be ignored since it may be that the site's
>>> intent was to delete the file, and they simply forgot to delete its copy.
>>> In any case, it is generally not a good idea, when defining protocols, to
>>> require or even recommend that data be provided in more than one place. The
>>> typical statement is something like: "If data is found in more than one
>>> place, it is probably wrong in all of them..."
>>>
>>> It would be very useful if the spec could be updated to *require* that
>>> only one copy of the file should be provided and that it should be provided
>>> with the "/.well-known/" prefix.
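To make that proposal concrete, here is a minimal sketch of a fetcher's lookup order. The function name and the fallback policy are mine, not anything the trust.txt spec defines:

```python
# Illustrative only: sketch of a trust.txt resolver's lookup order.
# Nothing here is specified by trust.txt; the names are invented.

def trust_txt_candidates(domain: str, well_known_only: bool = False) -> list[str]:
    """Return the URLs a fetcher would try, in order of preference.

    With well_known_only=True (the single-location rule suggested above),
    only the "/.well-known/" location is consulted; otherwise the
    unprefixed root location is tried first, per the spec's current wording.
    """
    well_known = f"https://{domain}/.well-known/trust.txt"
    root = f"https://{domain}/trust.txt"
    if well_known_only:
        return [well_known]
    return [root, well_known]
```

With well_known_only=True, the ambiguity about stale duplicate copies disappears, because there is only one place the file can legitimately live.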
>>>
>>> Also, you wrote:
>>>
>>>> "The short answer to why we went with a text file is that we are
>>>> working with some extremely unsophisticated publishers."
>>>
>>> I sympathize with your concern for the unsophisticated publisher.
>>> However, any difficulty that might exist in the production of a more
>>> complex file would be easily overcome by providing a trivial web form that
>>> allowed "fill-in-the-blank" simplicity. The produced file could then be
>>> simply copied to the appropriate location. After all, we've moved beyond
>>> the time when everyone was expected to be able to edit files manually.
>>> Publishers deal daily with XML, HTML, CSS, JS, PDF, etc. files that only a
>>> masochist would seek to edit by hand. Anyone with enough capacity to
>>> maintain the Hays Free Press <https://haysfreepress.com/> site, is
>>> savvy enough to either produce a WebFinger file on their own or to copy
>>> the output from a simple web form.
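A generator of that kind can be very small. This is a sketch only: the attribute=value line format is trust.txt's, but the exact field set and the function itself are invented here for illustration:

```python
# Hypothetical "fill-in-the-blank" generator for a trust.txt-style file.
# The attribute names passed in would come from a simple web form.

def generate_trust_txt(fields: dict[str, list[str]]) -> str:
    """Turn form input into attribute=value lines, one per value."""
    lines = ["# Generated file - copy to your site's /.well-known/trust.txt"]
    for attribute, values in fields.items():
        for value in values:
            lines.append(f"{attribute}={value}")
    return "\n".join(lines) + "\n"

print(generate_trust_txt({
    "belongto": ["https://press-association.example/"],
    "social": ["https://social.example/ourpaper"],
}))
```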
>>>
>>> Allowing protocols to be limited to the low bar of "unsophisticated"
>>> users means that we're not able to provide "sophisticated" solutions when
>>> they are needed. Over decades of experience with protocol and data format
>>> design, we've learned that simple approaches inevitably lose their charm
>>> after they have been in the field for some time. Users inevitably discover
>>> new capabilities that they want to support. Requirements that were once
>>> quite simple and well understood tend to become more complex and subtle as
>>> time passes. Rather than waiting to discover the inadequacies of simple
>>> formats, it makes a great deal of sense to initially rely on well-known
>>> standard formats that allow extension, versioning, etc. Most "protocol
>>> definers" should be focused on how to extend or exploit existing formats
>>> while leaving the job of format definition to others who specialize in such
>>> problems.
>>>
>>> For instance, the W3C Credible Web Community Group
>>> <https://www.w3.org/community/credibility/> has defined a number of
>>> signals that, I assume, a site might wish to self-assert in a discoverable,
>>> well-known location. However, none of these signals are supported by the
>>> trust.txt format. It seems to me that these signals could be usefully
>>> included in a WebFinger file. These signals include:
>>>
>>>    - Date Website First Archived
>>>    - Corrections Policy
>>>    - Any Award
>>>    - Pulitzer Prize Recognition
>>>    - RNG Awards
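For illustration, signals like these could ride in the "properties" member of an RFC 7033 JRD. The property URIs below are invented placeholders, not registered identifiers:

```python
# Hypothetical WebFinger JRD carrying self-asserted CredWeb signals.
# The credweb.example property URIs are placeholders I made up; no such
# identifiers are registered anywhere today.
import json

jrd = {
    "subject": "https://example.com/",
    "properties": {
        "https://credweb.example/signal/date-first-archived": "1998-12-02",
        "https://credweb.example/signal/corrections-policy":
            "https://example.com/corrections",
        "https://credweb.example/signal/pulitzer-recognition": "2004",
    },
}

print(json.dumps(jrd, indent=2))
```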
>>>
>>> The choice here is: Should we call for trust.txt to be updated to
>>> include these signals, and any others that might be defined in the future,
>>> or should we simply provide definitions of the JSON Resource Descriptors
>>> (JRDs) or other encodings and thus, by implication, enable those signals to
>>> be supported in any format that supports those encodings? I suggest that
>>> the more useful approach is to define what the signals mean and how they
>>> should be encoded, and then rely on others to find the various places where
>>> those encodings would be most useful. If this approach had been used in
>>> defining trust.txt, then all the various signals supported there, which are
>>> not defined by CredWeb, would be easily usable by anyone who is also using
>>> CredWeb signals. (Being able to say "I control the website xxx.xxx." is
>>> useful in more contexts than just that defined by trust.txt.)
>>>
>>> bob wyman
>>>
>>>
>>> On Sun, Aug 1, 2021 at 2:58 PM Scott Yates <scott@journallist.net>
>>> wrote:
>>>
>>>> Bob, and the group...
>>>>
>>>> Just to be clear, I am not running on a platform of trust.txt.
>>>>
>>>> On page 8 of the spec., we encourage the use of "/well-known" so we are
>>>> clearly not against that.
>>>>
>>>> The short answer to why we went with a text file is that we are working
>>>> with some extremely unsophisticated publishers. Take, for instance, the
>>>> publisher of the Hays Free Press, whom I met recently in Texas. She prints
>>>> news from her town on paper once a week, and maintains a website. As we all
>>>> know, when local news dies, news consumers fill in that vacuum with crap.
>>>> If she stops publishing, well, it would be bad, so we want to make things
>>>> as easy as possible for her and those like her doing the estimable work of
>>>> keeping local journalism alive.
>>>>
>>>> In my conversation with her, she was willing to post a file
>>>> <https://haysfreepress.com/trust.txt> in part because she already knew
>>>> about ads.txt, and so this was familiar to her. If I tried to start telling
>>>> her about RFC 7033, I would have lost her for sure. You are certainly right
>>>> that JRDs would be technically superior, but robots.txt has been around for
>>>> 20+ years and even the most entry-level web publisher knows how it works.
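For anyone who has not seen one, a trust.txt file is just attribute=value lines, in the same spirit as ads.txt. The entries below are invented for illustration and are not the Hays Free Press file:

```
# trust.txt - hypothetical example only
belongto=https://press-association.example/
social=https://social.example/haysfreepress
contact=https://newspaper.example/contact/
```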
>>>>
>>>>
>>>> Thank you for looking into trust.txt, and while I don't want people to
>>>> vote for me based on what they think of trust.txt, I think your question
>>>> serves as a useful model of why I am running. If you think that any new
>>>> proposal that is working to fix disinformation should follow, for example,
>>>> the most current standardized systems, you should voice that to the group.
>>>> If the group agrees, then that will be a part of how trust.txt -- and every
>>>> other effort out there -- will be evaluated.
>>>>
>>>> -Scott Yates
>>>> Founder
>>>> JournalList.net, caretaker of the trust.txt framework
>>>> 202-742-6842
>>>> Short Video Explanation of trust.txt <https://youtu.be/lunOBapQxpU>
>>>>
>>>>
>>>> On Sun, Aug 1, 2021 at 11:53 AM Bob Wyman <bob@wyman.us> wrote:
>>>>
>>>>> Scott Yates, in his statement of candidacy
>>>>> <https://lists.w3.org/Archives/Public/public-credibility/2021Aug/0000.html>,
>>>>> includes a description of the trust.txt file
>>>>> <https://journallist.net/reference-document-for-trust-txt-specifications>
>>>>> .
>>>>>
>>>>> Please explain why it makes sense to introduce yet another .txt file
>>>>> (in addition to robots.txt and ads.txt) when we have established procedures
>>>>> to allow those who control URIs to make statements supported by that
>>>>> control. For instance, RFC 5785
>>>>> <https://datatracker.ietf.org/doc/html/rfc5785> defines the
>>>>> "/.well-known/" path prefix for "well-known locations" which are accessed
>>>>> via URIs. It seems to me that if one were to publish a trust.txt file, then
>>>>> it should be at the location "/.well-known/trust.txt". That does not seem to
>>>>> be the current proposal. Why are existing standards not being followed?
>>>>>
>>>>> It also seems to me that the proposed file format is an unnecessary
>>>>> departure from existing standards such as RFC 7033
>>>>> <https://datatracker.ietf.org/doc/html/rfc7033>, which defined
>>>>> WebFinger, a mechanism that could be easily used to carry the data which
>>>>> the proponents of trust.txt seek to make available. To make WebFinger do
>>>>> what trust.txt intends, it would only be necessary to register a few new
>>>>> JSON Resource Descriptor (JRD) properties or link-relations (e.g.,
>>>>> belong-to, control, social, member, etc.). This sort of extension is
>>>>> provided for in the definition of RFC 7033 and in RFC 5988
>>>>> <https://datatracker.ietf.org/doc/html/rfc5988>, which defines "Web
>>>>> Linking" mechanisms. Note: The existing set of defined link-relations can
>>>>> be found in the IANA maintained link-relations registry
>>>>> <https://www.iana.org/assignments/link-relations/link-relations.xhtml>
>>>>> .
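As a sketch of the discovery step under RFC 7033: a client builds a query against the host's /.well-known/webfinger endpoint, optionally filtering by link relation. The "rel" values shown are the kind of new link relations suggested above; they are not in the IANA registry today:

```python
# Sketch of an RFC 7033 WebFinger query URL. The rel values
# (belong-to, control) are hypothetical, not registered relations.
from urllib.parse import urlencode

def webfinger_url(host: str, resource: str, rels: list[str]) -> str:
    query = [("resource", resource)] + [("rel", r) for r in rels]
    return f"https://{host}/.well-known/webfinger?{urlencode(query)}"

url = webfinger_url("example.com", "https://example.com/",
                    ["belong-to", "control"])
print(url)
```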
>>>>>
>>>>> While there will be a never-ending need to add support for new kinds
>>>>> of standardized statements, discoverable in well-known locations, I think
>>>>> we should be careful to ensure that new kinds of statements make use of
>>>>> existing standards rather than define entirely new mechanisms. I can't see
>>>>> anything in the trust.txt specification that actually requires a unique,
>>>>> non-standard approach that is not already supported by the various
>>>>> standards referenced above.
>>>>>
>>>>> bob wyman
>>>>>
>>>>>

Received on Monday, 2 August 2021 01:10:21 UTC