Re: Trust.txt: Why another random .txt when we've got WebFinger and well-known URIs? from Bob Wyman on 2021-08-01 (public-credibility@w3.org from August 2021)

From: Bob Wyman <bob@wyman.us>
Date: Sun, 1 Aug 2021 19:19:52 -0400
To: Scott Yates <scott@journallist.net>
Cc: Credible Web CG <public-credibility@w3.org>
Message-ID: <CAA1s49XYoeuqAvb7m+Nnm=J0HDxH7Ja8w4QUVxzgXouUmUrNoQ@mail.gmail.com>

You wrote:

> "On page 8 of the spec., we encourage the use of "/well-known" so we are
> clearly not against that." (link added)

The trust.txt spec
<https://journallist.net/reference-document-for-trust-txt-specifications>
says, on page 8:

> "In addition to the access method noted above, use of the “Well Known
> Uniform Resource Identifiers” is recommended."

So, the spec says that providing the file with a "/.well-known/" prefix is
optional and should only be done if the file has also been provided without
a prefix. As a result, there is absolutely no utility in having a copy of
the file prefixed by "/.well-known/." Any smart coder would simply ignore
that there might be a second copy of the file. In fact, one might argue
that if a "well-known" file is found, but an unprefixed one is not found,
the prefixed copy should be ignored since it may be that the site's intent
was to delete the file, and they simply forgot to delete its copy. In any
case, it is generally not a good idea, when defining protocols, to require
or even recommend that data be provided in more than one place. The typical
statement is something like: "If data is found in more than one place, it
is probably wrong in all of them..."

It would be very useful if the spec were updated to *require* that only
one copy of the file be provided, and that it be provided with the
"/.well-known/" prefix.
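
As a rough illustration of what a single required location buys the
consumer, here is a minimal Python sketch. It assumes the spec had been
changed as suggested above; the function name and the constant are
hypothetical and are not taken from the trust.txt spec.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed single canonical location (the change proposed above, not the
# current trust.txt spec).
WELL_KNOWN_PATH = "/.well-known/trust.txt"


def trust_txt_url(site_url: str) -> str:
    """Return the one URL a client would need to check for a site."""
    parts = urlsplit(site_url)
    if not parts.scheme or not parts.netloc:
        raise ValueError(f"not an absolute URL: {site_url!r}")
    # Keep only scheme and host; drop any path, query, or fragment.
    return urlunsplit((parts.scheme, parts.netloc, WELL_KNOWN_PATH, "", ""))
```

With one location there is nothing to reconcile: the client fetches that
URL and either finds the file or does not.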

Also, you wrote:

> "The short answer to why we went with a text file is that we are working
> with some extremely unsophisticated publishers."

I sympathize with your concern for the unsophisticated publisher. However,
any difficulty that might exist in the production of a more complex file
would be easily overcome by providing a trivial web form that allowed
"fill-in-the-blank" simplicity. The produced file could then be simply
copied to the appropriate location. After all, we've moved beyond the time
when everyone was expected to be able to edit files manually. Publishers
deal daily with xml, html, css, js, pdf, etc. files that only a masochist
would seek to edit by hand. Anyone with enough capacity to maintain the Hays
Free Press <https://haysfreepress.com/> site is savvy enough either to
produce a WebFinger file on their own or to copy the output from a simple
web form.
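
To make the "simple web form" idea concrete, here is a minimal Python
sketch of the back end such a form might call. It turns
fill-in-the-blank fields into a JSON Resource Descriptor with the
"subject" and "links" members defined in RFC 7033. The relation-type
URIs under example.org and the function name are hypothetical; no such
relations are registered.

```python
import json

# Hypothetical relation-type URIs. RFC 7033 requires each link relation to
# be either a registered name or a URI; none of these are registered.
REL_BASE = "https://example.org/trust/rel/"


def build_jrd(subject: str, form_fields: dict[str, list[str]]) -> str:
    """Turn form fields (relation name -> list of URLs) into a JRD string."""
    links = [
        {"rel": REL_BASE + relation, "href": href}
        for relation, hrefs in form_fields.items()
        for href in hrefs
    ]
    return json.dumps({"subject": subject, "links": links}, indent=2)


# Example with placeholder values a publisher might type into the form.
if __name__ == "__main__":
    print(build_jrd(
        "https://publisher.example/",
        {"belongto": ["https://association.example/"],
         "social": ["https://social.example/publisher"]},
    ))
```

The publisher would only ever see the form and the resulting file to
copy; the JSON itself never needs to be edited by hand.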

Allowing protocols to be limited to the low bar of "unsophisticated" users
means that we're not able to provide "sophisticated" solutions when they
are needed. Over decades of experience with protocol and data format
design, we've learned that simple approaches inevitably lose their charm
after they have been in the field for some time. Users inevitably discover
new capabilities that they want to support. Requirements that were once
quite simple and well understood tend to become more complex and subtle as
time passes. Rather than waiting to discover the inadequacies of simple
formats, it makes a great deal of sense to initially rely on well-known
standard formats that allow extension, versioning, etc. Most "protocol
definers" should be focused on how to extend or exploit existing formats
while leaving the job of format definition to others who specialize in such
problems.

For instance, the W3C Credible Web Community Group
<https://www.w3.org/community/credibility/> has defined a number of signals
that, I assume, a site might wish to self-assert in a discoverable,
well-known location. However, none of these signals are supported by the
trust.txt format. It seems to me that these signals could be usefully
included in a WebFinger file. These signals include:

   - Date Website First Archived
   - Corrections Policy
   - Any Award
   - Pulitzer Prize Recognition
   - RNG Awards

The choice here is: Should we call for trust.txt to be updated to include
these signals, and any others that might be defined in the future, or,
should we simply provide definitions of the JSON Resource Descriptors
(JRDs) or other encodings and thus, by implication, enable those signals to
be supported in any format that supports those encodings? I suggest that
the more useful approach is to define what the signals mean and how they
should be encoded and then rely on others to find the various places where
those encodings would be most useful. If this approach had been used in
defining trust.txt, then all the various signals supported there, which are
not defined by CredWeb, would be easily used by anyone who is also using
CredWeb signals. (Being able to say: "I control the website xxx.xxx." is
useful in more contexts than just that defined by trust.txt.)
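
As a sketch of what "define the encoding once" could look like, the
Python below maps signal values onto the "properties" member of a JRD,
where RFC 7033 requires each property name to be a URI and each value to
be a string or null. The identifier URIs under example.org, the short
signal names, and the function names are all hypothetical; CredWeb has
not published such identifiers, and the values shown are placeholders.

```python
import json
from typing import Optional

# Hypothetical identifier prefix for signal definitions.
SIGNAL_BASE = "https://example.org/credweb/signal/"


def encode_signals(signals: dict[str, Optional[str]]) -> dict[str, Optional[str]]:
    """Map short signal names to URI-named JRD properties."""
    return {SIGNAL_BASE + name: value for name, value in signals.items()}


def jrd_with_signals(subject: str, signals: dict[str, Optional[str]]) -> str:
    """Wrap the encoded signals in a minimal JRD document."""
    return json.dumps(
        {"subject": subject, "properties": encode_signals(signals)},
        indent=2,
    )


if __name__ == "__main__":
    print(jrd_with_signals(
        "https://publisher.example/",
        {"corrections-policy": "https://publisher.example/corrections",
         "date-website-first-archived": None},  # placeholder: value unknown
    ))
```

Because encode_signals() knows nothing about the carrier, the same
mapping could be reused by any other format that accepts URI-named
properties.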

bob wyman


On Sun, Aug 1, 2021 at 2:58 PM Scott Yates <scott@journallist.net> wrote:

> Bob, and the group...
>
> Just to be clear, I am not running on a platform of trust.txt.
>
> On page 8 of the spec., we encourage the use of "/well-known" so we are
> clearly not against that.
>
> The short answer to why we went with a text file is that we are working
> with some extremely unsophisticated publishers. Take, for instance, the
> publisher of the Hays Free Press, whom I met recently in Texas. She prints
> news from her town on paper once a week, and maintains a website. As we all
> know, when local news dies, news consumers fill in that vacuum with crap.
> If she stops publishing, well, it would be bad, so we want to make things
> as easy as possible for her and those like her doing the esteemable work of
> keeping local journalism alive.
>
> In my conversation with her, she was willing to post a file
> <https://haysfreepress.com/trust.txt> in part because she already knew
> about ads.txt, and so this was familiar to her. If I tried to start telling
> her about RFC 7033, I would have lost her for sure. You are certainly right
> that JRDs would be technically superior, but robots.txt has been around for
> 20+ years and the most entry-level web publisher knows about how it works.
>
>
> Thank you for looking into trust.txt, and while I don't want people to
> vote for me based on what they think of trust.txt, I think your question
> serves as a useful model of why I am running. If you think that any new
> proposal that is working to fix disinformation should follow, for example,
> the most current standardized systems, you should voice that to the group.
> If the group agrees, then that will be a part of how trust.txt -- and every
> other effort out there -- will be evaluated.
>
> -Scott Yates
> Founder
> JournalList.net, caretaker of the trust.txt framework
> 202-742-6842
> Short Video Explanation of trust.txt <https://youtu.be/lunOBapQxpU>
>
>
> On Sun, Aug 1, 2021 at 11:53 AM Bob Wyman <bob@wyman.us> wrote:
>
>> Scott Yates, in his statement of candidacy
>> <https://lists.w3.org/Archives/Public/public-credibility/2021Aug/0000.html>,
>> includes a description of the trust.txt file
>> <https://journallist.net/reference-document-for-trust-txt-specifications>.
>>
>> Please explain why it makes sense to introduce yet-another .txt file (in
>> addition to robots.txt and ads.txt) when we have established procedures to
>> allow those who control URIs to make statements supported by that control.
>> For instance, RFC 5785 <https://datatracker.ietf.org/doc/html/rfc5785> defines
>> the "/.well-known/" path prefix for "well-known locations" which are
>> accessed via URIs. It seems to me that if one were to publish a trust.txt
>> file, then it should be at the location "/.well-known/trust.txt" That does
>> not seem to be the current proposal. Why are existing standards not being
>> followed?
>>
>> It also seems to me that the proposed file format is an unnecessary
>> departure from existing standards such as RFC 7033
>> <https://datatracker.ietf.org/doc/html/rfc7033>, which defined
>> WebFinger, a mechanism that could be easily used to carry the data which
>> the proponents of trust.txt seek to make available. To make WebFinger do
>> what trust.txt intends, it would be only necessary to register a few new
>> JSON Resource Descriptors (JRDs), properties, or link-relations (e.g.
>> belong-to, control, social, member, etc.). This sort of extension is
>> provided for in the definition of RFC 7033 and in RFC 5988
>> <https://datatracker.ietf.org/doc/html/rfc5988>, which defines "Web
>> Linking" mechanisms. Note: The existing set of defined link-relations can
>> be found in the IANA maintained link-relations registry
>> <https://www.iana.org/assignments/link-relations/link-relations.xhtml>.
>>
>> While there will be a never-ending need to add support for new kinds of
>> standardized statements, discoverable in well-known locations, I think we
>> should be careful to ensure that new kinds of statements make use of
>> existing standards rather than define entirely new mechanisms. I can't see
>> anything in the trust.txt specification that actually requires a unique,
>> non-standard approach that is not already supported by the various
>> standards referenced above.
>>
>> bob wyman
>>
>>
Received on Sunday, 1 August 2021 23:20:18 UTC