Re: First-Party sets and the potential application of the JournalList trust.txt specification

Hi all,

Thanks for the discussion here. Also, thanks to Don for the observations
regarding relevance to First-Party Sets (FPS), and pointers to previous
discussions.

@Ralph: If you'd like to discuss this topic on Thursday's call, could you
please create an issue on
https://github.com/privacycg/first-party-sets/issues and add the agenda+
label, as is the usual process?

My own notes:
* I think this work by JournalList.net (as well as other prior art [1][2])
highlights the need for a web platform mechanism that can bridge the gap
between domain names and real-world/human definitions and expectations of
privacy and trust. While the goals for trust.txt don't completely overlap
with those of FPS, I do see parallels between the "control" and
"controlledby" relationships and the proposed UA Policy [3] for First-Party
Sets. I tend to agree with previous comments on this thread that the FPS
policy will likely center around stewardship of user data, while
JournalList/trust.txt may want to look at control of content.
* The current version of our proposal [3] proposes a priori verification
via publicly viewable attestations that the domains comprising an FPS
conform to the UA Policy (which becomes part of the entity's privacy
representations to users, and is subject to scrutiny under consumer
protection laws), coupled with random spot checks and with post hoc
review/revocation mechanisms for cases flagged by users/civil society. We
are wary of a post-hoc-only approach because our proposed applications in
the browser for FPS have implications for user privacy. Another
consideration is to avoid the formation of "personalized sets" (domains
asserting different relationships to different users). Our goal is to
balance prevention of misuse with scalability.
* I do think it's possible to attach an a priori verification process to
trust.txt. In fact, our alternative design [4] is based on a similar
mechanism that fetches previously verified assertions from a .well-known
location. While such an approach is definitely more scalable, there are
several challenges that we need to overcome. We propose starting with a
simpler, centralized, static-list-based approach for now, and eventually
migrating to a more dynamic mechanism as the system scales and matures.
While not insurmountable, some of the challenges to consider for a dynamic
design are:
** Browser implementation complexity. We identified some initial details in
[5], but expect to discover more as we examine interactions with various
platform primitives, and our intended applications for FPS.
** Dealing with unavailability of the "controlling" domain's server (since
the browser needs to fetch ap.org/trust.txt before accepting an assertion
of membership by https://apnews.com/trust.txt). FPS has implications for
web platform mechanisms such as access to cross-site cookies, so an
inability to verify membership may lead to added page-load latency or
issues with site functionality.
** Potentially less transparency on how/when domains assert relationships,
unless there is a verifying entity/entities that maintain(s) publicly
auditable logs, similar to Certificate Transparency logs [6].
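
To make the reciprocal check and the unavailability case above concrete,
here is a rough Python sketch. It is not part of either proposal. It
assumes trust.txt holds one attribute=value pair per line with "#" comment
lines, and the names parse_trust_txt, host and check_membership, as well
as the sample file contents, are hypothetical. Fetching is replaced by an
in-memory dict so that a failed fetch can be modelled as a missing entry.

```python
from typing import Dict, List, Optional
from urllib.parse import urlparse


def parse_trust_txt(text: str) -> Dict[str, List[str]]:
    """Parse assumed 'attribute=value' lines; lines starting with '#' are comments."""
    entries: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries.setdefault(key.strip().lower(), []).append(value.strip())
    return entries


def host(url: str) -> str:
    """Return the lowercase host of a URL, tolerating a missing scheme."""
    parsed = urlparse(url if "//" in url else "//" + url)
    return (parsed.hostname or "").lower()


def check_membership(member: str, controller: str,
                     files: Dict[str, Optional[str]]) -> str:
    """Return 'verified', 'rejected' or 'unavailable'.

    files maps a host to its trust.txt body; a missing or None entry
    stands in for a fetch that failed (hypothetical model, no network).
    """
    member_txt = files.get(member)
    controller_txt = files.get(controller)
    if member_txt is None or controller_txt is None:
        # Neither accept nor reject: the caller must decide how to degrade.
        return "unavailable"
    claims = {host(u) for u in parse_trust_txt(member_txt).get("controlledby", [])}
    confirms = {host(u) for u in parse_trust_txt(controller_txt).get("control", [])}
    # Both sides must assert the relationship for it to count.
    if controller in claims and member in confirms:
        return "verified"
    return "rejected"


files = {
    "ap.org": "control=https://apnews.com/\n",
    "apnews.com": "controlledby=https://ap.org/\n",
    "scammysite.xyz": "controlledby=https://ap.org/\n",
}
print(check_membership("apnews.com", "ap.org", files))
print(check_membership("scammysite.xyz", "ap.org", files))
# Simulate ap.org being unreachable by leaving it out of the dict.
print(check_membership("apnews.com", "ap.org", {"apnews.com": files["apnews.com"]}))
```

The third state is the interesting one for a browser: a one-sided claim can
simply be rejected, but an unreachable controlling domain leaves the
browser to choose between delaying the decision and treating the site as
not being in a set.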

Best,
Kaustubha Govind
Co-editor of First-Party Sets, working on Chrome Privacy Sandbox @Google

[1] https://github.com/privacycg/first-party-sets#prior-art
[2] https://www.w3.org/TR/tracking-dnt/#terminology.participants -
definition of "party"
[3]
https://github.com/privacycg/first-party-sets/blob/main/ua_policy_proposal.md
[4]
https://github.com/privacycg/first-party-sets/blob/main/signed_assertions.md
[5]
https://github.com/privacycg/first-party-sets/blob/main/signed_assertions.md#discovering-first-party-sets
[6] https://certificate.transparency.dev/howctworks/

On Mon, Jan 10, 2022 at 8:22 PM Ralph Brown <ralph@brownwolfconsulting.com>
wrote:

> Don,
>
> Thanks for the pointers. I will read up before Thursday’s call.
>
> Ralph
> --
> Ralph W. Brown
> Founder
> Brown Wolf Consulting LLC
> m: +1-303-517-6711 <(303)%20517-6711>
> e: ralph@brownwolfconsulting.com
> w: www.brownwolfconsulting.com
>
> On Jan 10, 2022, at 5:17 PM, Don Marti <dmarti@cafemedia.com> wrote:
>
>
> Hi Ralph, David,
>
> Ralph, that's a good question. FPS membership standards are still being
> discussed in this community group and at the W3C TAG. I'll link to some
> notes on previous conversations.
>
> Some good points on FPS membership standards on the TAG review issue,
> here: https://github.com/w3ctag/design-reviews/issues/342
>
> Ad hoc meeting:
> https://github.com/privacycg/meetings/blob/main/2021/telcons/08-12-21-FPS-adhoc-minutes.md
>
> Meeting from last fall:
> https://github.com/privacycg/meetings/blob/debc2dc09ddd9af6444a7639b36213b0209381ff/2021/telcons/09-09-minutes.md#first-party-sets---replace-ownercontroller-language-with-simpler-language-on-controller-and-define-it-pr-56
>
> David, the problem of how the "Independent Enforcement Entity" (IEE)
> knows whether sites are in a valid FPS seems to be more or less manageable
> depending on what the standards of validity are. If the IEE is supposed to
> be able to check corporate ownership, that would be an additional, hard
> problem on top of the problem of checking whether the FPS members were
> adhering to their own claimed privacy policy. (I suggest that if sites
> assert an FPS, they must also grant permission to the IEE to test it:
> https://github.com/privacycg/first-party-sets/pull/65 )
>
> Best,
> Don
>
> On Mon, Jan 10, 2022 at 3:18 PM David Singer <singer@apple.com> wrote:
>
>> Is it, from the privacy point of view, a way to address the question we
>> raised during DNT (of blessed memory): how does a user/researcher/regulator
>> ’know’ that yimg.com and yahoo.com share a data controller, and are
>> under common oversight and management?
>>
>> > On 10Jan, 2022, at 14:11 , Ralph Brown <ralph@brownwolfconsulting.com>
>> wrote:
>> >
>> > Don,
>> >
>> > At JournalList we have been more concerned with the second and less the
>> first, while it appears to me that the Privacy Community Group is more
>> concerned with the first and less the second. Both of course are important.
>> >
>> > I agree with the concern about getting into the weeds of corporate
>> ownership. I guess the question then is, what is sufficient representation
>> to confirm the controlling/controlled relationship implied by First-Party
>> Sets? Is this something that the group has already resolved?
>> >
>> > Regards,
>> >
>> > Ralph
>> > --
>> > Ralph W. Brown
>> > Founder
>> > Brown Wolf Consulting LLC
>> > 1355 S Foothills Hwy
>> > Boulder, CO 80305
>> > m: +1-303-517-6711 <(303)%20517-6711>
>> > e: ralph@brownwolfconsulting.com
>> > w: www.brownwolfconsulting.com
>> >
>> >> On Jan 10, 2022, at 3:04 PM, Don Marti <dmarti@cafemedia.com> wrote:
>> >>
>> >> Hi Ralph,
>> >>
>> >> Yes, it seems like in the case of JournalList there are two kinds of
>> control to be concerned about
>> >>
>> >>  * Controllership of processing of personal data
>> >>
>> >>  * Editorial control of content
>> >>
>> >> So far we have been trying to avoid getting into the weeds of
>> corporate ownership issues when defining common controller. Real-world web
>> and media companies have complicated structures that would be hard for a
>> browser vendor or enforcement entity to analyze, and it's way too easy to
>> set up corporate ownership arrangements where two sites have common
>> ownership on paper but are managed independently for purposes of user data
>> sharing and editorial decisions (
>> https://github.com/privacycg/first-party-sets/issues/49 )
>> >>
>> >> Best,
>> >> Don
>> >>
>> >> On Mon, Jan 10, 2022 at 1:06 PM Ralph Brown <
>> ralph@brownwolfconsulting.com> wrote:
>> >> Don,
>> >>
>> >> Thanks for the question. To this point, we haven’t been any more
>> explicit about the relationships described in the trust.txt file than the
>> following language on page 6 of the specification:
>> >>
>> >> "While these roles are broadly described, there is an implied trust
>> relationship between organizations that fall into these respective roles.
>> This trust relationship is typically based on a legal agreement executed by
>> the respective parties (for example, a membership agreement or a purchase
>> agreement)."
>> >>
>> >> As you point out, the GDPR language you reference is specific to
>> control over the use of personal data collected by the site and the control
>> JournalList is interested in goes beyond this to include the content that
>> is published on the controlled sites.
>> >>
>> >> Perhaps there is a legal definition (or acceptable legal language) to
>> address the type of control that is relevant to both First-Party Sets and
>> JournalList. It strikes me that control in this sense is organizational
>> control either through ownership or equivalent influence over the policies
>> and practices of the controlled entity.
>> >>
>> >> We are always interested in improving the JournalList trust.txt
>> specification and open to input to improve it. A better definition of
>> control certainly makes sense.
>> >>
>> >> Regards,
>> >>
>> >> Ralph
>> >> --
>> >> Ralph W. Brown
>> >> Founder
>> >> Brown Wolf Consulting LLC
>> >> 1355 S Foothills Hwy
>> >> Boulder, CO 80305
>> >> m: +1-303-517-6711 <(303)%20517-6711>
>> >> e: ralph@brownwolfconsulting.com
>> >> w: www.brownwolfconsulting.com
>> >>
>> >>> On Jan 10, 2022, at 12:28 PM, Don Marti <dmarti@cafemedia.com> wrote:
>> >>>
>> >>> Hi Ralph,
>> >>>
>> >>> This could be very helpful. I do have a question about the "control"
>> and "controlledBy" fields, along with the definition of "control".
>> >>>
>> >>> Right now there is still an open topic of discussion about how
>> First-Party Sets will define common control for members of a set.
>> >>>
>> >>> There is a workable definition of "controller" in GDPR: "natural or
>> legal person, public authority, agency or other body which, alone or
>> jointly with others, determines the purposes and means of the processing of
>> personal data." FPS is intended to be international, but this definition is
>> the best one I have found so far.
>> >>>
>> >>> (For purposes of trust in journalism, data controller would probably
>> be necessary but not sufficient--the definition of control would have to
>> include content-related control.)
>> >>>
>> >>> Would you consider making the definition of "control" more specific,
>> to include the GDPR language or similar on data stewardship?
>> >>>
>> >>> Best,
>> >>> Don
>> >>>
>> >>>
>> >>> On Mon, Jan 10, 2022 at 10:55 AM Ralph Brown <
>> ralph@brownwolfconsulting.com> wrote:
>> >>> Fellow Privacy Community Group members,
>> >>>
>> >>> Scott Yates (Executive Director, JournalList.net) and I shared this
>> proposal with Kaustubha Govind last month and he recommended that we share
>> it with the group.
>> >>>
>> >>> The work on First-Party Sets recently came to our attention which
>> caused us to join the Privacy Community Group. We think it might be
>> interesting to have a conversation about what we do at JournalList.net,
>> which is publish the trust.txt specification document (attached).
>> >>>
>> >>> In short, it's a simple yet powerful way to expose relationships among
>> websites, including the relationships of “control” and
>> “controlledby”.
>> >>>
>> >>> The original concept was to make the relationship among news
>> organizations (publishers) and press associations explicitly readable by
>> web browsers, web crawlers, programmatic ad buyers, researchers, etc. It is
>> beginning to gain adoption among a number of press organizations, including
>> the Associated Press and Digital Content Next.
>> >>>
>> >>> These symmetric relationships (“control/controlledby” and others)
>> are beneficial as they can expose entities that attempt to overstate their
>> “control” or “membership” status. If the reciprocal relationship is not
>> expressed, one has to question the assertion of this relationship. For
>> example, if an entity attempts to overstate their “control” by including
>> websites over which they do not have control, a missing “controlledby”
>> relationship would expose this.
>> >>>
>> >>> In other words, if ap.org/trust.txt expressed that it controls
>> https://apnews.com/trust.txt, that would be a quick and seamless way for
>> a browser to ingest a first-party relationship. If scammysite.xyz
>> expressed that it had a first-party relationship with ap.org, that would
>> be easily disproved by looking at ap.org/trust.txt.
>> >>>
>> >>> By allowing entities to self-publish their trust.txt file it avoids
>> the centralized submission/validation process, while other mechanisms can
>> be used post hoc to validate/police the self-published trust.txt files.
>> >>>
>> >>> We welcome a discussion among the group on this proposal.
>> >>>
>> >>> Regards,
>> >>>
>> >>> Scott Yates & Ralph Brown
>> >>> --
>> >>> Ralph W. Brown
>> >>> Founder
>> >>> Brown Wolf Consulting LLC
>> >>> 1355 S Foothills Hwy
>> >>> Boulder, CO 80305
>> >>> m: +1-303-517-6711 <(303)%20517-6711>
>> >>> e: ralph@brownwolfconsulting.com
>> >>> w: www.brownwolfconsulting.com
>> >>>
>> >>
>> >
>>
>> David Singer
>> Multimedia and Software Standards, Apple
>>
>> singer@apple.com
>>

Received on Tuesday, 11 January 2022 05:23:08 UTC