Re: Negotiating protocols between actors or clients

Per the brief discussion at the tail end of the CG meeting today, I'm
bringing this thread back to the attention of the mailing list:

- https://lists.w3.org/Archives/Public/public-swicg/2024Nov/0010.html
- also available at
https://socialhub.activitypub.rocks/t/negotiating-protocols-between-actors-or-clients/4701
as a forum thread
- also available at https://github.com/w3c/activitypub/issues/510 as a
GitHub issue

Tantek suggested that I file this as a general issue against ActivityPub,
which I have done, although I still believe a CG task force to explore the
issue would make sense, since this issue is more exploratory and doesn't
have a clear-cut resolution.

This issue has relevance to the following CG Task Forces as well:

- Forum TF. Behaviors around managing and tracking threaded conversations
are not specified within ActivityPub.
- Group TF. Behaviors around Join/Leave, a possible `members` collection,
etc. are also not specified within ActivityPub.
- E2EE TF. Behaviors around MLS-relevant activities are not specified
within ActivityPub.

---

To recap the prior email from November 2024:

> # Negotiating protocols between actors or clients
>
> Hello all,
>
> I know that the CG meeting next week is likely to have its time filled
with discussions of potential charters, but I'd like to bring up a
potential CG meeting agenda item:
>
> PROPOSAL: The SocialCG should convene a task force to document protocols
built on top of ActivityPub, including how to negotiate protocols
client-to-client or actor-to-actor.
>
> The roles and responsibilities of this task force would include, to start
with:
>
> - (1) Defining a mechanism to signal which protocols are supported, or in
other words, which total set of behaviors will be carried out when an
activity is received at an outbox or delivered to an inbox. Behaviors can
be broken down into the following categories:
>
>   - (1.1) Server behaviors. Which actions can be carried out
automatically by a server? How can an actor detect that a server will
understand and carry out side effects? Example: When an `Announce` is
received in an inbox, a server might automatically add it to the `shares`
collection... or it might not.
>
>   - (1.2) Client behaviors. Which actions are intended for clients to
understand and do something with them? How can an actor negotiate a session
between two clients? Example: Alice is using an E2EE messenger (or chess
application), and wants to establish an E2EE conversation (or chess game)
specifically with a client attached to Bob's actor or profile.
>
>   - (1.3) Actor behaviors. Which actions can the actor take or respond
with, in response to a given activity? How can an actor be expected to act?
Example: When Alice receives an activity referencing an object that has set
`inReplyTo` or `context` or etc. referencing an object belonging to Alice,
then Alice might want to Add that object to an appropriate collection. This
behavior might also occur at the client level, if the client is configured
to automatically manage those collections. This behavior might also occur
at the server level, if the server is configured to automatically manage
those collections.
>
> - (2) Documenting one or more common profiles for a total set of
behaviors as described above.
>
>   - (2.1) Option: A protocol for the management and replication of
resources across servers. Define behaviors for
Create/Update/Delete/Add/Remove at the server level, client level, and
actor level. Define where and how resources are stored, how long resources
are stored or cached, etc.
>
>   - (2.2) Option: A protocol for publishing "posts" to a "profile".
Define what is a "post", what is a "profile", etc. (Some overlap with the
Forum TF here, especially once you get to the level of defining
"conversation" and "forum".)
>
>   - (2.3) Option: A protocol for publishing "activities" to an "activity
stream". Define generic processing and shape that an activity must fit.
Define how to identify these activities as simple notifications, in cases
where side effects might not be needed or desired.
>
> - (3) Generating one or more reports for the above items.
>
> I would love to know what other people think about this agenda item or
general issue, and I welcome discussion of this issue both on the mailing
list and in other venues.

Additional insights from the forum thread:

> ActivityPub guarantees you that the thing arriving in your inbox is “an
AS2 document that is specifically an Activity”; the protocols of the
fediverse generally imply a certain shape to those activities and their
objects. More importantly, they expect behaviors or side effects.
>
> [...] Side effects triggered by certain activities could be layered on
top of [ActivityPub delivery] by a separate protocol or by an application
like IFTTT – for example, when the application sees that a certain actor
performed an activity with a type of Listen and the object is an Audio
object, then the Audio object might be added to an OrderedCollection
representing a playlist of audio to check out later.
>
> [...] The ultimate goal is that, when looking at any resource that has an
ldp:inbox, you have “more than zero” knowledge about what might happen if
you send something to that inbox. Less assuming, more knowing.
>
> [...] as a user i want my client to be able to assist me in making an
informed decision on whether i should or should not send (or have my
client/user-agent send) arbitrary activities to arbitrary actors. i want to
be able to look at an actor and say things like:
>
> - “yep, i have a pretty good idea that if i send them a Florp Ping, then
they’ll understand what that means.”
> - “eh, i shouldn’t bother sending this actor anything other than a Create
Note where the Create has these properties and the Note has these
properties, and i understand that this will be converted to a Status and i
need to send them an explicit Delete Note later because they don’t
understand HTTP caching headers…”
> - “so this actor is specifically a conversation manager, and in order to
participate in any conversations i need to send it a Create where the
object of that activity has at least context and content, otherwise it will
get ignored entirely.”
> - “wait, this isn’t an activitypub actor at all, it’s just a resource
with an inbox. it’s following some other protocol entirely and expecting
LDN payloads of a different type or shape.”
>
> [...] you hand someone jsonld, as2, ldn, and ap – what are they missing?
hand them apwf and ap-http-sig and they can maybe send messages now that
don’t get dropped on the floor, but what actually goes in those messages?
if you want to trigger a certain behavior, what is the message you need to
send to trigger that behavior? and if you send the same message to someone
else, will they understand it in the same way?

---

I am aware of the following prior art:

## Declaring an `ldp:inbox` to be `ldp:constrainedBy` some resource (a profile or specification?)

`http://www.w3.org/ns/ldp#constrainedBy` is described in
https://www.w3.org/TR/ldn/#constraints as follows:

> Inbox URLs can announce their own constraints (e.g.,
[SHACL](https://www.w3.org/TR/ldn/#bib-shacl),
[Web Annotation Protocol](https://www.w3.org/TR/ldn/#bib-annotation-protocol))
via an HTTP `Link` header or body of the resource with a `rel` value of
`http://www.w3.org/ns/ldp#constrainedBy`. Senders should comply with
constraint specifications or the receiver may reject their notification and
return an appropriate 4xx error code.

Web Annotation Protocol uses this header:

```http
Link: <http://www.w3.org/TR/annotation-protocol/>; rel="http://www.w3.org/ns/ldp#constrainedBy"
```

Presumably the closest analogue would be something like this:

```http
Link: <http://www.w3.org/TR/activitypub/>; rel="http://www.w3.org/ns/ldp#constrainedBy"
```

Or even embedding it into the JSON-LD representation:

```json
{
  "@context": [
    "https://www.w3.org/ns/activitystreams",
    {
      "constrainedBy": {
        "@id": "http://www.w3.org/ns/ldp#constrainedBy",
        "@type": "@id"
      }
    }
  ],
  "id": "https://actor.example/",
  "inbox": {
    "id": "https://actor.example/inbox/",
    "constrainedBy": "http://www.w3.org/TR/activitypub/"
  }
}
```

This is perhaps not ideal, because:

- It relies on a lot of out-of-band knowledge (unless you use something
like SHACL and/or RDFS to describe that knowledge).
- Constraints may vary over time and thus should be versioned.
- There is no inherent decomposition into separate features.
- The scope of this link relation is more closely tied to validating
payloads than it is describing side effects or behaviors.

The exploratory work of the proposed CG Task Force would be to identify
the constraints of various applications that use ActivityPub, so that we
can know and describe ahead of time that, for example, certain properties
are required to be present on a `Create` activity, or else it will be
dropped.
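
As a rough illustration of the discovery step discussed in this section,
the following Python sketch reads an `ldp:constrainedBy` target from either
a `Link` header value or an actor document shaped like the JSON example
above. The function names are hypothetical, the header parsing is
deliberately simplified (it does not implement the full RFC 8288 grammar),
and the JSON lookup assumes the compacted `constrainedBy` term from that
example rather than doing real JSON-LD processing.

```python
import re

LDP_CONSTRAINED_BY = "http://www.w3.org/ns/ldp#constrainedBy"


def constraints_from_link_header(header_value):
    """Return the link targets whose rel includes ldp:constrainedBy.

    Simplified parsing: handles '<uri>; param=value' entries separated by
    commas, which covers the headers shown in this email.
    """
    targets = []
    for match in re.finditer(r'<([^>]*)>\s*((?:;\s*[^,;]+)*)', header_value):
        uri, params = match.group(1), match.group(2)
        rel = re.search(r';\s*rel\s*=\s*"?([^";]+)"?', params)
        # A rel value may list several space-separated relation types.
        if rel and LDP_CONSTRAINED_BY in rel.group(1).split():
            targets.append(uri)
    return targets


def constraints_from_actor(actor):
    """Return constrainedBy targets embedded in an actor's inbox object.

    Assumes the inbox is an embedded object using the compacted
    'constrainedBy' term (or the full IRI as a key). A bare inbox URL
    carries no embedded constraint, so it yields an empty list.
    """
    inbox = actor.get("inbox")
    if not isinstance(inbox, dict):
        return []
    value = inbox.get("constrainedBy") or inbox.get(LDP_CONSTRAINED_BY)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]
```

Knowing the target only tells a sender *which* profile applies; what that
profile requires (for example, which properties a `Create` must carry) is
exactly the out-of-band knowledge listed as a drawback above.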

## Discovery in XMPP

Two XEPs are used for similar purposes in XMPP:

### XEP-0030: Service Discovery https://xmpp.org/extensions/xep-0030.html

An IQ (Info/Query) stanza asks to "get" a `<query>` of
`http://jabber.org/protocol/disco#info`. The queried entity responds with
an IQ "result" which contains one or more `<identity>` elements and one or
more `<feature>` elements.

For example, the XMPP client `romeo@montague.net/orchard` can discover that
`plays.shakespeare.lit` supports MUC multi-user chatrooms
(`<feature var='http://jabber.org/protocol/muc'/>`). The client `orchard`
now knows it can query for `disco#items` and discover various rooms, as
well as discovering information (`disco#info`) about those rooms. The
client can use the MUC protocol to join those rooms and send messages to
those rooms (which will distribute the messages to all participants).

Without doing this discovery, the client would have no idea if a given XMPP
resource supported MUC. If you send someone an invite to a chat room, you
need to know beforehand whether they will understand that invite at all.
You can `disco#info` about your contacts and determine that yes, they
support `<feature var='http://jabber.org/protocol/muc'/>`, so go ahead and
send that MUC invite.
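
To make the "check before you send" step concrete, here is a minimal Python
sketch that extracts the advertised features from a `disco#info` result and
checks for MUC support. It only parses an already-received stanza; the
function name and the sample stanza in the usage note (including its JIDs
and `id`) are illustrative assumptions, not taken from a real exchange.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
MUC_FEATURE = "http://jabber.org/protocol/muc"


def features_from_disco_info(iq_xml):
    """Return the set of feature 'var' values in a disco#info IQ result."""
    root = ET.fromstring(iq_xml)
    # The <query/> child lives in the disco#info namespace.
    query = root.find(f"{{{DISCO_INFO_NS}}}query")
    if query is None:
        return set()
    return {
        feature.get("var")
        for feature in query.findall(f"{{{DISCO_INFO_NS}}}feature")
        if feature.get("var")
    }


def supports_muc(iq_xml):
    """True if the responding entity advertises the MUC feature."""
    return MUC_FEATURE in features_from_disco_info(iq_xml)
```

A client would call `supports_muc()` on a result such as
`<iq type='result' id='info1'><query xmlns='http://jabber.org/protocol/disco#info'><identity category='client' type='pc'/><feature var='http://jabber.org/protocol/muc'/></query></iq>`
and only then send the MUC invite.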

By contrast, in ActivityPub, we can only blindly send an Invite activity
with zero information about what will happen when it is received... if
anything happens at all. Remember, Invite is not defined to have side
effects in ActivityPub! We also have no idea whether the Invite is to an
Event or whether it is to a Group, and we don't know whether a given
client/actor supports one or the other or both or none of these activity
shapes.

### XEP-0115: Entity Capabilities https://xmpp.org/extensions/xep-0115.html

As an optimization, XMPP clients can advertise in their `<presence>` that a
node has a capability set which will hash to a certain verification value.
A `disco#info` `<query>` reveals what those capabilities are, and the
capability set can be cached for as long as the verification value is
unchanged. Your client can now remember that capability set and assume it
for any entity it sees advertising that same verification value.
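
The hashing and caching idea can be sketched in Python as follows. This is
a simplified reading of the XEP-0115 generation method, assuming SHA-1 as
the hash and leaving out extended service discovery forms entirely; the
function names and the cache shape are hypothetical.

```python
import base64
import hashlib


def caps_verification_string(identities, features):
    """Compute a verification string from identities and features.

    identities: iterable of (category, type, lang, name) string tuples
    features:   iterable of feature 'var' strings
    Simplified: extended data forms are not handled.
    """
    parts = []
    # Sorting makes the result independent of the order of advertisement.
    for category, type_, lang, name in sorted(identities):
        parts.append(f"{category}/{type_}/{lang}/{name}<")
    for feature in sorted(features):
        parts.append(f"{feature}<")
    digest = hashlib.sha1("".join(parts).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def features_for(ver, cache, run_disco_info):
    """Return the feature set for a verification value, querying only once.

    run_disco_info is a caller-supplied function that performs the actual
    disco#info query; it is called only on a cache miss.
    """
    if ver not in cache:
        cache[ver] = frozenset(run_disco_info())
    return cache[ver]
```

Because the verification value is derived from the capability set itself,
two entities advertising the same value can share one cache entry, which is
what lets a client skip repeated `disco#info` round trips.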

---

Evan expressed some concerns in the CG meeting today about "segmenting the
network into fragments", but I believe that being upfront about constraints
and behaviors will not meaningfully segment or fragment the network. The
alternative is far worse -- a significant subset (possibly even the
majority?) of activities flowing through the network might one day be pure
noise that gets completely dropped. (Maybe that activity was really
important!) Or worse yet, an activity gets processed in a completely
different way from what you were expecting, because it's just barely
similar enough to fit the shape while the behaviors are mismatched. The
point is that we don't know what's going on with the other end, and we just
have to guess at the potential outcomes.

I understand and appreciate that ActivityPub was designed very broadly to
support general distribution of activities, but when those activities are
actually procedure calls or protocol methods (with intended side effects),
it's a very bad experience to not be able to expect anything. Maybe the
servers are all general enough to store and forward any Activity, but what
about the clients reading from that inbox? What about the actors performing
certain automated actions based on those activities?

---

I guess for the general outcome here, I would like to explore profiles for
shapes and schemas, and protocols for behaviors and constraints. With the
broadening horizons of other TF work, I think it would be prudent to work
towards being able to discover and negotiate these things rather than
blindly assuming that everyone will adopt all of the new work by
Forums/Groups/E2EE/etc wholesale. I don't want to do it alone, however.

Received on Friday, 13 June 2025 21:14:37 UTC