Re: "Evolving ActivityPub to reduce infrastructure costs"

Erin points out, quite correctly, that the overhead of duplicated textual
data is largely insignificant in a system that also carries images and
other media. Even so, I think we should be aware that anyone looking at
examples such as the one pointed out by Johannes is likely to at least
wonder why the apparent duplication is necessary or useful. If nothing
else, we should remember that there are still a few of us old folk who grew
up trying to optimize bandwidth utilization by ensuring that every byte of
every message was necessary. Remember, even though images are large, we
still use formats like PNG and JPEG that try to reduce their size. We
aren't completely insensitive to bandwidth requirements.

In every case I've seen, and I'm sure I have not seen them all, there is
duplication between the data in the "Activity" part of a message and the
"Object" part of the same message. If this is universally the case, and
will remain so, it seems that one could write a rule saying that an
Activity's addressing (its "to" information and the like) can be extracted
from the Object whenever the Activity omits it. But things could change,
or perhaps I haven't seen all current uses.
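
To make the idea concrete, here is a minimal Python sketch of such a rule.
It is only an illustration under my own assumptions, not anything taken
from the specs: the function name infer_addressing, the choice of which
fields to copy, and the example URLs are all hypothetical.

```python
# Hypothetical rule: if an Activity omits an addressing field, take it
# from the embedded Object. The field list is an assumption.
ADDRESSING_FIELDS = ("to", "cc", "audience")


def infer_addressing(activity: dict) -> dict:
    """Return a copy of the activity with missing addressing filled in
    from its embedded object. The input is not modified."""
    result = dict(activity)
    obj = activity.get("object")
    if not isinstance(obj, dict):
        # The object is only referenced by IRI, so there is nothing to copy.
        return result
    for field in ADDRESSING_FIELDS:
        if field not in result and field in obj:
            result[field] = obj[field]
    return result


# A "compact" message that states the addressing only once, on the Object.
compact = {
    "type": "Create",
    "actor": "https://example.com/users/alice",
    "object": {
        "type": "Note",
        "content": "Hello",
        "to": ["https://example.org/users/bob"],
    },
}

expanded = infer_addressing(compact)
```

A receiver applying this rule would treat compact and expanded as the same
message. Whether that is safe depends on the cases discussed next.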

I can imagine two obvious cases where the Activity and Object wouldn't
duplicate data or where the duplication would be necessary:

   - If encrypted Objects were permitted, the Activity would need to remain
   in clear text, but the Object itself would be opaque to intermediaries.
   - If Objects could be signed, intermediaries could not modify them
   without breaking the signature. In this case, an Activity might contain
   "bto" and "bcc" recipients that were not included within the enclosed
   Object. In fact, it is probably a good idea to avoid reflecting these
   "must remove" fields in Objects anyway. If they aren't going to be in
   the data which is delivered, why duplicate them?
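
The second case can be sketched as follows, again in Python and again
only as an illustration. I am assuming that blind recipients live solely
on the Activity and that the Object is treated as read-only because it is
signed. The name prepare_delivery and the example URLs are hypothetical.

```python
BLIND_FIELDS = ("bto", "bcc")
VISIBLE_FIELDS = ("to", "cc")


def as_list(value):
    """Addressing values may be a single IRI or a list; normalize to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def prepare_delivery(activity: dict):
    """Collect every recipient, then drop the blind fields from the copy
    that goes on the wire. The embedded object is passed through untouched."""
    recipients = []
    for field in VISIBLE_FIELDS + BLIND_FIELDS:
        for recipient in as_list(activity.get(field)):
            if recipient not in recipients:
                recipients.append(recipient)
    outgoing = {k: v for k, v in activity.items() if k not in BLIND_FIELDS}
    return recipients, outgoing


activity = {
    "type": "Create",
    "actor": "https://example.com/users/alice",
    "to": ["https://example.org/users/bob"],
    "bcc": ["https://example.net/users/carol"],
    "object": {
        "type": "Note",
        "content": "Hello",
        "to": ["https://example.org/users/bob"],
    },
}

recipients, outgoing = prepare_delivery(activity)
```

Here carol still receives the message, but "bcc" never appears in what is
delivered, and the Object did not have to be edited to achieve that. In
this case the Activity carries information that cannot be recovered from
the Object, so the two are not pure duplicates.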

Are there other cases that I missed?

Given that the apparent duplication will undoubtedly raise questions from
many who read the specs, I think we would be well served if those specs
could include some explanation for why and when the duplication is
necessary or useful. A simple statement like "It doesn't matter." is not
really satisfactory. If it really doesn't matter, then it is just sloppy.
If it does matter, then we should have a detailed explanation for why it
matters.

bob wyman


On Thu, Mar 16, 2023 at 5:33 AM Erin Shepherd <erin.shepherd@e43.eu> wrote:

> From experience running a well connected AP server, you could make the AP
> messages 10x larger and the bandwidth overhead would still be negligible.
>
> The dominant factor is, and always has been, images, video and just
> general media.
>
> - Erin
>
>
> On 16 March 2023 03:13:31 CET, Benjamin Goering <ben@bengo.co> wrote:
>>
>> On Mar 15, 2023, at 15:55, Johannes Ernst <johannes.ernst@gmail.com> wrote:
>>>
>>> I just think that if there is an impression “out there” by some people that ActivityPub needs to be evolved "to reduce infrastructure costs” then this group here should be aware of that.
>>>
>>
>> It’s hard to tell for sure whether there is an impression out there like this from this thread because the only evidence you presented was that you heard someone say it.
>> It would indeed be helpful if you invite your acquaintance to share some first-hand report to the list.
>>

Received on Thursday, 16 March 2023 13:54:41 UTC