Re: Portability task force topic - object identifiers across moves from Lisa Dusseault on 2024-04-17 (public-swicg@w3.org from April 2024)

From: Lisa Dusseault <lisa@dtinit.org>
Date: Tue, 16 Apr 2024 17:53:31 -0700
To: aaronngray@gmail.com
Cc: Andy Piper <andypiperuk@gmail.com>, Evan Prodromou <evan@prodromou.name>, Social Web Incubator Community Group <public-swicg@w3.org>
Message-ID: <CAH212UNZ+ZLBneNdMs1nrcauaC+s27V3j2aC4Q9CQ1CmpnruOQ@mail.gmail.com>
I'm going to continue to limit my contribution here to the portability of
Live OnLine Accounts (LOLA portability) because there are millions of
accounts already out there with content and followers, and giving them a
path to move to a different server if they choose to is eminently
solvable.  (Dmitri & I were talking today about building on that to provide
a path for "dead server" accounts to also move, but that's more work so for
me that's tomorrow's problem.)

*ASSUMPTIONS*

0 (the use case).  The user has created a new account on a server, and
after reviewing the features & vibe, they'd like to move their content from
another server where they previously posted.

1.  That actor IDs and names can, and often will, change in this kind of
move; in fact, wanting to change one’s name can be a partial motivation.

2.  That ActivityPub specs don't make many requirements about object
IDs, and that all of these are legit and active samples of approaches to
making object IDs:

  * https://ap1.example.org/2024-03-14_making-a-post
  * https://ap2.example.org/2024/03/14/at-a-conference
  * https://ap3.example.org/@a_user/112129026987774744
  * https://ap4.example.org/item/bbe54933-6748-492c-bed9-a2a15aeba523
  * https://ap5.example.org/@this_user/notes/0b7939b7-e045-49d9-9c27-1cd0cfa23522

3.  That some servers, having one of these systems to generate object
IDs, may have trouble trying to copy object IDs generated by some other
approach.  Servers that rely on UUIDs for uniqueness will find that paths
ending in “2024/04/15/a-post” don’t have the right uniqueness
characteristics for their situation.  Servers that organize object IDs into
meaningful path segments may have trouble fitting pure meaning-free
identifiers into their systems.

*PARTIAL BENEFIT sidebar*

Clearly some servers could conceivably port object IDs.  What if it is easy
to port over object IDs in some cases?  My post /@lisa/2024-03-14/xyz could
be ported to a whole new server and now be /@lisarue/2024/03/14/xyz.  Is
that the "same" object ID?  Or if servers with similar object IDs decide to
map “2024/03/14/at-a-conference” to “2024-03-14_at-a-conference”, those two
approaches being very similar, that seems like it would be fine.  But what
would it gain?  Is there a benefit to copying object IDs when objects are
copied if it can only be done sometimes, or if the result is sometimes not
identical but "inspired by"?  Can this be left up to servers to decide
whether to do or not?

*OPTIONS*

OPTION 1 - A New Consistency
  * We could attempt to enforce a new consistent approach to object IDs.
We would need some pretty strong reasons to do this, because we already
have a diversity of approaches, widely deployed.

OPTION 2 - Enfolding
  * We could attempt to require the new server to use the old object ID as
part of its new object ID.  Since in a new system the old object ID might
not be unique, the requirement would have to allow the new system to add
uniquifying material to the old object ID.
  * E.g. a server with numeric object IDs maps
"https://myap.example.org/2024/02/14/at-a-conference" to
"https://newserver.example.org/@newuser/112129026987774744/2024/02/14/at-a-conference".
The server could probably make that work by dropping the end bit when it
receives a request and focusing on the numeric ID to look up the activity.
  * Conversely, a server running path-like object IDs would map
"https://oldap.example.org/@olduser/0b7939b7-e045-49d9-9c27-1cd0cfa23522" to
"https://newap.example.org/@newuser/2024/2/14/0b7939b7-e045-49d9-9c27-1cd0cfa23522".
  * What benefit would this have over breadcrumbs (see below)?  It seems hard
to reliably extract the old object ID even when it's enfolded in the new
one.
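
A minimal sketch of the numeric-ID enfolding example above, in Python.  The
function names (enfold, local_key_of) and the choice to keep only the old
path are assumptions made for illustration, not part of any spec or
existing server.

```python
from urllib.parse import urlsplit


def enfold(old_id: str, new_base: str, local_key: str) -> str:
    """Build a new object ID: base, then the server's own key, then the old path."""
    old_path = urlsplit(old_id).path.strip("/")
    return f"{new_base.rstrip('/')}/{local_key}/{old_path}"


def local_key_of(new_id: str, new_base: str) -> str:
    """Recover the server's own key: the first path segment after the base."""
    remainder = new_id[len(new_base.rstrip("/")) + 1:]
    return remainder.split("/", 1)[0]


NEW_BASE = "https://newserver.example.org/@newuser"  # illustrative base URL

new_id = enfold(
    "https://myap.example.org/2024/02/14/at-a-conference",
    NEW_BASE,
    "112129026987774744",
)
print(new_id)
print(local_key_of(new_id, NEW_BASE))
```

In this sketch the old host name is not carried into the new ID, so a third
party cannot rebuild the full old object ID from the new one, which is one
way the extraction concern in the last bullet shows up.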

OPTION 3 - Changing IDs
  * We could allow the new server to pick its own object ID following its
own approach.  Along with this…

OPTION 3a) Changing IDs but with breadcrumbs
    * We could ask the new server to remember the old object ID and provide
it as a breadcrumb:  a list field on the new object would contain prior
object IDs. This allows an agent that discovers the object in its new
location to correlate it with the old object.  For example, an agent that
allows Liking a post could discover the new post in the new location
(probably because the agent's user decided to follow the user in their new
location and download their activity data).  Thanks to the breadcrumbs,
the agent can tell that their user already Liked that post in its old
location, and show that in the GUI.
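
A rough sketch of how an agent might use breadcrumbs to recognise an
already-Liked post.  The field name "previousIds" is hypothetical (no such
property is defined by ActivityPub today), and the object shown is a
made-up example.

```python
def already_liked(obj: dict, liked_ids: set) -> bool:
    """True if the object's current ID or any prior ID is among the user's Likes."""
    # "previousIds" is a hypothetical breadcrumb field holding prior object IDs.
    known_ids = {obj["id"], *obj.get("previousIds", [])}
    return not known_ids.isdisjoint(liked_ids)


moved_post = {
    "id": "https://newap.example.org/@newuser/112129026987774744",
    "type": "Note",
    "previousIds": ["https://oldap.example.org/2024/02/14/at-a-conference"],
}

# The agent's record of what its user Liked, keyed by the old location.
liked = {"https://oldap.example.org/2024/02/14/at-a-conference"}

print(already_liked(moved_post, liked))
```

Because the breadcrumb is a list, an object that moves more than once could
keep every prior ID, and the same check would still apply.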

OPTION 3b) Changing IDs with item-by-item forwarding
    * We could ask the origin server to provide redirects for ANY object or
activity specifically to the new object or activity location.  Is this
reasonable? The origin server would have to get a list of these from the
destination server to even be able to host this.
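
As a sketch of what the origin server would need to hold: a table built
from (old ID, new ID) pairs supplied by the destination server, consulted
on each request.  The function names and the use of status 301 are
assumptions; the proposal above does not say which redirect status to use.

```python
def build_redirects(pairs):
    """Index (old_id, new_id) pairs supplied by the destination server."""
    return {old_id: new_id for old_id, new_id in pairs}


def resolve(redirects, requested_id):
    """Return an HTTP-style (status, location) for a request to an old object ID."""
    new_id = redirects.get(requested_id)
    if new_id is None:
        return 404, None  # this object was never moved, or is unknown
    return 301, new_id    # assumed status; a server might choose another


redirects = build_redirects([
    ("https://oldap.example.org/2024/02/14/at-a-conference",
     "https://newap.example.org/@newuser/112129026987774744"),
])

print(resolve(redirects, "https://oldap.example.org/2024/02/14/at-a-conference"))
print(resolve(redirects, "https://oldap.example.org/2024/02/15/other-post"))
```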

OPTION 3c) Changing IDs but with lookup service
    * We could ask the new server to provide a lookup service.  Thus, if an
agent tried to follow an account move from server A to server B, and wanted
to find a new URL for a specific activity, it could use the lookup
service.  To make this work, the new server would need to maintain a map of
old/new object IDs.
    * Benefit of this: when an agent that shows what posts the user Liked
before wants to show a post (e.g. the user is browsing through old Likes,
looking for something they want to forward), the agent could discover that
the post has moved by asking the original server.  Then, going to the new
server, the agent would ask "I'm looking for object ID XYZ, do you have a
new location for it?"  If the new server does indeed have that, the
querying agent has its answer, and the user can continue passing this fun
thing along.
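
A small sketch of the lookup idea from the agent's point of view.  The
LookupService class and its method names are hypothetical; a real service
would be reached over HTTP, and its interface is not defined anywhere yet.

```python
class LookupService:
    """Destination-side map from old object IDs to their new locations."""

    def __init__(self):
        self._old_to_new = {}

    def record_move(self, old_id, new_id):
        """Remember that an imported object used to live at old_id."""
        self._old_to_new[old_id] = new_id

    def find(self, old_id):
        """Return the new location, or None if this server has no record of it."""
        return self._old_to_new.get(old_id)


def current_location(old_id, service):
    """Agent side: use the new location when the service knows one, else the old ID."""
    return service.find(old_id) or old_id


service = LookupService()
service.record_move(
    "https://oldap.example.org/@olduser/0b7939b7-e045-49d9-9c27-1cd0cfa23522",
    "https://newap.example.org/@newuser/2024/2/14/0b7939b7-e045-49d9-9c27-1cd0cfa23522",
)

print(current_location(
    "https://oldap.example.org/@olduser/0b7939b7-e045-49d9-9c27-1cd0cfa23522",
    service,
))
```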

The above options 3a, 3b and 3c are all compatible with each other; they
can be additive rather than alternative, but they all impose some cost.

My biggest questions now that I've laid this out:
* What assumptions are wrong?
* What options did I not think of?

Lisa






On Tue, Apr 16, 2024 at 9:19 AM Aaron Gray <aaronngray@gmail.com> wrote:

> Great, I am flexible timewise.
>
> I would like us to be able to spend some time to at least consider this
> within the wider context of :-
>
> a) Data synchronization for redundant storage across multiple servers.
> b) Data transfer, transportation, and backup, and security thereof
> c) Backend cloud storage systems, and W3C Solid Pods
> d) seeing users data as theirs and as a whole
> e) the ramifications and possible connections of these on LOLA
>
> So can we add these as a semi session to the agenda?  I think they are
> all highly related and need to be at least borne in mind, at most part of
> an integrated interoperable set of standards.
>
> Regards,
>
> Aaron
>
> On Tue, 16 Apr 2024 at 16:05, Lisa Dusseault <lisa@dtinit.org> wrote:
>
>> Friday's great with me -- Evan, as another early replier, can you also make
>> Friday Apr 26 8am PST?
>>
>> On Tue, Apr 16, 2024 at 3:29 AM Andy Piper <andypiperuk@gmail.com> wrote:
>>
>>> I'd prefer the Friday (my Thursday is booked up), but I don't need to be
>>> a blocker to a call.
>>>
>>> On Mon, Apr 15, 2024 at 9:25 PM Evan Prodromou <evan@prodromou.name>
>>> wrote:
>>>
>>>> I can be there!
>>>>
>>>> On 2024-04-15 2:55 p.m., Lisa Dusseault wrote:
>>>> >
>>>> > Hi folks,
>>>> > I've been making some progress on a proposal for moving a "Live
>>>> > On-Line Account" (LOLA) from one server to another server. As such it
>>>> > obviously doesn't meet all use cases, but it seems pretty doable and
>>>> > the use cases it does do seem commonly enough experienced.
>>>> >
>>>> > Shall we meet again next week? I would like to talk through how object
>>>> > identifiers are generated and used by the servers that host content,
>>>> > and how they're used by the other parties that access that content. I
>>>> > believe I've read the appropriate specs (though happy to have things
>>>> > pointed out to me) but there's always facts on the ground beyond the
>>>> > specs which are really useful to know.  I'll share some thoughts on it
>>>> > on this list, but I'd like to know more about what's already done and
>>>> > what are possible approaches.
>>>> >
>>>> > Who'd like to talk through this topic and others relevant to
>>>> > solving portability use cases? Next week, Thursday and Friday am PST
>>>> > are good times for me.  How about Thursday April 25, 8am PST as a
>>>> > proposal - LMK if you can make that time, or if there's a better time
>>>> > for you and you'd like to join.
>>>> >
>>>> > Lisa
>>>>
>>>>
>>>
>>> --
>>> Andy Piper | Kingston upon Thames, London (UK)
>>> links: https://andypiper.me  | fediverse: @andypiper@macaw.social
>>> <https://macaw.social/@andypiper>
>>>
>>
>
> --
> Aaron Gray - @AaronNGray@fosstodon.org
>
> Independent Open Source Software Engineer, Computer Language Researcher,
> Information Theorist, and Computer Scientist.
>
>
Received on Wednesday, 17 April 2024 00:53:47 UTC