Fwd: [Social-discuss] Protocol / Design Considerations.

Interesting post from Blaine on the GNU Social list...

Dan

---------- Forwarded message ----------
From: Blaine Cook <romeda@gmail.com>
Date: Tue, Mar 16, 2010 at 9:30 PM
Subject: [Social-discuss] Protocol / Design Considerations.
To: social-discuss@nongnu.org


Hi all,

tl;dr: Please skip to the bottom if you're not going to read all of
this, or only read part of it. A summary is included.

I was referred to this list by Micah Anderson, and I'm very glad to
see the work happening here. It's important, and a diversity of groups
tackling the problem is important for everyone. I wish I could make it
to Libre Planet this week, but I live in Northern Ireland, and my
schedule wasn't conducive to a last-minute 4-day trip. I hope this
email will serve instead.

First, by way of introduction, I'm currently employed by BT, working
on exactly this problem. It's worth emphasizing in this context that
my mandate is to produce open source software, and that I'm not
developing internal software for any of BT's business units. More
notably, however, I was the lead developer at Twitter in the early
days. There, I strongly pushed for an open (read: decentralized /
federated) approach to Twitter's network, on the assumption that
(especially in the long run) open networks like status.net (which
didn't exist at the time) are better for everyone, businesses,
non-profit organizations, and individuals alike. To that end, I built
into Twitter features that made it possible to interoperate with
Jaiku, and Ralph Meijer (then at Jaiku) did the same. Unfortunately,
those features never launched, and here we are three years later,
still without a major vendor pushing an open-access network. Well,
except for Google. ;-)

Since this project is about code and standards, and about moving
towards a workable environment that supports the sort of social
networks that we've seen emerge over the past decade, I'd like to
offer some observations and recommendations regarding technology
choices at the outset. Some of these are obvious, some are not, all
are hopefully useful.

Whatever technology / technologies are chosen, they must fulfill the
following pre-requisites:

1. Decentralized.

From telephone switches, to email and the web, every example of a
communications technology which has achieved *universal* adoption is
decentralized, with global routing enabling numerous operators to
provide overlapping service to any segment of the population while
cooperating with other operators. This shouldn't be controversial,
since this is the point of this group. ;-)

2. Addressable.

You can't have the first requirement without this, so in some sense
this is a corollary to #1, but important enough to stand alone.
Herewith a controversial (but carefully considered) argument:

HTTP URLs don't work as addresses *for people*.
Email-style URIs do work as addresses *for people*.

Therefore, the addressing system should use email-style addressing,
cf. webfinger (see http://webfinger.org or
http://code.google.com/p/webfinger ). I'll leave that for now, and
hopefully we can productively debate the point.

Addressing must be verifiable - that is, if one party sends a message
from a given address, the recipient must be able to verify that the
address matches the identity; the strength of that verification is
subject to limits, of course, but the ability must be present (note
that this is something that in the general case HTTP does not do
alone).

Addressing must also be symmetric, by which I mean that a single
address format and instance must convey both source and destination
information. Email, postal addressing, and phone numbers all satisfy
these requirements.
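
As a rough illustration of the addressing point, here is a minimal
Python sketch that splits an email-style address into user and host and
builds a discovery URL from it. The function names are hypothetical. The
/.well-known/webfinger endpoint and "resource" parameter follow the
later RFC 7033 form of Webfinger, which postdates this message, so treat
the URL shape as an assumption rather than what was deployed in 2010.

```python
from urllib.parse import urlencode


def split_address(address):
    """Split an email-style address into (user, host)."""
    # Accept an optional "acct:" scheme prefix.
    if address.startswith("acct:"):
        address = address[len("acct:"):]
    user, sep, host = address.rpartition("@")
    if not sep or not user or not host:
        raise ValueError("not an email-style address: %r" % address)
    # Host names are case-insensitive; the user part is left untouched.
    return user, host.lower()


def discovery_url(address):
    """Build the URL at which the address's own host can be asked about it.

    Assumption: RFC 7033 style endpoint, not the 2010-era host-meta flow.
    """
    user, host = split_address(address)
    query = urlencode({"resource": "acct:%s@%s" % (user, host)})
    return "https://%s/.well-known/webfinger?%s" % (host, query)
```

The same string serves as both source and destination: the host part
says where to deliver a message and where to look up its sender, which
is the symmetry described above.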

3. Data Agnostic.

We cannot predict the data formats of the future (we can't even
properly manage the data formats of the present!). Transport
mechanisms must be data agnostic. In my view, this is where protocols
like OStatus fall short, since despite being important
stepping-stones, they are inherently limited to a single data type.
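
One way to picture a data-agnostic transport is an envelope that carries
a content type next to an opaque body, which the gateway never parses. A
minimal Python sketch follows; the Envelope class, its field names, and
the relay function are all hypothetical and are not taken from any of
the protocols mentioned here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    sender: str        # email-style address of the source
    recipient: str     # email-style address of the destination
    content_type: str  # label for the payload, e.g. a MIME type
    body: bytes        # opaque payload; the transport never parses it


def relay(envelope, outbox):
    """Queue an envelope for delivery without inspecting its body."""
    if not isinstance(envelope.body, bytes):
        raise TypeError("body must be bytes")
    outbox.append(envelope)
    return len(outbox)
```

An Atom entry and a JPEG would pass through relay() identically; only
the applications at each end need to understand content_type.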

4. Privacy-preserving.

While I've often pondered the changing (or is it?) nature of privacy,
having created one of the largest privacy-destroying tools ever, it's
still true that privacy is of immense importance. Every successful
social network site sees a very significant amount of private or
non-public usage, including Twitter (through direct messages and
private accounts). Any decentralized protocol or system that we
embrace must also be able to provide private messaging features. It's
my contention that Atom and feed readers have failed as widespread
social technologies because there is no simple way to securely
provision access to private feeds (i.e., capability URLs or "casual
privacy" don't cut it in this context for a number of reasons).

5. Low-latency.

Email has long been a low-latency communications medium, and virtually
every successful social network is predicated on low-latency sharing
of information. Whatever decentralized protocols and technologies are
adopted, they must support this form of communication. It's worth
noting that HTTP polling does not have this property; even Twitter is
a fundamentally different experience now than it was when many users
had XMPP support enabled (1 minute != 0.5 seconds).
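
The arithmetic behind the polling comparison can be sketched in a few
lines of Python. The function name is hypothetical, and the model
assumes an update becomes available at a uniformly random moment
between two polls and ignores network and processing time.

```python
def polling_delay(interval_seconds):
    """Return (average, worst-case) delay added by fixed-interval polling.

    Assumes updates appear at a uniformly random point in the interval.
    """
    return interval_seconds / 2.0, float(interval_seconds)


# Polling once a minute, as in the comparison above.
average, worst = polling_delay(60)
```

Under these assumptions a one-minute poll adds 30 seconds of delay on
average and up to 60 seconds in the worst case, against roughly half a
second for a pushed update.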

6. Approachable.

A bonus requirement, but of no lesser importance. The technologies in
question must be approachable by both users and developers in order to
gain any adoption. I'd argue that for users to adopt the technology,
it must be indistinguishable from magic; that is, it must operate as
closely to how existing social networks operate as possible. What will
drive developers to adopt something is a much thornier question. In
the end, it may be marketing that wins, rather than strict suitability
or "goodness", but my hope is that the most adaptable and readily
implemented technology will win.

I built the Twitter <-> Jaiku federation on top of a simplified
profile of XMPP XEP-0060. I love XMPP, and think it has all of the above
properties in spades. Except, perhaps, the last one; developer
adoption has been incredibly difficult to obtain for XMPP, and it's
unclear whether developers will ever embrace it. To that end, my
current "best-practice" stack is limited to PubSubHubbub (which
closely resembles my simplified profile of XEP-0060) and Webfinger (for
addressing). I would strongly recommend that this group adopt those
two technologies in particular. For what it's worth, they form the
basis of the OStatus protocol spoken by status.net installs, and form
the basis of Google Buzz's API.
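
For readers who have not seen PubSubHubbub, the subscriber side can be
sketched in Python as below. The hub.mode, hub.topic, hub.callback and
hub.challenge parameter names come from the PubSubHubbub spec; the
function names and example URLs are hypothetical, and revisions of the
spec differ on further parameters (0.3, for example, also expects
hub.verify), so this is a sketch rather than a conforming client.

```python
from urllib.parse import parse_qs, urlencode


def subscription_body(topic_url, callback_url):
    """Form-encoded body a subscriber would POST to the hub."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed being subscribed to
        "hub.callback": callback_url,  # where the hub pushes new entries
    })


def verify_response(query_string, expected_topic):
    """Answer the hub's verification GET on the callback URL.

    Echo hub.challenge with a 200 if we did ask for this topic;
    otherwise refuse with a 404.
    """
    params = parse_qs(query_string)
    mode = params.get("hub.mode", [""])[0]
    topic = params.get("hub.topic", [""])[0]
    challenge = params.get("hub.challenge", [""])[0]
    if mode == "subscribe" and topic == expected_topic and challenge:
        return 200, challenge
    return 404, ""
```

Once the subscription is verified, the hub POSTs new content to the
callback as it is published, which is what removes the polling delay.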

Atom is probably the most important data format to get started. Right
now, PubSubHubbub only specifies how to syndicate Atom data, but I
have it on good authority that arbitrary data types are coming in the
next revision of the spec. Other important emerging standards are
Activity Streams and the Salmon Protocol, but I'd argue that it's best
for end-user application developers to tackle those, rather than
gateway developers.

On that note, is the goal of this project to develop end-user facing
platforms, or gateways (like postfix or apache, versus wordpress or
flickr)?

Thanks for reading this far. This is a brief introduction to my
thoughts on the subject, and I haven't yet had a chance to properly
read through the list archives (though based on a preliminary
skimming, I would caution that approaches like FOAF are not well
aligned to existing approaches to social networks, emphasizing the
representation of networks, rather than facilitating rich
communication).

I (and many others in the PSHB, XMPP, Salmon, Webfinger, Atom,
OStatus, and OAuth communities, to name a few) have ideas on how the
privacy preserving aspects of these decentralized networks can work,
but (unlike the core PSHB protocol and Webfinger) they're not
implemented and in production anywhere yet, so I'll leave them for
another thread.

tl;dr:

I care about this stuff, I've been thinking about it a long time, and
this is what I think this group should focus on:

- PubSubHubbub + Webfinger with upgradability to XMPP (latter is optional).
- Simple email- and http-like servers acting as gateways (cf.
postfix, apache).
- Security is a grey area so far; good options exist, but further
discussion is needed.

- Core data format is Atom to start, with applications (i.e., *not
gateways*) focusing on adding support for Salmon and Activity Streams
data formats.
- Microformats and RDF* will continue to battle things out, with
de-facto winners being declared on a genre-by-genre basis (e.g.,
location, photos, etc.).

I'd love to learn more about this group, and see if we can build
towards its goals in a way that incorporates the corporations that are
leading the pack, as well as empowering the individuals and smaller
organizations that have thus far been shut out of the process. I will
not be at all put out if you tear this to shreds; it's early days, and
we all have a lot to learn and build.

cheers,

b.

Received on Wednesday, 17 March 2010 13:07:31 UTC