LDP model and AtomPub model; more remarks from Wilde, Erik on 2012-11-10 (public-ldp-wg@w3.org from November 2012)

From: Wilde, Erik <Erik.Wilde@emc.com>
Date: Fri, 9 Nov 2012 20:16:14 -0500
To: "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>
Message-ID: <CCC2E45E.B9FE%erik.wilde@emc.com>
hello.

continuing to explain how atom/atompub and the web work, here is a more
targeted attempt at addressing the containment/aggregation debate, again
purely from the atom/atompub model point of view.

on the web, containment does not exist beyond resource boundaries.
resources are interlinked, and maybe some sets of resources are managed by
the same authority and could be manipulated in an atomic way, but you can
never tell from the outside, because URIs are opaque, and even
hierarchical URI schemes cannot guarantee anything.

atompub allows entry resources to be self-contained, because atom allows
entries to embed their content. in such a case, containment is a given
because the entry is one resource, and accepted and subsequently managed
by the atompub server. from the logical point of view, the exact same
information could be conveyed by the entry resource just containing the
entry metadata, and then having a content/@src link that links to the
actual content. the atompub server usually wouldn't care, it only cares
about managing entries, and the fact that in this case, entry content is
linked and not embedded is of no consequence to the server. in fact, in
such a case the entry resource becomes very similar to the "media link
entry" associated with a "media resource", only that in this latter case,
the atompub server actually accepted the content to be POSTed to a
collection, and then created the associated media link entry. as a user
not wanting to use the "media resource" facility of an atompub server, you
could always store your media resource anywhere you like, and then just
POST an entry linking to it as an entry resource. the only difference
would be that in that case, the atompub server would never serve an
"edit-media" link for the entry, because it wouldn't know how the media
resource would have to be edited.
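
to make the two shapes concrete, here is a minimal python sketch (not part of atom or atompub itself) that builds one entry with embedded content and one with a content/@src link, and tells them apart. the helper names make_entry and content_mode and the example.org URI are made up for illustration, and required atom elements such as id, updated and author are left out for brevity.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def make_entry(title, text=None, src=None):
    """Build a stripped-down atom entry (hypothetical helper).

    Either `text` is embedded in atom:content, or `src` is set as
    content/@src and the content element stays empty.
    """
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    content = ET.SubElement(entry, f"{{{ATOM}}}content")
    if src is not None:
        # linked content: the entry only points at the content
        content.set("src", src)
        content.set("type", "text/html")
    else:
        # embedded content: the entry is one self-contained resource
        content.set("type", "text")
        content.text = text
    return entry


def content_mode(entry):
    """Return 'linked', 'embedded', or 'none' for an entry element."""
    content = entry.find(f"{{{ATOM}}}content")
    if content is None:
        return "none"
    return "linked" if content.get("src") else "embedded"


embedded = make_entry("hello", text="the content lives inside the entry")
linked = make_entry("hello", src="http://example.org/elsewhere/hello.html")

print(content_mode(embedded), content_mode(linked))
```

from the server's point of view both are just entries to be managed; only the first one carries its content inside the resource boundary.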

you could envision atompub implementations that would enforce things
either way: they could only accept entries with embedded content (in which
case you might say that they always manage collections that "contain" the
entries), or they could only accept entries with linked content and/or
media resources (in which case you might say that they always manage
collections as aggregations, but media resources probably don't really
qualify here because they are still under the control of the same server).

since the linked content is just a link, in such a scenario the server
only controls the entry (and not the content), and thus the same content
can easily be added to various collections. also, when the content
disappears, the server doesn't know, so then you end up with "orphaned"
entries, but broken links happen in any decentralized scenario.
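
a toy in-memory model of this scenario, as a sketch only: the Entry and Collection classes, the is_orphaned helper and all example.org URIs are assumptions made up here, and a plain set of URIs stands in for "the web". a real server could only discover an orphaned entry by dereferencing the link.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    uri: str          # identity assigned by the collection's server
    content_src: str  # link to content the server does not control


@dataclass
class Collection:
    uri: str
    entries: list = field(default_factory=list)

    def post(self, content_src):
        """Accept an entry that merely links to its content."""
        entry = Entry(uri=f"{self.uri}/{len(self.entries) + 1}",
                      content_src=content_src)
        self.entries.append(entry)
        return entry


def is_orphaned(entry, web):
    """True if the linked content is no longer available on the toy web."""
    return entry.content_src not in web


# the toy "web": the set of URIs that currently resolve
web = {"http://example.org/media/photo.jpg"}

photos = Collection("http://example.org/photos")
favorites = Collection("http://example.net/favorites")

# the same content is added to two independent collections
a = photos.post("http://example.org/media/photo.jpg")
b = favorites.post("http://example.org/media/photo.jpg")

# the content disappears; neither collection is told about it
web.discard("http://example.org/media/photo.jpg")

print(is_orphaned(a, web), is_orphaned(b, web))
```

both entries survive with their own URIs, which is the aggregation behavior described above: the collections never owned the content, only the entries pointing to it.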

identity-wise, things are as follows: when you just embed content, the
entry is added to the collection and gets a URI
(http://tools.ietf.org/html/rfc5023#section-9.2), but the content itself
does not have an identity of its own. in practice, what often happens is
that people embed the content (and some of you might remember the times
when some blogs annoyingly would only publish the first paragraph in their
feed, and then force you to actually go to their sites; luckily, most of
these blogs are gone by now), but also link to the identity on the web
with a rel="alternate" link
(http://tools.ietf.org/html/rfc4287#section-4.2.7.2). in that latter case,
while the content itself does indeed have an identity of its own, it is
copied into the entry, where it again assumes (as this copy) the identity
of the collection entry.
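
the two identities can be read off an entry like this. this is a sketch under assumptions: the sample entry, its URIs and the links_by_rel helper are made up, and the rel="edit" link is used here as a stand-in for the member URI the server assigned. the one rule taken from RFC 4287 is that a link without a rel attribute is interpreted as rel="alternate".

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# hypothetical entry: embedded copy of the content, plus a link to the
# place where the content has its own identity on the web
ENTRY_XML = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>a post</title>
  <link rel="edit" href="http://example.org/blog/entries/1"/>
  <link type="text/html" href="http://example.org/2012/11/a-post.html"/>
  <content type="text">full text copied into the entry</content>
</entry>"""


def links_by_rel(entry):
    """Group link hrefs by relation; a missing rel means 'alternate'."""
    result = {}
    for link in entry.findall(ATOM + "link"):
        rel = link.get("rel", "alternate")
        result.setdefault(rel, []).append(link.get("href"))
    return result


entry = ET.fromstring(ENTRY_XML)
links = links_by_rel(entry)

print("entry identity:  ", links["edit"][0])
print("content identity:", links["alternate"][0])
```

the embedded text has no URI of its own inside the entry; it is reachable only through the entry's URI, while the alternate link points at the original it was copied from.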

maybe this helps a bit to guide our discussions around containment and
aggregation. to me, the most important issue is to accept that the web
itself has no containment outside resource boundaries, and anything else
is a function of servers managing URI spaces, where authority and
management must be clearly communicated through interactions across
well-defined links, and not through any kind of rules based on URI paths.

cheers,

dret.
Received on Saturday, 10 November 2012 01:16:52 UTC