LDP model and AtomPub model; an attempt from Wilde, Erik on 2012-11-09 (public-ldp-wg@w3.org from November 2012)

From: Wilde, Erik <Erik.Wilde@emc.com>
Date: Thu, 8 Nov 2012 20:28:37 -0500
To: "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>
Message-ID: <CCC19C45.BC8F%erik.wilde@emc.com>
hello all.

since we are still going through discussions about what the actual model
of LDP is, and seem to agree that we should have a clearer description of
the model (whatever we end up with as the final model), here is one
attempt to start a discussion. this is my brief attempt to summarize the
atompub model, without going into any of the syntax details. you could
also see this as an attempt to create some structure for describing such a
model, which might help us to have a more structured discussion about what
we want to have for these different aspects. this is a quick and dirty
exercise and definitely not an attempt to be a 100% accurate reference
model for atompub.


- service basics: any atompub server exposes a "service document"
(http://tools.ietf.org/html/rfc5023#section-8) that describes what the
server is exposing. each service document lists a set of "workspaces"
(http://tools.ietf.org/html/rfc5023#section-8.1), which are just a
grouping construct for "collections". workspaces have no interaction
semantics in atompub, there is no protocol for creating or deleting them;
they just exist. each workspace lists a set of "collections"
(http://tools.ietf.org/html/rfc5023#section-10), which is by far the most
central construct in atompub. a collection can be listed in more than one
workspace.
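
to make the service document / workspace / collection nesting concrete, here is a minimal sketch that reads a service document and lists the collections per workspace. the sample document, its titles, and its URIs are made up for illustration, and `list_collections` is a hypothetical helper name, not something atompub defines.

```python
import xml.etree.ElementTree as ET

APP = "http://www.w3.org/2007/app"    # atompub namespace
ATOM = "http://www.w3.org/2005/Atom"  # atom namespace

# made-up service document; titles and URIs are illustrative only
SERVICE_DOC = """<service xmlns="http://www.w3.org/2007/app"
         xmlns:atom="http://www.w3.org/2005/Atom">
  <workspace>
    <atom:title>main site</atom:title>
    <collection href="http://example.org/blog/main">
      <atom:title>entries</atom:title>
    </collection>
    <collection href="http://example.org/blog/pics">
      <atom:title>pictures</atom:title>
      <accept>image/png</accept>
    </collection>
  </workspace>
</service>"""


def list_collections(service_xml):
    """Map each workspace title to a list of (collection title, collection URI)."""
    root = ET.fromstring(service_xml)
    result = {}
    for workspace in root.findall(f"{{{APP}}}workspace"):
        title = workspace.findtext(f"{{{ATOM}}}title")
        result[title] = [
            (coll.findtext(f"{{{ATOM}}}title"), coll.get("href"))
            for coll in workspace.findall(f"{{{APP}}}collection")
        ]
    return result


if __name__ == "__main__":
    for workspace, collections in list_collections(SERVICE_DOC).items():
        print(workspace, collections)
```

note that the workspace is only a key in the result: there is nothing to interact with at that level, which matches the "they just exist" point above.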

- collection basics: a collection has a URI where you can start
interacting with the collection. at the most basic level you can GET a
list of members (http://tools.ietf.org/html/rfc5023#section-10), and very
likely this is somehow paginated
(http://tools.ietf.org/html/rfc5023#section-10.1). through its listing in
a service document, the collection exposes some interaction information,
such as which kind of mediatypes it will accept
(http://tools.ietf.org/html/rfc5023#section-8.3.4), and how members of the
collection might use categories
(http://tools.ietf.org/html/rfc5023#section-8.3.6) to be classified.

- creating collection members: any kind of resource can be listed in a
collection (http://tools.ietf.org/html/rfc5023#section-4.2), and
everything that is in a collection is represented by an entry. entries
follow the standardized metadata model defined by atom
(http://tools.ietf.org/html/rfc4287), but atompub distinguishes two kinds
of entries. if a client POSTs an "entry resource" (an atom entry following
atom's metadata model), the server pretty much takes this entry resource
and starts listing it in the collection as a member. if a client POSTs
a "media resource" (pretty much anything that's not an entry by
itself, often something like an image media type), which has to be in one
of the accepted media types of the collection, the server accepts this
media resource and creates a "media link entry", which will then represent
this media resource when you list the collection contents. servers might
try to be smart about populating metadata fields in the entry and for
example look at exif data to populate certain fields. the interesting
aspect of this setup is that you POST one thing, and create two resources
(http://tools.ietf.org/html/rfc5023#section-9.6), and big media files
might for example get added to a CDN and get a CDN URI, whereas the entry
gets some URI under the control of the atompub server.
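
the "POST one thing, create two resources" behavior can be sketched with a toy in-memory collection. this is only a model of the idea under simplifying assumptions: the `Collection` class, the URI layout, and the exact-string media type matching are all invented for the example, and a real server would do proper media type matching and mint its own URIs.

```python
import itertools

ATOM_ENTRY = "application/atom+xml;type=entry"


class Collection:
    """Toy model of a collection (hypothetical, not an atompub API)."""

    def __init__(self, uri, accept=(ATOM_ENTRY,)):
        self.uri = uri
        self.accept = set(accept)  # simplified: exact string match only
        self.entries = {}          # entry URI -> metadata dict
        self.media = {}            # media URI -> raw bytes
        self._counter = itertools.count(1)

    def post(self, content_type, body, slug="untitled"):
        """Return the URIs of the resources created by one POST."""
        if content_type not in self.accept:
            raise ValueError(f"unsupported media type: {content_type}")
        n = next(self._counter)
        entry_uri = f"{self.uri}/entry/{n}"
        if content_type == ATOM_ENTRY:
            # entry resource: the posted entry itself becomes the member
            self.entries[entry_uri] = dict(body)
            return [entry_uri]
        # media resource: store it, then create a media link entry for it
        media_uri = f"{self.uri}/media/{n}"
        self.media[media_uri] = body
        self.entries[entry_uri] = {
            "title": slug,
            "content_src": media_uri,
            "content_type": content_type,
        }
        return [entry_uri, media_uri]
```

posting an entry yields one URI, posting an image yields two, and the media link entry points at the media resource through `content_src`, which in this sketch could just as well be a CDN URI.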

- member identity: a collection is a representation of members, and thus
contains a list of entries representing each individual member. according
to the atom model, each entry MUST have an <id>
(http://tools.ietf.org/html/rfc4287#section-4.2.6), which is a URI but has
no interaction semantics (specifically, best practice suggests that
minting URIs that are not actionable might be a good idea:
http://web.archive.org/web/20110514113830/http://diveintomark.org/archives/2004/05/28/howto-atom-id).
entries may have embedded content, or may link
to the content they are representing
(http://tools.ietf.org/html/rfc4287#section-4.1.3). identity is
established by the entry <id>, and this is particularly important in
scenarios where collections may be aggregated and filtered and repurposed:
entry identity must always be visible in the <id>, and thus identity can
be tracked across paths where entries may get repurposed in various
collections.
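
as a small illustration of why a stable <id> matters for aggregation, here is a sketch that merges entries from several collections and keeps one copy per <id>, preferring the most recently updated one. entries are plain dicts here, `merge_by_id` is a hypothetical helper, and comparing the `updated` values as strings assumes they are all RFC 3339 timestamps in UTC with the same precision.

```python
def merge_by_id(*collections):
    """Merge entry lists, keeping the newest copy of each entry id."""
    merged = {}
    for entries in collections:
        for entry in entries:
            known = merged.get(entry["id"])
            # assumption: "updated" strings are comparable (same zone/format)
            if known is None or entry["updated"] > known["updated"]:
                merged[entry["id"]] = entry
    return merged


# made-up sample data: the same entry shows up in two collections
blog = [
    {"id": "urn:example:a", "updated": "2012-11-01T10:00:00Z", "title": "draft"},
]
planet = [
    {"id": "urn:example:a", "updated": "2012-11-05T09:00:00Z", "title": "final"},
    {"id": "urn:example:b", "updated": "2012-11-02T00:00:00Z", "title": "other"},
]

if __name__ == "__main__":
    for entry_id, entry in sorted(merge_by_id(blog, planet).items()):
        print(entry_id, entry["title"])
```

the merge never looks at where an entry was fetched from, only at its <id>, which is the property that lets identity survive repurposing.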

- interacting with members: members are listed when GETting a collection,
and their identity and metadata about them is exposed through regular atom
mechanisms. if members are editable, an "edit" link in the entry
(http://tools.ietf.org/html/rfc5023#section-11.1) will allow clients to
update the member entry, by using this link to PUT or DELETE the entry
resource. if the entry is a media link entry, then there might be an
"edit-media" link in the entry
(http://tools.ietf.org/html/rfc5023#section-11), which will allow clients
to update the media resource, by using this link to PUT or DELETE the
media resource. this model allows clients to both interact with a media
resource's metadata (the "media link entry"), and the media resource
itself.
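
a minimal sketch of how a client might discover where to send its PUT or DELETE requests: read the "edit" and "edit-media" links out of an entry. the sample entry is made up and trimmed (a complete atom entry carries more metadata), and `edit_targets` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# made-up media link entry; all URIs are illustrative only
MEDIA_LINK_ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>urn:uuid:1225c695-cfb8-4ebb-aaaa-80da344efa6a</id>
  <title>cat</title>
  <updated>2012-11-08T20:28:37Z</updated>
  <content type="image/png" src="http://cdn.example.com/cat.png"/>
  <link rel="edit" href="http://example.org/pics/entry/2"/>
  <link rel="edit-media" href="http://example.org/pics/media/2"/>
</entry>"""


def edit_targets(entry_xml):
    """Return {"edit": uri, "edit-media": uri} for the links that are present."""
    entry = ET.fromstring(entry_xml)
    targets = {}
    for link in entry.findall(ATOM + "link"):
        rel = link.get("rel", "alternate")  # atom's default link relation
        if rel in ("edit", "edit-media"):
            # keep the first link of each kind; later ones are ignored here
            targets.setdefault(rel, link.get("href"))
    return targets


if __name__ == "__main__":
    print(edit_targets(MEDIA_LINK_ENTRY))
```

an entry without any "edit" link simply yields an empty dict, which a client would read as "not editable through this entry".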


- interacting with collections: like workspaces, collections just exist,
and atompub does not define how to create or delete them (i am currently
working on a small addition to the spec that addresses that). also,
collections have no structure, they have a URI and accept entries. this
means there is no hierarchy to collections, it's a flat space.


ok, that's it. this leaves out some details (such as most of the category
stuff), but should give you some idea what the data and interaction model
is. i'll be more than happy to fill you in with any level of detail, if
you're interested. i think we need to come up with something exactly like
this: describing basic concepts, their models, how all of these relate,
and how you can interact with a server exposing these to access and manage
instances of those concepts. i have left out pretty much all XML and HTTP
details, but these really are irrelevant at this level.

personally, i think that a lot of the terminology chosen in atompub is
unfortunate, illogical, and hard to remember, but that's beside the
point. just accept the terms as they are for now, and i'd be more than
happy to come up with better terminology.

cheers,

dret.
Received on Friday, 9 November 2012 01:29:26 UTC