Definition of Linked Data Platform as a finite state machine from Martynas Jusevicius on 2012-11-10 (public-ldp@w3.org from November 2012)

From: Martynas Jusevicius <martynas@graphity.org>
Date: Sat, 10 Nov 2012 13:12:43 +0200
To: public-ldp@w3.org
Message-ID: <CAE35VmxfybAdV1WcdYw=hThmW1SQDGRZ9SXa_MR6zZM4hGaeHw@mail.gmail.com>
Hey all,

I see some heated debates on the WG list about vocabularies, media
types, containers, base/relative URIs, URI templates, container
aggregation/composition and how these pieces combine in the LDP
puzzle.
I'd like to show a practical and consistent solution that I came up
with while building the Graphity LDP implementation. It stems directly
from source code, and taking the current LDP specification into
account was not a priority, so please bear with me.

To start with, let's define that a Linked Data Platform consists of:
- application state in the form of an RDF ontology
- specification of vocabularies that define LDP state
- specification on how HTTP methods change the state
- rules on how to make LDP state consistent after changes

The initial platform state can serve as a configuration ontology. Let's say
it is the following Bug ontology:

@base <http://localhost/> .

@prefix : <ontology#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix sioc: <http://rdfs.org/sioc/ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix g: <http://graphity.org/ontology/> .
@prefix lda: <http://purl.org/linked-data/api/vocab#> .

# LDP METADATA (SITEMAP ONTOLOGY)

# Containers

<bugs> a sioc:Container ;
	g:memberClass :BugResource .

<bugs/critical> a sioc:Container ;
	sioc:has_parent <bugs> ;
	g:memberClass :CriticalBug .

# Resources

<bugs/bug1> a :BugResource ;
	sioc:has_parent <bugs> ;
	foaf:primaryTopic :Bug1 .

<bugs/critical/bug2> a :CriticalBugResource ;
	sioc:has_parent <bugs/critical> ;
	foaf:primaryTopic :Bug2 .

# Classes

:BugResource a owl:Class ;
	rdfs:subClassOf [ a owl:Restriction ;
		owl:onProperty foaf:primaryTopic ;
		owl:allValuesFrom :Bug ] ,
		[ a owl:Restriction ;
		owl:onProperty sioc:has_parent ;
		owl:hasValue <bugs> ] ;
	lda:uriTemplate "/bugs/{id}" .

:CriticalBugResource a owl:Class ;
	rdfs:subClassOf [ a owl:Restriction ;
		owl:onProperty foaf:primaryTopic ;
		owl:allValuesFrom :CriticalBug ] ,
		[ a owl:Restriction ;
		owl:onProperty sioc:has_parent ;
		owl:hasValue <bugs/critical> ] ;
	lda:uriTemplate "/bugs/critical/{id}" .

# RDF CONTENT

:Bug a owl:Class .
:CriticalBug rdfs:subClassOf :Bug .

:Bug1 a :Bug .
:Bug2 a :CriticalBug .

You can see the ontology contains domain-specific RDF content (2 bug
classes, 2 bug instances) along with LDP-specific metadata (2
corresponding LDP containers, 2 LDP resource classes, and 2 LDP
resource instances). For practical purposes those two kinds of
data can be separated and stored in different locations, e.g.
memory and triplestore. We could define mechanisms on how to access
each of them separately, but in most cases it makes sense to combine
them into a single representation.

A combination of existing OWL, SIOC, FOAF, and Linked Data API
properties is enough to define the current state of LDP-specific and
domain-specific resources as well as restrictions on any new LDP
state. Containers are more naturally defined by the class of their
members than with member properties. One of them serves instances of
an LDP class (BugResource), the other one instances of a domain class
(CriticalBug) -- this is made different on purpose.
The one missing piece of the puzzle is a property specifying the class
of container members, so I added this as g:memberClass.

Our LDP instance at the http://localhost/ base URI serves descriptions
of itself. In other words, it returns descriptions of ontology
resources that match the Request-URI, e.g. as the result of a DESCRIBE
query.
Let's define a few use cases on how HTTP interacts with the state, as
request/response pairs with message body. They assume that no
inference is used and the returned message body (representation)
contains both the LDP-specific metadata and domain-specific RDF. All
localhost URIs are used in absolute form, other prefixes are omitted.

1.a Retrieving container and its members (LDP BugResource instances)

GET /bugs HTTP/1.1
Host: localhost

HTTP/1.1 200 OK

<http://localhost/bugs> a sioc:Container ;
	g:memberClass <http://localhost/ontology#BugResource> .

<http://localhost/bugs/bug1> a <http://localhost/ontology#BugResource> ;
	sioc:has_parent <http://localhost/bugs> ;
	foaf:primaryTopic <http://localhost/ontology#Bug1> .

1.b Retrieving container and its members (domain CriticalBug instances)

GET /bugs/critical HTTP/1.1
Host: localhost

HTTP/1.1 200 OK

<http://localhost/bugs/critical> a sioc:Container ;
	sioc:has_parent <http://localhost/bugs> ;
	g:memberClass <http://localhost/ontology#CriticalBug> .

<http://localhost/bugs/critical/bug2#bug> a
<http://localhost/ontology#CriticalBug> .

I added this example because it might not be practical to store all
:CriticalBugResource (and similar) instances. The container can
instead be serving :CriticalBug directly as its g:memberClass, but
then the URIs have to be designed accordingly (e.g. using # hashes as
in this case).

2. Retrieving LDP resource (BugResource instance)

GET /bugs/bug1 HTTP/1.1
Host: localhost

HTTP/1.1 200 OK

<http://localhost/bugs/bug1> a <http://localhost/ontology#BugResource> ;
	sioc:has_parent <http://localhost/bugs> ;
	foaf:primaryTopic <http://localhost/ontology#Bug1> .

<http://localhost/ontology#Bug1> a <http://localhost/ontology#Bug> .
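The retrieval cases 1.a and 2 can be approximated in a few lines of Python. This is only a sketch under assumptions, not the Graphity code: the state is a plain set of (subject, predicate, object) tuples, URIs are abbreviated to short strings, and the function name `describe` is hypothetical. As in the use cases, no inference is applied.

```python
# Hypothetical in-memory state: a set of (subject, predicate, object) tuples.
STATE = {
    ("/bugs", "rdf:type", "sioc:Container"),
    ("/bugs", "g:memberClass", ":BugResource"),
    ("/bugs/bug1", "rdf:type", ":BugResource"),
    ("/bugs/bug1", "sioc:has_parent", "/bugs"),
    ("/bugs/bug1", "foaf:primaryTopic", ":Bug1"),
    (":Bug1", "rdf:type", ":Bug"),
}

def describe(state, uri):
    """Return the triples a GET on `uri` would serve in this sketch."""
    # Description of the requested resource itself.
    result = {t for t in state if t[0] == uri}
    # A container additionally describes the instances of its g:memberClass.
    member_classes = {o for s, p, o in result if p == "g:memberClass"}
    members = {s for s, p, o in state
               if p == "rdf:type" and o in member_classes}
    # A plain LDP resource additionally describes its primary topic.
    topics = {o for s, p, o in result if p == "foaf:primaryTopic"}
    related = members | topics
    return result | {t for t in state if t[0] in related}
```

Calling `describe(STATE, "/bugs")` gives the container plus the description of `/bugs/bug1`, and `describe(STATE, "/bugs/bug1")` gives the LDP resource plus the type statement of `:Bug1`.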

Now comes the interesting part -- the HTTP methods that change the
application state. The method semantics are as usual: POST adds
resource with server-defined URI, PUT adds resource with
client-defined URI. I intentionally wrote the request examples so
that, if added directly to the current ontology state, they would make
it "LDP-inconsistent". This gives a chance to define the possible LDP
consistency rules (i.e. that in general domain instances must have
corresponding LDP instances).

3. Creating new Bug (with user-defined URI)

PUT /bugs/bug3 HTTP/1.1
Host: localhost

<http://localhost/bugs/bug3#bug> a <http://localhost/ontology#Bug> .

By looking at the initial state we see that it contained LDP metadata
statements related to each bug, which are missing in the request. The
LDP workflow is as follows:
3.1. get the class of the posted resource (:Bug)
3.2. find restrictions on class :Bug (owl:allValuesFrom :Bug)
3.3. get the class for which the restriction is defined (:BugResource)
3.4. create an LDP resource instance of that class using Request-URI
(/bugs/bug3)
3.5. add a statement relating the created LDP resource to the posted
resource using foaf:primaryTopic
3.6. add a statement relating the created LDP resource to its
container (<bugs> defined in another restriction) using sioc:has_parent

Using these rules, the LDP infers 3 additional statements that are
added to the state ontology, and combined with the request body are
returned as response:

HTTP/1.1 201 Created

<http://localhost/bugs/bug3#bug> a <http://localhost/ontology#Bug> .

<http://localhost/bugs/bug3> a <http://localhost/ontology#BugResource> ;
	foaf:primaryTopic <http://localhost/bugs/bug3#bug> ;
	sioc:has_parent <http://localhost/bugs> .

If the last 3 statements had been included in the request payload,
there would be nothing to infer, since the state would already be
LDP-consistent.
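Steps 3.1 to 3.6 can be sketched in Python as follows. This is an illustration under assumptions rather than the actual implementation: triples are plain tuples with abbreviated URIs, the `RESOURCE_CLASSES` table is a hypothetical, hand-flattened form of the two owl:Restriction blocks in the ontology, and the container link uses sioc:has_parent, the property named in the owl:hasValue restriction.

```python
# Hypothetical flattening of the restrictions on each LDP resource class:
# "topic_class" comes from owl:allValuesFrom, "container" from owl:hasValue.
RESOURCE_CLASSES = {
    ":BugResource": {"topic_class": ":Bug", "container": "/bugs"},
    ":CriticalBugResource": {"topic_class": ":CriticalBug",
                             "container": "/bugs/critical"},
}

def put(state, request_uri, body):
    """Apply steps 3.1-3.6 to a body holding one typed domain resource."""
    # 3.1: get the class of the submitted resource
    # (raises StopIteration if the body has no rdf:type statement).
    topic, topic_class = next((s, o) for s, p, o in body if p == "rdf:type")
    # 3.2 / 3.3: find the LDP class whose restriction names that class.
    ldp_class, meta = next((c, m) for c, m in RESOURCE_CLASSES.items()
                           if m["topic_class"] == topic_class)
    inferred = {
        (request_uri, "rdf:type", ldp_class),                 # 3.4
        (request_uri, "foaf:primaryTopic", topic),            # 3.5
        (request_uri, "sioc:has_parent", meta["container"]),  # 3.6
    }
    # Set union: statements already present in the body are not duplicated,
    # so an already LDP-consistent request adds nothing new.
    response = set(body) | inferred
    state |= response
    return response
```

With an empty state, `put(state, "/bugs/bug3", {("/bugs/bug3#bug", "rdf:type", ":Bug")})` returns the request statement plus the three inferred ones.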

4. Appending new CriticalBug (with container-defined URI)

POST /bugs/critical HTTP/1.1
Host: localhost

<http://localhost/ontology#Bug4> a <http://localhost/ontology#CriticalBug> .

Again, by looking at the initial LDP state we see that this request
would make it inconsistent. We can use similar rules to fix it (the
same rules as in section 3 are not repeated here, only arguments given
for each step):
3.1 (:CriticalBug)
3.2 (owl:allValuesFrom :CriticalBug)
3.3 (:CriticalBugResource)
4.4. get the URI template of that class's instances ("/bugs/critical/{id}")
4.5. create an LDP resource instance of that class using that URI
template with unique ID (/bugs/critical/123)
3.5.
3.6.

This yields the following response (and state):

HTTP/1.1 200 OK

<http://localhost/ontology#Bug4> a <http://localhost/ontology#CriticalBug> .

<http://localhost/bugs/critical/123> a
<http://localhost/ontology#CriticalBugResource> ;
	foaf:primaryTopic <http://localhost/ontology#Bug4> ;
	sioc:has_parent <http://localhost/bugs/critical> .

5. Appending new CriticalBug to the wrong container

POST /bugs HTTP/1.1
Host: localhost

<http://localhost/ontology#Bug5> a <http://localhost/ontology#CriticalBug> .

Even if this is not the container CriticalBug is supposed to be
appended to, following the rules from point 4 we get the correct state
anyway:

HTTP/1.1 200 OK

<http://localhost/ontology#Bug5> a <http://localhost/ontology#CriticalBug> .

<http://localhost/bugs/critical/345> a
<http://localhost/ontology#CriticalBugResource> ;
	foaf:primaryTopic <http://localhost/ontology#Bug5> ;
	sioc:has_parent <http://localhost/bugs/critical> .
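The POST cases 4 and 5 differ from PUT only in how the URI is obtained. A minimal Python sketch, under the following assumptions: triples are plain tuples with abbreviated URIs, the `SITEMAP` table is a hypothetical flattening of the ontology, the unique ID is a random UUID (the ontology does not say how IDs are generated), and the container link uses sioc:has_parent as in the owl:hasValue restriction.

```python
import uuid

# Hypothetical lookup derived from the ontology: domain class ->
# (LDP resource class, lda:uriTemplate, container from owl:hasValue).
SITEMAP = {
    ":Bug": (":BugResource", "/bugs/{id}", "/bugs"),
    ":CriticalBug": (":CriticalBugResource", "/bugs/critical/{id}",
                     "/bugs/critical"),
}

def post(state, request_uri, body, new_id=None):
    """Apply steps 3.1-3.3, 4.4, 4.5, 3.5 and 3.6 to a POSTed body."""
    # 3.1: class of the posted resource.
    topic, topic_class = next((s, o) for s, p, o in body if p == "rdf:type")
    # 3.2 / 3.3: matching LDP class, with its template and container.
    ldp_class, template, container = SITEMAP[topic_class]
    # 4.4 / 4.5: the URI is minted from the class template. request_uri is
    # deliberately unused, which is why posting to the "wrong" container
    # still yields the correct state.
    uri = template.replace("{id}", new_id or uuid.uuid4().hex)
    inferred = {
        (uri, "rdf:type", ldp_class),
        (uri, "foaf:primaryTopic", topic),       # 3.5
        (uri, "sioc:has_parent", container),     # 3.6
    }
    response = set(body) | inferred
    state |= response
    return uri, response
```

Passing `new_id="345"` with the body of case 5 and `request_uri="/bugs"` mints `/bugs/critical/345`, as in the response above.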

6. Appending Bug defined by different authority

POST /bugs HTTP/1.1
Host: localhost

<http://host.com/bugs/bug6#bug> a <http://localhost/ontology#Bug> .

This even works using the same rules from section 4 (except it doesn't
make sense to apply 3.6) and yields the following state/response:

HTTP/1.1 200 OK

<http://host.com/bugs/bug6#bug> a <http://localhost/ontology#Bug> .

<http://localhost/bugs/bug6> a <http://localhost/ontology#BugResource> ;
	foaf:primaryTopic <http://host.com/bugs/bug6#bug> .

7. Deleting a resource

DELETE /bugs/bug1 HTTP/1.1
Host: localhost

If an LDP resource is deleted, the related domain resource has to be
deleted as well, otherwise the LDP state would become inconsistent:
7.1. Remove descriptions of Request-URI resource (all statements where
it is subject) from the state ontology (<bugs/bug1>)
7.2. If the Request-URI resource had any resources related with
foaf:primaryTopic, remove them as well (:Bug1)

HTTP/1.1 204 No Content
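The two deletion rules can be sketched in Python like this. It is only an illustration: the state is assumed to be a set of (subject, predicate, object) tuples with abbreviated URIs, and the function name `delete` is hypothetical.

```python
def delete(state, request_uri):
    """Apply rules 7.1 and 7.2; returns the removed statements."""
    # 7.2: collect the primary topics before their link is removed.
    topics = {o for s, p, o in state
              if s == request_uri and p == "foaf:primaryTopic"}
    # 7.1 + 7.2: drop every statement whose subject is the Request-URI
    # resource or one of its primary topics.
    removed = {t for t in state if t[0] == request_uri or t[0] in topics}
    state -= removed
    return removed

# Usage with a fragment of the initial state:
state = {
    ("/bugs", "rdf:type", "sioc:Container"),
    ("/bugs/bug1", "rdf:type", ":BugResource"),
    ("/bugs/bug1", "sioc:has_parent", "/bugs"),
    ("/bugs/bug1", "foaf:primaryTopic", ":Bug1"),
    (":Bug1", "rdf:type", ":Bug"),
}
removed = delete(state, "/bugs/bug1")
```

After the call only the container statement remains in `state`. Statements where the deleted resources appear as object are left alone, since the rules above only mention subjects.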

You can find an ontology with all of the above RDF here:
https://raw.github.com/Graphity/graphity-ldp/master/src/main/webapp/bug-state.ttl

I hope the above makes sense. I'm sure I haven't covered all cases, so
feedback is welcome and please point out and/or fix any errors (as a
pull request on GitHub maybe?)

I think however this is a self-describing approach that closes the
read/write flow using only RDF (with established vocabularies), REST,
and HATEOAS principles and shows that Linked Data (Platform) can be
implemented as a combination of those. I also think it solves most of
the discussed issues of the current LDP specification (not that I
manage to keep track of all of them).

As for the state machine, I didn't know I was building one until I
built it. The Graphity implementation follows this object-oriented
design:
http://en.wikipedia.org/wiki/Automata-based_programming#Object-oriented_programming_relationship

There is related research that backs up the REST and Linked Data state
machine approach:
"Toward Data-driven Programming for RESTful Linked Data" (Steffen
Stadtmüller, Andreas Harth) http://www.inf.puc-rio.br/~psw12/7.pdf
"A finite-state machine approach for modeling and analyzing RESTful
systems" (Ivan Zuzak, Ivan Budiselic, Goran Delac)
http://ivanzuzak.info/papers/2011_FSMREST.pdf

The g:memberClass pseudo-property can be replaced with custom SPIN
vocabulary descriptions of the queries generating the representations,
making the application state even more explicit. You can see a working
example of such ontology here:
https://raw.github.com/Graphity/graphity-ldp/master/src/main/resources/org/graphity/ldp/vocabulary/graphity-ldp.ttl

I hope this helps to build robust LDP specifications.

Martynas Jusevicius
http://graphity.org
https://twitter.com/pumba_lt
Received on Saturday, 10 November 2012 11:13:11 UTC