Re: The PaySwarm Ecosystem

On 12/01/2011 12:33 PM, Steven Rowat wrote:
> On 11/30/11 9:43 PM, Manu Sporny wrote:
>> Graph Normalization
>> http://json-ld.org/spec/latest/rdf-graph-normalization/
>>
>> [snip]... We express assets and listings on the PaySwarm network as
>> a graph of information (in the mathematical sense).
>
> Manu, "The PaySwarm Ecosystem" is a great overview. Thank you very
> much.

Good, glad it was helpful. :)

> The only element I don't understand is the Graph. Not the Graph
> Normalization -- I believe I understand the need to Normalize so
> you're not comparing apples and oranges.

Ah, yes.. sorry I skipped over that, more below.

> But, what's the Graph used for in the first place?

The Graph is basically our fundamental data structure in the Web
Payments work. The graph data structure is used to express all assets,
listings, licenses, transactions, etc. We need to be able to model how
data exists on the Web - as a series of interconnected nodes, which is
exactly what a graph allows us to do.

> At the same high level of abstraction that you summarized the other
> elements of the ecosystem, could you summarize that? (I tried reading
> in the spec link you provided, but it's at too detailed a level for
> me).

Sure, I'd be happy to - as I mentioned previously, the system that we're
creating is inherently decentralized. That is, the way we store data in
this system is a bit of a step away from how services are typically
designed on the Web (centralized). Think about Facebook or Twitter -
both systems are data silos; all of their data exists on their own
clusters of computers. They use centralized, relational databases like
MySQL to
store information. If you were to take one of these databases, you would
have to have a large amount of intricate knowledge about how the
databases are designed and implemented in order to make sense of the
data in the database.

With a decentralized system, we want the fundamental data expressed via
the system to be not only interoperable with all systems, but also
linked across systems. That is, we want the data to be
fundamentally portable. We also want the data structure to be very
extensible, because we want to support innovation at the edges of the
network. Graphs accomplish all of this for us.

To give an example of this, let's look at some data that we might store
in a centralized database:

id     12345
name   Steven Rowat
knows  56789

This is fundamentally non-portable, but serves centralized systems very
well because the program knows how to interpret "id", "name", and
"knows" and the values associated with them. All data is linked to other
data in the centralized database, but the links don't really go out to
the Web.

To contrast, a decentralized design would do this instead:

id     http://steven.example.com/12345
name   Steven Rowat
knows  http://manu.foo.com/56789

Note that the values of "id" and "knows" now point to things /outside/
of the database in which they are stored. Also note that we use URLs to
identify things uniquely instead of just numbers. The second problem
that needs to be solved is what "id", "name" and "knows" mean. To solve
that problem, we tell any program looking at the data structure above
that it should use another document to interpret what those words mean,
like so:

@context  http://vocabularies.org/Person
id        http://steven.example.com/12345
name      Steven Rowat
knows     http://manu.foo.com/56789

The @context value basically tells us how to interpret the keywords
"id", "name" and "knows"... so any system interpreting the data above
can do so without any ambiguity, and thus the data is self-contained and
can live fully on the Web in any system. While we're using a key-value
data structure above, what we're really expressing is a graph of
information.
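
As a rough sketch of how a key-value record like the one above maps onto
a graph, the Python below turns it into (node, link, value) triples. The
CONTEXT mapping and the "#name" / "#knows" URLs are invented for
illustration; the real context document is not shown in this message,
and real context processing is more involved than a dictionary lookup.

```python
# Hypothetical term-to-URL mapping that a context document might supply.
# These vocabulary URLs are assumptions made up for this sketch.
CONTEXT = {
    "name": "http://vocabularies.org/Person#name",
    "knows": "http://vocabularies.org/Person#knows",
}

record = {
    "id": "http://steven.example.com/12345",
    "name": "Steven Rowat",
    "knows": "http://manu.foo.com/56789",
}


def to_triples(record, context):
    """Turn one key-value record into (node, link, value) triples."""
    subject = record["id"]
    return [
        (subject, context[key], value)
        for key, value in record.items()
        if key != "id"  # "id" names the node itself, it is not a link
    ]


triples = to_triples(record, CONTEXT)
for triple in triples:
    print(triple)
```

Each triple is one link in the graph: the node it starts from, the
unambiguous name of the link, and the value or node it points to.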

To visualize what these graphs may look like, check out the following
diagram:

http://www.w3.org/TR/rdfa-primer/diagrams/image-about.svg

The diagram above shows a graph with 3 nodes and 2 links. There is also
only one way for a program to interpret that graph - the data is
portable by design and thus can be expressed just about anywhere on the
Web.

So, the type of graph that we use in the Web Payments work is basically
a collection of information that is interlinked with itself and other
data out on the Web. The information is portable from one system to
another without requiring any sort of transformation program. It is
flexible because to add more information, we just add more nodes and
links to the graph - both of which are identified by URLs.
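
To make the "just add more nodes and links" point concrete, here is a
minimal Python sketch that treats a graph as a set of triples, so that
combining data from two systems is a plain set union. The record on
manu.foo.com is hypothetical, and the link names are left as short words
only to keep the lines readable; in practice they would be URLs too.

```python
# Data published by two independent systems (illustrative values only).
steven_site = {
    ("http://steven.example.com/12345", "name", "Steven Rowat"),
    ("http://steven.example.com/12345", "knows", "http://manu.foo.com/56789"),
}
manu_site = {
    ("http://manu.foo.com/56789", "name", "Manu Sporny"),
}

# No transformation step is needed: the graphs simply combine.
merged = steven_site | manu_site


def describe(graph, node):
    """Return the sorted (link, value) pairs that start at a node."""
    return sorted(
        (link, value) for subject, link, value in graph if subject == node
    )


# Follow the "knows" link from one system's data into the other's.
friend = dict(describe(merged, "http://steven.example.com/12345"))["knows"]
print(describe(merged, friend))
```

Because every node is named by a URL, the link from one site lands on
exactly the node the other site describes, with no shared database.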

That is an explanation of what I mean by "graph" - does that make sense?

> In other words, in a given simple payment transaction, what does the
>  graph do, exactly? In words that my mother might understand? :-)

So, what the graph does is allow us to express a simple payment
transaction in a way that is portable across all systems on the Web. We
care about interoperability and therefore have to design the system with
interoperability in mind - the types of graphs that we have designed
give us interoperability.

We are also building the system for the Web. The Web is basically a
collection of a large number of information nodes that are connected to
one another (web pages link to other web pages), and thus the
fundamental data structure of the Web is a graph. We want the system to
work well with the Web, so we should align data structures and not fight
the Web's incredibly scalable architecture.

To give you an example of what this looks like in practice, this is a
mock-up of a simple payment transaction:

@context  http://purl.org/payswarm
id        http://blue.example.com/transactions/12345
from      http://blue.example.com/i/manu/accounts/primary
to        http://green.foo.com/i/steven/accounts/pizza
currency  USD
amount    12.35
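
For illustration only, the same mock-up can be held in an ordinary
Python dictionary, and a program can pick out which values are links to
other nodes on the Web. The is_url helper is a simplification made up
for this sketch, not part of any PaySwarm software.

```python
from decimal import Decimal
from urllib.parse import urlparse

transaction = {
    "@context": "http://purl.org/payswarm",
    "id": "http://blue.example.com/transactions/12345",
    "from": "http://blue.example.com/i/manu/accounts/primary",
    "to": "http://green.foo.com/i/steven/accounts/pizza",
    "currency": "USD",
    "amount": "12.35",
}


def is_url(value):
    """Crude check (an assumption of this sketch) for an http(s) URL."""
    parts = urlparse(value)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# Links from the transaction node to other nodes in the graph.
links = {
    key: value
    for key, value in transaction.items()
    if key not in ("@context", "id") and is_url(value)
}

# The systems that host the two accounts involved.
hosts = {urlparse(value).netloc for value in links.values()}

# Parse the amount as a decimal to avoid floating-point rounding.
amount = Decimal(transaction["amount"])

print(sorted(links), sorted(hosts), amount)
```

The "from" and "to" values are the links here: they connect this
transaction to accounts that live on two different systems.
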

Does all of that make sense?

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: The Need for Data-Driven Standards
http://manu.sporny.org/2011/data-driven-standards/

Received on Sunday, 4 December 2011 19:17:23 UTC