- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Sun, 04 Dec 2011 14:16:49 -0500
- To: public-webpayments@w3.org
On 12/01/2011 12:33 PM, Steven Rowat wrote:
> On 11/30/11 9:43 PM, Manu Sporny wrote:
>> Graph Normalization
>> http://json-ld.org/spec/latest/rdf-graph-normalization/
>>
>> [snip]... We express assets and listings on the PaySwarm network as
>> a graph of information (in the mathematical sense).
>
> Manu, "The PaySwarm Ecosystem" is a great overview. Thank you very
> much.

Good, glad it was helpful. :)

> The only element I don't understand is the Graph. Not the Graph
> Normalization -- I believe I understand the need to Normalize so
> you're not comparing apples and oranges.

Ah, yes.. sorry I skipped over that, more below.

> But, what's the Graph used for in the first place?

The Graph is basically our fundamental data structure in the Web Payments work. The graph data structure is used to express all assets, listings, licenses, transactions, etc. We need to be able to model how data exists on the Web - as a series of interconnected nodes, which is exactly what a graph allows us to do.

> At the same high level of abstraction that you summarized the other
> elements of the ecosystem, could you summarize that? (I tried reading
> in the spec link you provided, but it's at too detailed a level for
> me).

Sure, I'd be happy to - as I mentioned previously, the system that we're creating is inherently decentralized. That is, the way we store data in this system is a bit of a step away from how services are typically designed on the Web (centralized). Think about Facebook or Twitter - both systems are data silos; all of their data exists on their own clusters of computers. They use centralized, relational databases like MySQL to store information. If you were to take one of these databases, you would need a large amount of intricate knowledge about how it is designed and implemented in order to make sense of the data in it.

With a decentralized system, we want the fundamental data expressed via the system to not only be interoperable with all systems, but linked with each other across systems. That is, we want the data to be fundamentally portable. We also want the data structure to be very extensible, because we want to support innovation at the edges of the network. Graphs accomplish all of this for us.

To give an example of this, let's look at some data that we might store in a centralized database:

    id       12345
    name     Steven Rowat
    knows    56789

This is fundamentally non-portable, but serves centralized systems very well because the program knows how to interpret "id", "name", and "knows" and the values associated with them. All data is linked to other data in the centralized database, but the links don't really go out to the Web. To contrast, a decentralized design would do this instead:

    id       http://steven.example.com/12345
    name     Steven Rowat
    knows    http://manu.foo.com/56789

Note that the values of "id" and "knows" now point to things /outside/ of the database in which they are stored. Also note that we use URLs to identify things uniquely instead of just numbers.

The second problem that needs to be solved is what "id", "name" and "knows" mean. To solve that problem, we tell any program looking at the data structure above that it should use another document to interpret what those words mean, like so:

    @context http://vocabularies.org/Person
    id       http://steven.example.com/12345
    name     Steven Rowat
    knows    http://manu.foo.com/56789

The @context value basically tells us how to interpret the keywords "id", "name" and "knows"... so any system interpreting the data above can do so without any ambiguity and thus, the data is self-contained and can live fully on the Web in any system. While we're using a key-value data structure above, what we're really expressing is a graph of information.

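To make the step from a key-value record to a graph a bit more concrete, here is a minimal sketch in Python. It is only an illustration, not the JSON-LD or PaySwarm algorithm: the to_triples helper is hypothetical, and the rule that a keyword expands to the context URL plus "#" plus the keyword is an assumption made for this example (a real vocabulary document defines its own mapping).

```python
# Hypothetical record mirroring the @context example in this message.
record = {
    "@context": "http://vocabularies.org/Person",
    "id": "http://steven.example.com/12345",
    "name": "Steven Rowat",
    "knows": "http://manu.foo.com/56789",
}


def to_triples(record):
    """Turn one key-value record into (subject, link, object) triples."""
    context = record["@context"]
    subject = record["id"]
    triples = []
    for key, value in record.items():
        if key in ("@context", "id"):
            continue  # these describe the record itself, not a link
        # Assumption: a keyword expands to <context>#<keyword>.
        # A real vocabulary document would define this mapping.
        triples.append((subject, f"{context}#{key}", value))
    return triples


triples = to_triples(record)
for subject, link, obj in triples:
    print(subject, link, obj)
```

Each printed line is one link in the graph: the node it starts from, the URL naming the link, and the node or value it points to.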
To visualize what these graphs may look like, check out the following diagram:

http://www.w3.org/TR/rdfa-primer/diagrams/image-about.svg

The diagram above shows a graph with 3 nodes and 2 links. There is also only one way for a program to interpret that graph - the data is portable by design and thus can be expressed just about anywhere on the Web.

So, the type of graph that we use in the Web Payments work is basically a collection of information that is interlinked with itself and other data out on the Web. The information is portable from one system to another without requiring any sort of transformation program. It is flexible because to add more information, we just add more nodes and links to the graph - both of which are identified by URLs.

That is an explanation of what I mean by "graph" - does that make sense?

> In other words, in a given simple payment transaction, what does the
> graph do, exactly? In words that my mother might understand? :-)

So, what the graph does is allow us to express a simple payment transaction in a way that is portable across all systems on the Web. We care about interoperability and therefore have to design the system with interoperability in mind - the types of graphs that we have designed give us interoperability.

We are also building the system for the Web. The Web is basically a collection of a large number of information nodes that are connected to one another (web pages link to other web pages), and thus the fundamental data structure of the Web is a graph. We want the system to work well with the Web, so we should align data structures and not fight the Web's incredibly scalable architecture.

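The point about adding nodes and links without a transformation program can be sketched roughly as follows. This is an illustration under stated assumptions, not how any PaySwarm software works: each graph is modeled as a plain Python set of (subject, link, object) triples, the PERSON prefix and the nodes and describe helpers are hypothetical, and the name attached to the second URL is made up for the example.

```python
# Triples are (subject, link, object). All URLs here are illustrative only.
PERSON = "http://vocabularies.org/Person#"  # assumed vocabulary prefix

# Data as it might be published by one system...
graph_a = {
    ("http://steven.example.com/12345", PERSON + "name", "Steven Rowat"),
    ("http://steven.example.com/12345", PERSON + "knows",
     "http://manu.foo.com/56789"),
}

# ...and by a second, unrelated system (name value assumed for the example).
graph_b = {
    ("http://manu.foo.com/56789", PERSON + "name", "Manu Sporny"),
}

# Combining the two needs no transformation step: it is a set union,
# because both sides identify nodes and links with the same URLs.
merged = graph_a | graph_b


def nodes(graph):
    """Every subject or object that appears in the graph."""
    return {s for s, _, _ in graph} | {o for _, _, o in graph}


def describe(graph, subject):
    """All links leaving one node, as a link -> object mapping."""
    return {link: obj for s, link, obj in graph if s == subject}


print(len(merged), "links,", len(nodes(merged)), "nodes")
print(describe(merged, "http://manu.foo.com/56789"))
```

The "knows" link published by the first system now lands on a node that the second system has described, which is the kind of cross-system linking discussed above.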
To give you an example of what this looks like in practice, this is a mock-up of a simple payment transaction:

    @context http://purl.org/payswarm
    id       http://blue.example.com/transactions/12345
    from     http://blue.example.com/i/manu/accounts/primary
    to       http://green.foo.com/i/steven/accounts/pizza
    currency USD
    amount   12.35

Does all of that make sense?

-- manu

-- 
Manu Sporny (skype: msporny, twitter: manusporny)
Founder/CEO - Digital Bazaar, Inc.
blog: The Need for Data-Driven Standards
http://manu.sporny.org/2011/data-driven-standards/
Received on Sunday, 4 December 2011 19:17:23 UTC