- From: Nathan <nathan@webr3.org>
- Date: Tue, 17 May 2011 19:07:52 +0100
- To: glenn mcdonald <glenn@furia.com>
- CC: Manu Sporny <msporny@digitalbazaar.com>, Linked JSON <public-linked-json@w3.org>
glenn mcdonald wrote:
> It seems to me that what we ought to be starting with, here, is a clear,
> unencumbered, useful, standard way of representing graphs in JSON. Linking,
> merging, mapping, federation: all this stuff comes later, in layers, over a
> solid core graph-representation. First we need to do for graphs what CSV
> did for tables. We don't *have* to do it in JSON, but I don't see why we
> shouldn't.
>
> I also believe a few more things that may be non-obvious and/or
> controversial:
>
> - lists are a native logical data construct, and should be integral to a
>   graph representation
> - graphs describe the relationships between concepts; literals describe the
>   assignment of symbols to those concepts: these are two fundamentally
>   different frames of reference, and shouldn't be intermingled
> - eliminating any uncertainty about the directionality of relationships, for
>   the consumer of a graph, is worth imposing the assumption/burden of
>   bi-directional relationship-maintenance on the underlying data system
>
> So here's the JSON approach this leads me to. A dataset is a bunch of data
> points, and each point is a JSON object like this:
>
> {
>   "ID": 102,
>   "Name": "Nightwish",
>   "Arcs": {
>     "Type": [5],
>     "Album": [134,167,189,203,214],
>     "Genre": [74],
>     "MusicBrainz ID": [540]
>   }
> }

Just a quick question: how would you handle Arcs that are also points? For
example, the following in RDF:

  </foo> </bar> "Baz" .
  </bar> x:label "Bar" .

> So:
> - each point has a numeric (relative) ID
> - each point has an optional Name literal (and might have other literals
>   like a machine-readable Value, alternate languages, etc.)
> - each point has a set of arcs
> - each arc has a name and an ordered list of target-points, specified by
>   (relative) ID
> - every data point has "Type" among its arcs
> - every conceptual entity is represented by a data point; note that this
>   includes the MusicBrainz ID here: 540 is the ID of the local data-point
>   representing that external ID, which might in turn look like this:
>
> {
>   "ID": 540,
>   "Name": "00a9f935-ba93-4fc8-a33a-993abe9c936b",
>   "Arcs": {
>     "Type": [46],
>     "Artist": [102]
>   }
> }
>
> Requiring this to be a point ensures that it has a type, and a local ID, and
> can thus be differentiated, both structurally and individually, from any
> other use of that same string.
>
> A simple dataset is then just an array of its points. That's it. Now we can
> share graphs.
>
> A more complex dataset might embed that array in a metadata object with some
> other context, like:
>
> - a base IRI, for turning these local IDs into IRIs
> - some mappings of these local arc-names (usually scoped by type) to IRIs,
>   like Artist.Album to <http://musicgeek.com/ontology/1.0/release>
>
> but that's all external to the graph itself, and can be discarded or
> replaced by the consumer of the data (e.g., I want to map that "Album" arc
> to <http://musicnerd.com/datamodel/collection> instead).
>
> Notes and Disclaimers:
>
> - I take no offense if people are attached to precedents that this approach
>   discards, and thus consider it unresponsive.
> - I'm writing this in my personal capacity as an interested bystander. I'm
>   not on any standards committees and am not expressing a corporate point of
>   view.
> - But I'm the same person when I go to work, which in my case is at Google
>   (via the recent acquisition of ITA Software), where the scheme I describe
>   here is mostly the same as the data-model/JSON-representation used by Needle
>   (www.needlebase.com), the graph-database project for which I'm the designer.
>
> glenn
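
For illustration only, here is a minimal Python sketch of how a consumer might
load the point format quoted above. It assumes a dataset is a plain list of
point objects, reuses the two points from the quoted message, and introduces
hypothetical helper names (index_points, untyped_points, follow,
missing_inverse_arcs) that are not part of the proposal or of Needle. The
inverse-arc check is one possible reading of the "bi-directional
relationship-maintenance" bullet, not a rule the message spells out.

```python
# The two points shown in the quoted message, as a "simple dataset":
# just an array of points.
DATASET = [
    {
        "ID": 102,
        "Name": "Nightwish",
        "Arcs": {
            "Type": [5],
            "Album": [134, 167, 189, 203, 214],
            "Genre": [74],
            "MusicBrainz ID": [540],
        },
    },
    {
        "ID": 540,
        "Name": "00a9f935-ba93-4fc8-a33a-993abe9c936b",
        "Arcs": {"Type": [46], "Artist": [102]},
    },
]


def index_points(dataset):
    """Map each point's local (relative) ID to the point itself."""
    return {point["ID"]: point for point in dataset}


def untyped_points(dataset):
    """IDs of points that lack a non-empty "Type" arc."""
    return [p["ID"] for p in dataset if not p.get("Arcs", {}).get("Type")]


def follow(points, source_id, arc_name):
    """Resolve an arc to its target points, preserving the arc's list order.

    Targets whose IDs are not in this dataset are skipped.
    """
    targets = points[source_id]["Arcs"].get(arc_name, [])
    return [points[t] for t in targets if t in points]


def missing_inverse_arcs(points):
    """Return (source, arc, target) triples whose target has no arc back.

    Assumption: "bi-directional" means the target carries some arc that
    lists the source. Targets outside the dataset cannot be checked.
    """
    missing = []
    for source_id, point in points.items():
        for arc_name, targets in point["Arcs"].items():
            for target_id in targets:
                target = points.get(target_id)
                if target is None:
                    continue
                if not any(source_id in ids for ids in target["Arcs"].values()):
                    missing.append((source_id, arc_name, target_id))
    return missing


if __name__ == "__main__":
    points = index_points(DATASET)
    print([p["Name"] for p in follow(points, 102, "MusicBrainz ID")])
    print(untyped_points(DATASET))
    print(missing_inverse_arcs(points))
```

In this sketch the arc from point 102 to point 540 is matched by the "Artist"
arc on point 540, so the inverse check reports nothing for the sample data.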
Received on Tuesday, 17 May 2011 18:08:40 UTC