- From: Dave Longley <dlongley@digitalbazaar.com>
- Date: Mon, 27 Jun 2011 13:51:21 -0400
- To: public-linked-json@w3.org
One of the concerns raised over bnode support in JSON-LD has been the complexity of canonically naming bnodes in normalized JSON-LD output. I offered a rough, preliminary algorithm for canonicalizing bnode names a little while back, but I have since then simplified it. I believe the current iteration to be straight-forward to implement. The algorithm makes use of memoization techniques to reduce repeated comparisons when determining isomorphisms amongst bnodes. In the end, the result is a fairly simple recursive algorithm that makes use of 5 functions as defined below. There are now two inter-operable implementations of the algorithm and some test cases for it. There is also a Python implementation in the works. JavaScript: https://github.com/digitalbazaar/forge/blob/master/js/jsonld.js C++: https://github.com/digitalbazaar/monarch/blob/jsonld/cpp/data/json/JsonLd.cpp Test Cases: https://github.com/digitalbazaar/forge/tree/master/tests/jsonld The test cases will eventually be refined and merged into a JSON-LD test suite repository. JSON-LD normalization algorithm: 1. Expand input (resolve keywords and CURIEs, etc). 2. Give unique temporary names to unnamed blank nodes. 3. Flatten input (remove embeds, etc.). 4. Create a memo for each bnode for storing bnode comparisons. 5. Sort blank nodes using DeepCompare. 6. Uniquely rename any blank nodes with a c14n prefix. 7. Canonically name blank nodes using DeepName. 8. Sort any property arrays that contain blank nodes. 9. Return all subjects, sorted by IRI, in an array. DeepCompare(bnodeA, bnodeB): 1. If bnodeB's IRI is in bnodeA's memo, return the memo value. 2. Do ShallowCompare. If the return value is non-zero, store it in bnodeA's memo and return. 3. Do DeepCompareEdges on the bnodes' bnode properties with a new isomorphism mapping. If the return value is non-zero, store it in bnodeA's memo and return. 4. Do DeepCompareEdges on the bnode's bnode references with a new isomorphism mapping. 5. Store the return value in bnodeA's memo and return. ShallowCompare(bnodeA, bnodeB): 1. The bnode with fewer properties is first. 2. The bnode with the alphabetically first property is first. 3. Do CompareObjects for each equivalent property. 4. The bnode with fewer references is first. 5. The bnode with the reference IRI (vs. bnode) is first. 6. The bnode with the alphabetically-first reference IRI is first. 7. The bnode with the alphabetically-first reference property is first. CompareObjects(bnodeA, bnodeB, property): 1. The bnode with fewer objects is first. 2. The bnode with fewer non-bnode objects is first. 3. The bnode with a string object is first. 4. The bnode with the alphabetically-first string is first. 5. The bnode with a @literal is first. 6. The bnode with the alphabetically-first @literal is first. 7. The bnode with the alphabetically-first @datatype is first. 8. The bnode with a @language is first. 9. The bnode with the alphabetically-first @language is first. 10. The bnode with the alphabetically-first non-bnode @iri is first. DeepCompareEdges(bnodeA, bnodeB, direction, iso): 1. Iterate over bnodeA's edges in the given direction (property or reference). 2. If an adjacent bnode IRI is in the iso mapping, make sure it occurs using the same property in bnodeB's edges. If not, the bnodes are not equal. 3. The bnode IRI is not in the iso mapping, so search bnodeB's edges for a matching property and adjacent bnode IRI that is also not in the iso mapping. If not found, the bnodes are not equal. If found, put the bnode IRIs into the iso mapping and run DeepCompare on the bnodes. If DeepCompare returns non-zero, remove the bnodes from the iso mapping, the bnodes are not equal. 4. If the bnodes are not equal, -1 if bnodeA is less and 1 if bnodeB is less. The lesser bnode is the one with the least adjacent bnode for bnodeA's edge. Use the same iso mapping when comparing adjacent bnodes. If the bnodes are equal, return 0. DeepName(bnode): 1. If the bnode's prefix is already c14n, return. 2. Rename the bnode using the c14n prefix and incremented counter suffix. 3. Recurisvely rename the bnodes from the bnode's properties in sort order. 4. Recurisvely rename the bnodes from the bnode's references in sort order. -- Dave Longley CTO Digital Bazaar, Inc. http://digitalbazaar.com
Received on Monday, 27 June 2011 17:51:57 UTC