C14N use case: version control from Jeremy Carroll on 2011-06-27 (public-rdf-wg@w3.org from June 2011)

From: Jeremy Carroll <jeremy@topquadrant.com>
Date: Mon, 27 Jun 2011 11:55:02 -0700
To: RDF Working Group WG <public-rdf-wg@w3.org>
Message-ID: <4E08D206.9020405@topquadrant.com>

This is the use case that TopQuadrant has internally that prompted 
discussion between me and Gavin leading to this thread on this mailing list.

A significant portion of our product source is in RDF.
We are migrating our version control system to Git to reduce the cost of merging.
This will not work for RDF in the form in which we currently store it, 
because simple changes result in completely different documents.
We are now working on a version of my earlier paper with additional 
steps to ensure reasonable stability of blank node IDs.

(In the terms of the paper, each bnode ID will be based on a hash code 
generated from the first distinctive triple for that bnode.)

With this, in the vast majority of cases, small changes to the RDF will 
result in small changes to the canonical form (larger changes will occur 
at discontinuities in the hashing algorithm, when the number of buckets 
needs expanding).
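
To make the idea concrete, here is a minimal sketch in Python. It is only 
an illustration under stated assumptions, not the algorithm from the paper 
and not what we will ship. The function names, the "~" placeholder, SHA-1, 
the starting bucket count of 16, and the doubling rule are all assumptions 
of the sketch. A triple is treated as distinctive for a bnode when it 
mentions only that bnode and its masked form is unique in the graph.

```python
import hashlib

BLANK = "~"  # assumed placeholder that stands in for any bnode


def is_bnode(term):
    return term.startswith("_:")


def masked(triple):
    """Replace every bnode in the triple with the placeholder."""
    return tuple(BLANK if is_bnode(t) else t for t in triple)


def digest(masked_triple):
    """Hash a masked triple to a large integer (SHA-1 is an assumption)."""
    text = " ".join(masked_triple).encode("utf-8")
    return int(hashlib.sha1(text).hexdigest(), 16)


def stable_bnode_ids(triples, buckets=16):
    """Map each bnode to an ID hashed from its first distinctive triple.

    Bnodes with no distinctive triple are left out of the mapping; a real
    canonicalizer would need further steps for those.
    """
    # Count masked forms so that we can tell which ones are unique.
    counts = {}
    for t in triples:
        counts[masked(t)] = counts.get(masked(t), 0) + 1

    # Walk the triples in masked sort order and remember, per bnode,
    # the first triple that is distinctive for it.
    first = {}
    for t in sorted(triples, key=masked):
        bnodes = {x for x in t if is_bnode(x)}
        if len(bnodes) == 1 and counts[masked(t)] == 1:
            first.setdefault(bnodes.pop(), masked(t))

    # Assign bucket numbers; on a collision, double the bucket count and
    # start again. This is the discontinuity: every ID may change at once.
    while True:
        ids = {b: digest(m) % buckets for b, m in first.items()}
        if len(set(ids.values())) == len(ids):
            break
        buckets *= 2
    return {b: "_:b%x" % n for b, n in ids.items()}


def canonical_lines(triples, buckets=16):
    """Relabel bnodes and return sorted, line-per-triple output."""
    ids = stable_bnode_ids(triples, buckets)
    return sorted(" ".join(ids.get(x, x) for x in t) + " ." for t in triples)
```

Because the IDs depend only on the masked triples, renaming the bnodes or 
reordering the input leaves the output of canonical_lines unchanged, and 
adding an unrelated triple adds a single line, which is the property that 
makes line-based diff and merge in Git workable.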

Jeremy

Received on Monday, 27 June 2011 18:55:34 UTC