RDF canonicalization [was Re: deterministic naming of blank nodes]

Hi Aidan,

On 05/13/2015 02:14 AM, ahogan@dcc.uchile.cl wrote:
> Aidan Hogan. "Skolemising Blank Nodes while Preserving Isomorphism". In
> WWW, Florence, Italy, May 18–22, 2015 (to appear).
> Available from: http://aidanhogan.com/docs/skolems_blank_nodes_www.pdf
>
> . . .  the paper presents an algorithm for deterministically
> labelling blank nodes of any RDF graph such that the labels of two
> isomorphic RDF graphs will "correspond".

Excellent!  It would be really good if we could define a W3C standard 
N-Triples and N-Quads canonicalization, for a variety of purposes.

But I have a question about your algorithm.  For greatest utility, a 
canonicalization algorithm should be robust against small changes in a 
graph.  In other words, if graphs G and H are very similar then their 
canonicalizations should also be very similar.  (By "very similar" I 
mean that G and H have large subgraphs G' and H' that are isomorphic.)

How robust is your algorithm to small changes in a graph?  Do you have 
any data on how much the canonicalization changes when a graph is changed?
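
To make the question concrete, here is a rough sketch of the kind of measurement I have in mind.  It is not the algorithm from your paper: the canonicalize function below is a brute-force stand-in that tries every blank node labelling and keeps the lexicographically smallest serialization, so it is only usable on tiny graphs.  The function names, the _:c0, _:c1, ... label scheme, and the use of Jaccard similarity over serialized lines are all my own assumptions, chosen only for illustration.

```python
from itertools import permutations


def is_blank(term):
    # Assumption: blank nodes are written as "_:name", as in N-Triples.
    return term.startswith("_:")


def canonicalize(triples):
    """Brute-force canonical form: try every assignment of the labels
    _:c0 .. _:c(n-1) to the blank nodes and keep the smallest sorted
    list of serialized lines. Exponential in the number of blank nodes."""
    blanks = sorted({t for triple in triples for t in triple if is_blank(t)})
    labels = [f"_:c{i}" for i in range(len(blanks))]
    best = None
    for perm in permutations(labels):
        mapping = dict(zip(blanks, perm))
        lines = sorted(
            " ".join(mapping.get(t, t) for t in triple) + " ."
            for triple in triples
        )
        if best is None or lines < best:
            best = lines
    return best


def similarity(lines_a, lines_b):
    """Jaccard similarity of two canonical serializations, line by line."""
    a, b = set(lines_a), set(lines_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# G and G2 are isomorphic (same shape, different blank node names).
G = [("_:x", "<p>", "_:y"), ("_:y", "<p>", "_:z")]
G2 = [("_:m", "<p>", "_:k"), ("_:k", "<p>", "_:n")]

# H is G plus one extra triple, i.e. a "small change".
H = G + [("_:z", "<q>", '"v"')]

print(canonicalize(G) == canonicalize(G2))
print(similarity(canonicalize(G), canonicalize(H)))
```

A robust canonicalization would keep the second number close to 1 whenever H differs from G by only a few triples; a fragile one could relabel most blank nodes and drive it toward 0.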

Thanks,
David Booth

Received on Thursday, 14 May 2015 04:21:17 UTC