- From: Pierre-Antoine Champin <pierre-antoine@w3.org>
- Date: Fri, 13 Sep 2024 16:18:54 +0200
- To: Semantic Web <semantic-web@w3.org>, Felix Sasaki <felix.sasaki@sap.com>, RDF-star WG <public-rdf-star-wg@w3.org>
- Message-ID: <9fdff3c7-a8b3-474f-8b30-f4853816a04b@w3.org>
Dear all,

yesterday during the RDF-star working group call, I mentioned that RDF canonicalization [1] can be used to build a crude RDF "diff" tool, and that I was using a small script that I wrote for that. Other participants expressed interest in this script, so I cleaned it up a bit and published it here:

https://gist.github.com/pchampin/7017fa5ff607e5bedf65e2f9954cfd46

As indicated at the top, it relies on my Sophia library [2] for parsing and canonicalizing, but it can easily be adapted to use other command-line tools (for a while, I was using Gregg Kellogg's Ruby implementation [3]).

Note that I describe it as a *crude* tool because

- if the two graphs/datasets are isomorphic (i.e. identical modulo blank node labels), it will show no difference,
- BUT if there is even the slightest difference, the tool may report a lot of changes, not all of them relevant.

This is because even a small difference can cause the canonicalization to relabel blank nodes in a completely different way. So even blank nodes that were not impacted by the change may end up with different names, and the text diff applied to the canonical form will report those as changes.

But despite these "false positives", I find it quite useful, and you might too. In particular, if the changes only impact triples/quads involving only IRIs and literals (no blank nodes), the diff will be "exact".

best

[1] https://github.com/w3c/rdf-canon
[2] https://github.com/pchampin/sophia_rs
[3] https://ruby-rdf.github.io/
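To illustrate the idea described above (this is not the script from the gist, only a minimal sketch), here is the "text diff over canonical form" step in Python. It assumes both inputs have already been serialized as canonical N-Quads by some canonicalizer; the function name `diff_canonical` and the example.org data are hypothetical. The example contains no blank nodes, which is the case where the diff is expected to be exact.

```python
import difflib


def diff_canonical(old_nquads: str, new_nquads: str) -> list[str]:
    """Return the added ('+') and removed ('-') lines between two
    documents that are ASSUMED to be canonical N-Quads already."""
    # Canonical N-Quads should already be sorted; sorting again is only
    # a safety measure so that line order never causes spurious changes.
    old_lines = sorted({l for l in old_nquads.splitlines() if l.strip()})
    new_lines = sorted({l for l in new_nquads.splitlines() if l.strip()})
    diff = difflib.unified_diff(
        old_lines, new_lines, "old", "new", lineterm="", n=0
    )
    # Keep only the changed lines; drop the '---'/'+++' file headers
    # and the '@@' hunk markers.
    return [
        line
        for line in diff
        if line.startswith(("+", "-"))
        and not line.startswith(("+++", "---"))
    ]


# Hypothetical data: only IRIs and literals, no blank nodes.
OLD = """\
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice" .
<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
"""
NEW = """\
<http://example.org/alice> <http://xmlns.com/foaf/0.1/name> "Alice Smith" .
<http://example.org/alice> <http://xmlns.com/foaf/0.1/knows> <http://example.org/bob> .
"""

for change in diff_canonical(OLD, NEW):
    print(change)
```

With blank nodes in the data, the same function would still run, but a single changed quad could make the canonicalizer assign different labels elsewhere, and every relabelled line would then show up as a removal plus an addition.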
Attachments
- application/pgp-keys attachment: OpenPGP public key
Received on Friday, 13 September 2024 14:18:59 UTC