- From: David Booth <david@dbooth.org>
- Date: Wed, 11 Jul 2018 15:08:44 -0400
- To: semantic-web@w3.org
On 07/11/2018 12:57 PM, Victor Porton wrote:
> I am writing a program which takes decisions based on several RDF files
> which it may download.
>
> How to make my program deterministic? (no change in the RDF files => no
> change in program decisions)
>
> So I want to retrieve triples in a fixed ("deterministic") order, if
> this is possible.
>
> I use Python with rdflib.

Short answer: canonicalize your RDF files when you receive them, by
parsing and re-serializing using a suitable tool. Then compare the newly
received canonical file with the previous canonical file, using standard
text-based diff comparison, to find out if anything changed.

Longer explanation: This is a weakness in standard RDF, and the problem
originates in the semantics of blank nodes. Instead of being able to
easily compare two RDF graphs for equality, as you can do in most data
representations, in RDF you have to check for graph isomorphism, which
according to Wikipedia "is not known to be solvable in polynomial time
nor to be NP-complete". (I don't know if rdflib offers a graph
isomorphism function, but if so then you could use that.)

This graph isomorphism problem is why no RDF canonicalization algorithm
has been adopted as a W3C standard to date. However, most RDF graphs in
practice do not cause the canonicalization algorithms to blow up. And if
blank node usage is modestly restricted to avoid blank node cycles, then
the canonicalization algorithms are guaranteed to be easy and fast. This
is a direction that I advocate and described in "Well Behaved RDF: A
Straw-Man Proposal for Taming Blank Nodes":
http://dbooth.org/2013/well-behaved-rdf/Booth-well-behaved-rdf.pdf

One bit of good news is that there has been significant progress in
JSON-LD toward adopting a canonicalization standard, in part because it
is also needed for digital signatures.
A draft spec is here (though at the moment it is called "normalization"
instead of "canonicalization"):
https://json-ld.github.io/normalization/spec/index.html

Unfortunately that document is out of scope for the current JSON-LD
working group, so there is still no clear timeline for it to become a
W3C standard:
https://www.w3.org/2018/03/jsonld-wg-charter.html

I hope that helps.

David Booth
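
As an illustration of the "canonicalize, then diff" approach from the short answer, here is a minimal sketch using only the Python standard library. It assumes the files have already been re-serialized as N-Triples (one triple per line) and that they contain no blank nodes, or that blank node labels were already made stable in an earlier step. rdflib ships an `rdflib.compare` module with `isomorphic` and `to_canonical_graph`, which may be usable for that step. The function names `canonical_lines` and `changes` are made up for this example.

```python
import difflib


def canonical_lines(ntriples_text):
    """Return the sorted, de-duplicated triples of an N-Triples document.

    Assumption: one triple per line, and no blank nodes (or blank node
    labels that are already canonical). Sorting alone does not solve the
    blank node relabeling problem.
    """
    lines = set()
    for raw in ntriples_text.splitlines():
        line = raw.strip()
        # Skip empty lines and N-Triples comment lines.
        if not line or line.startswith("#"):
            continue
        lines.add(line)
    return sorted(lines)


def changes(old_text, new_text):
    """Return the added ('+') and removed ('-') triples between two versions."""
    old = canonical_lines(old_text)
    new = canonical_lines(new_text)
    diff = difflib.unified_diff(
        old, new, fromfile="previous", tofile="current", lineterm="", n=0
    )
    # Keep only the changed triples; drop the '---'/'+++' file headers
    # and the '@@' hunk markers.
    return [
        d for d in diff
        if d.startswith(("+", "-")) and not d.startswith(("+++", "---"))
    ]


if __name__ == "__main__":
    previous = (
        '<http://example.org/b> <http://example.org/p> "2" .\n'
        '<http://example.org/a> <http://example.org/p> "1" .\n'
    )
    current = (
        '<http://example.org/a> <http://example.org/p> "3" .\n'
        '<http://example.org/b> <http://example.org/p> "2" .\n'
    )
    for change in changes(previous, current):
        print(change)
```

An empty list from `changes` means the two versions contain the same set of triples, so the program's decisions should not change. Iterating over `canonical_lines` also gives the triples in a fixed order regardless of how the source file happened to be serialized.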
Received on Wednesday, 11 July 2018 19:09:13 UTC