Re: Chartering work has started for a Linked Data Signature Working Group @W3C from Aidan Hogan on 2021-05-31 (semantic-web@w3.org from May 2021)

From: Aidan Hogan <aidhog@gmail.com>
Date: Sun, 30 May 2021 22:42:36 -0400
To: Harry Halpin <hhalpin@ibiblio.org>, Peter Patel-Schneider <pfpschneider@gmail.com>
Cc: Dan Brickley <danbri@danbri.org>, Semantic Web <semantic-web@w3.org>
Message-ID: <5f76d5ea-67a4-ca6c-b5e0-c161503aa39b@gmail.com>
On 2021-05-25 20:19, Harry Halpin wrote:
<snip>
> The "audits" that Manu that he paid for and referenced earlier by "Math 
> Ph.D.s"  like Mirabolic Consulting are basically amateur at best. No 
> actual cryptographer has bothered to look at this, and there are no proofs.

There are some specious claims here that I think need to be called out, 
given in particular that they baselessly smear other people's hard work 
and reputation on a public list.

- Mirabolic Consulting were tasked with reviewing a document [1] that 
describes the RDF Dataset Canonicalisation procedure. The document 
defines the procedure and provides proofs regarding some theoretical 
properties about the procedure. It is not true that "there are no 
proofs". There are nine proofs in the document that, when taken 
together, span various pages.

- The RDF Dataset Canonicalisation procedure relates primarily to the 
graph isomorphism problem and has nothing to do with cryptography other 
than it being an intended application. While it would be informative to 
have an "actual cryptographer" look at other aspects of the proposals in 
terms of how canonicalisation is *used* for signing and verification, 
such a person would not be qualified to review this particular document 
on the canonicalisation of RDF datasets itself, unless they also 
happened to have (or were willing to acquire) extensive knowledge of the 
graph isomorphism problem.

- Not only is the vehement characterisation of the 11-page review 
provided by Mirabolic Consulting [2] as being "basically amateur at 
best" left completely unjustified, I disagree with it in the strongest 
possible terms. While one might perhaps argue about the effect of being 
paid to review a document, and while there is one aspect of one of the 
proofs and the review that I still struggle with, I think that the 
review is more incisive, and more detailed, than any academic peer 
review I have ever seen (and I have seen many). The author(s) of the 
review are clearly knowledgeable about the graph isomorphism problem 
(a deep technical issue in its own right, and one that I myself have 
worked pretty extensively on), and dive deep into the document and its 
reference implementation. The last section on "Correctness by Design" 
happens to perfectly echo my own overall thoughts. I think that their 
review is excellent, thoroughly constructive, and thoroughly professional.

[1] 
https://lists.w3.org/Archives/Public/public-credentials/2021Mar/att-0220/RDFDatasetCanonicalization-2020-10-09.pdf

[2] 
https://lists.w3.org/Archives/Public/public-credentials/2021Apr/att-0032/Mirabolic_Graph_Iso_Report_2020_10_19.pdf

<snip>
> Second, the problem is fundamentally conceptual, and the fact that this 
> Working Group charter is so confused on basic concepts kinda proves it's 
> not capable of doing the work. You can only sign bistrings, i.e. syntax. 
> You can't sign "semantics".  The use-cases of "securing RDF" just makes 
> the above worse, even if you could get a functional canonicalization 
> algorithm (and I'm not convinced you can). 

There are again a couple of misconceptions:

- It is unclear why you are "not convinced you can" get "a functional 
canonicalization algorithm". It is widely established at this stage that 
you can. Even if you (again, without justification) reject the algorithm 
of Dave, Rachel and Manu as not being "functional", I have independently 
defined an algorithm for the same problem that has been published twice [3,4] 
(granted that it is for RDF graphs and not RDF datasets, but the 
extension to datasets is trivial). Even if you reject my complete 
algorithm as not being functional, I describe a simple algorithm in 
[3,def 3.5] that can be described in a few lines and can be proved 
straightforwardly to be functional (though highly inefficient). Even if 
you reject all of our work, any expert on the graph isomorphism problem 
will tell you that the problem can be solved as a minor variant of the 
canonical labelling algorithms for graphs that have been around for decades.

- It is unclear what you mean by "signing semantics", or whose claim 
specifically you take exception to, but you can sign a dataset with 
respect to semantics if you can define a canonical form for the dataset 
modulo semantics. This would trivially allow one to define "bitstrings" 
that map one-to-one to equivalence classes of datasets under a given 
semantics (equivalence classes of semantic equivalence). In fact, in 
[4], I defined such a procedure for canonicalising RDF graphs modulo 
simple semantics: lean the graph first and then canonicalise it modulo 
(RDF) isomorphism. Such a procedure could feasibly be extended to other 
forms of semantics. In any case, "signing semantics" is not part of the 
charter in any way.
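
To make the two points above concrete, here is a minimal brute-force 
sketch in Python. It is an illustration only: it is not the algorithm 
of Dave, Rachel and Manu, and not a transcription of the definition in 
[3] or the procedure in [4]. It assumes an RDF graph (not a dataset) 
given as triples of plain strings, with blank nodes being exactly the 
terms that start with "_:". The names is_blank, canonicalise and digest 
are hypothetical. It tries every relabelling of the blank nodes, keeps 
the smallest sorted list of triples, and hashes the resulting bytes. It 
handles isomorphism only (no leaning), and is factorial in the number 
of blank nodes.

```python
import hashlib
from itertools import permutations


def is_blank(term):
    # Assumption for this sketch: blank nodes are the terms starting with "_:".
    return term.startswith("_:")


def canonicalise(triples):
    """Return the smallest sorted triple list over all blank node relabellings."""
    triples = set(triples)
    blanks = sorted({t for triple in triples for t in triple if is_blank(t)})
    labels = [f"_:c{i}" for i in range(len(blanks))]
    best = None
    # Every bijection from the input blank nodes to the canonical labels.
    for perm in permutations(labels):
        mapping = dict(zip(blanks, perm))
        candidate = sorted(
            tuple(mapping.get(t, t) for t in triple) for triple in triples
        )
        if best is None or candidate < best:
            best = candidate
    return best


def digest(triples):
    """Hash the canonical form; these bytes are what one would then sign."""
    text = "".join(" ".join(triple) + " .\n" for triple in canonicalise(triples))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Two isomorphic graphs that differ only in their blank node labels.
g1 = [("_:a", "<p>", "_:b"), ("_:b", "<p>", "<x>")]
g2 = [("_:y", "<p>", "_:x"), ("_:x", "<p>", "<x>")]
# A graph with a different shape.
g3 = [("_:a", "<p>", "_:b"), ("_:a", "<p>", "<x>")]

print(digest(g1) == digest(g2))
print(digest(g1) == digest(g3))
```

Since the minimum is taken over all relabellings, the result depends 
only on the isomorphism class of the input, so what gets hashed (and 
would be signed) is still just a bitstring. Leaning the graph before 
this step would, under the same assumptions, give a form modulo simple 
semantics as described above.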

[3] Aidan Hogan. "Skolemising Blank Nodes while Preserving 
Isomorphism". In the Proceedings of the 24th International World Wide 
Web Conference (WWW), Florence, Italy, May 18–22, 2015.
http://aidanhogan.com/docs/skolems_blank_nodes_www.pdf

[4] Aidan Hogan. "Canonical Forms for Isomorphic and Equivalent RDF 
Graphs: Algorithms for Leaning and Labelling Blank Nodes". In ACM 
Transactions on the Web 11(4): 22:1-22:62, 2017.
http://aidanhogan.com/docs/rdf-canonicalisation.pdf

Best,
Aidan


Received on Monday, 31 May 2021 02:43:53 UTC