Re: Thoughts on the LDS WG chartering discussion from Peter F. Patel-Schneider on 2021-06-10 (semantic-web@w3.org from June 2021)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Thu, 10 Jun 2021 06:05:32 -0400
To: semantic-web@w3.org
Message-ID: <e89284b0-7988-6fc0-1e8d-b4d6cd392252@gmail.com>

On 6/10/21 3:40 AM, Ivan Herman wrote:

[...]

>
> But. If I "just" start by, say, a Turtle representation of a Graph, I can of 
> course convert that into canonical n-quads and hash the n-quads. But if the 
> same Turtle representation is converted by RDFLib (or any other tool) into, 
> God forbid, RDF/XML, the BNode identifiers will be different, ie, the 
> conversion of the RDF/XML to n-quads will be different and, consequently, 
> the hash will be different. *Unless the RDF canonicalization assigns the 
> canonical identifiers to the BNodes in the process.*

I really don't understand this point.  If I start with a Turtle document, just 
send the Turtle.  Well, except for the problem that deserializing Turtle 
documents doesn't always produce isomorphic graphs.  But the solution to this 
is easy: just use a format whose deserialization always produces isomorphic 
graphs.  Send that.  No canonicalization is necessary, as each deserialization 
will produce an isomorphic graph.  And the hash is computed on the document 
itself, so standard methods for verifiable transmission of documents can be 
used without 
modification.
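
To make "hash the document itself" concrete, here is a minimal sketch (an 
illustration only, not code from this thread).  The sender hashes the exact 
bytes of the serialized document, and the recipient recomputes the hash over 
the bytes received before deserializing them.  The names document_digest and 
verify_document, the choice of SHA-256, and the sample triples are all 
assumptions made for the example.

```python
import hashlib
import hmac


def document_digest(document: bytes) -> str:
    """Return the SHA-256 hex digest of the exact document bytes."""
    return hashlib.sha256(document).hexdigest()


def verify_document(document: bytes, expected_digest: str) -> bool:
    """Check received bytes against the digest supplied by the sender."""
    return hmac.compare_digest(document_digest(document), expected_digest)


# Hypothetical N-Triples document containing a blank node.
sent = (
    b'_:b0 <http://example.org/name> "Alice" .\n'
    b'_:b0 <http://example.org/knows> <http://example.org/bob> .\n'
)
digest = document_digest(sent)

# The recipient verifies the bytes first, then deserializes them.
received = sent
print(verify_document(received, digest))
```

Nothing here is RDF-specific: the blank node label _:b0 is just part of the 
bytes being hashed, so no relabeling step is involved.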

If the starting point is a document in some other format, have the sender 
convert it to the appropriate format using the environment that the sender 
considers appropriate and send the resulting document.  If the starting point 
is an actual RDF graph, serialize the graph in the appropriate format and send 
the resulting document.  In each case, because deserialization in the document 
format produces isomorphic graphs, the recipient will end up with a graph 
isomorphic to the graph that the sender wanted to send.

Which document format to use?  As far as I can tell, N-Triples (N-Quads) is 
the only document format where deserialization produces isomorphic RDF graphs 
(datasets).  Well, except for case normalization of language tags.
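
The language tag caveat can be illustrated with a small sketch (again an 
illustration under stated assumptions, not an established tool).  Two 
N-Triples documents that differ only in the case of a language tag hash 
differently as documents, even though an implementation that lowercases 
language tags would treat the literals as the same.  The helper 
lowercase_language_tags and its regular expression are hypothetical; the 
sketch assumes the input has no comment lines containing quotation marks, 
and that lowercasing is the normalization wanted.

```python
import hashlib
import re

# Matches an N-Triples string literal followed by a language tag.
# Quotation marks cannot occur inside IRIs or blank node labels, and
# literals cannot contain raw line breaks, so a match stays within one
# literal on one line.
_LANG_LITERAL = re.compile(
    r'("(?:[^"\\\r\n]|\\.)*")@([A-Za-z]+(?:-[A-Za-z0-9]+)*)'
)


def lowercase_language_tags(ntriples: str) -> str:
    """Lowercase every language tag, leaving literal text untouched."""
    return _LANG_LITERAL.sub(
        lambda m: f"{m.group(1)}@{m.group(2).lower()}", ntriples
    )


def digest(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoding of the document text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# Hypothetical documents that differ only in language tag case.
doc_a = '<http://example.org/s> <http://example.org/p> "chat"@fr-CA .\n'
doc_b = '<http://example.org/s> <http://example.org/p> "chat"@FR-ca .\n'

# Compare the raw documents, then the documents after lowercasing tags.
print(digest(doc_a) == digest(doc_b))
print(digest(lowercase_language_tags(doc_a))
      == digest(lowercase_language_tags(doc_b)))
```

If the sender applies such a step before hashing and sends the normalized 
document, the recipient still only has to hash the bytes received.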

> So I am not really sure I actually understand your problem: you cannot avoid 
> a canonical relabeling of the BNodes in the general case. That is what the 
> abstract RDF canonicalization does: define canonical BNode labels in a 
> serialization independent manner. In my view, that is absolutely necessary 
> in general.

But there is no need for this if all you are trying to do is verifiable 
transmission of isomorphic RDF graphs.

>
> Ivan
>
[...]

peter

Received on Thursday, 10 June 2021 10:07:28 UTC