Re: Chartering work has started for a Linked Data Signature Working Group @W3C from Melvin Carvalho on 2021-05-23 (semantic-web@w3.org from May 2021)

From: Melvin Carvalho <melvincarvalho@gmail.com>
Date: Sun, 23 May 2021 13:20:43 +0200
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: Ivan Herman <ivan@w3.org>, semantic-web <semantic-web@w3.org>
Message-ID: <CAKaEYhKj6pv6UjnZRDE3O_E+LTD4BpkA_mS8BzcHZQmK5xkWyQ@mail.gmail.com>
On Sat, 22 May 2021 at 14:37, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:

> Hi Ivan:
>
> As you should have suspected I have a very different take on this.
>
> Sure, any WG can take inputs and work on them.  But my, admittedly
> non-expert, view here is that the major input has significant flaws, and
> in computer security any flaw is fatal.  I've pointed out one but I think
> there are others.  (See below.)
>
> I am in favour of W3C providing some way of securely transmitting RDF
> graphs and datasets.  Of course there already is a way of doing this by
> simply treating the serialization of the graph or dataset as a text
> document and transmitting that document bundled with its signature, much
> the same way that emails are signed.  The goal is to do something better.
>
> My worry is that going through AC review with the proposed charter using
> Linked Data Proofs 1.0 as its major support will result in the working
> group being turned down because of flaws in Linked Data Proofs 1.0.
>
> I would greatly appreciate a discussion of the possible flaws in that
> document.  This discussion does not appear to be happening, which I find
> worrisome.
>

I'm with Peter on this one.

I very much welcome work on canonicalization and signatures of linked data;
it's something we've needed for years.

However, IMHO the referenced material is not quite ready to go to a WG
just yet.

Let's take the 4th deliverable:

Linked Data Security Vocabulary (LDSV)

This has a normative reference to:
https://w3c-ccg.github.io/security-vocab/

"The Security vocabulary is used to enable Internet-based applications to
encrypt, decrypt, and digitally sign information expressed as Linked Data.
It also provides vocabulary terms for the creation and management of a
decentralized Public Key Infrastructure via the Web"

Brilliant, so far, so good!

However, looking at the terms inside this vocabulary, the scope is bigger
than just encrypt, decrypt and sign.

For example, the normative vocab includes a term, ethereumAddress, that
looks shoe-horned in, without including, for example, bitcoinAddress.  This
is highly controversial.  Why include one and not the other?  Surely it is
better to include neither in a generic vocabulary.

This is the kind of thing that could blow up in the face of the W3C.
There are approximately 100 million users, with skin in the game, who might
see the W3C as picking favourites here.  Given the way social media works,
that could spark objections from a large ecosystem, including some of
the best and most cited cryptographers in the world.

I've raised an issue on this here:

https://github.com/w3c-ccg/security-vocab/issues/110

So, IMHO the normative references could benefit from a bit of work, to make
them less controversial, before going to a WG.


>
> peter
>
>
> Technical Details:
>
> I take the method to sign and verify RDF datasets to be as follows:
>
> sign(document, private key, identity)
>    let D be the RDF dataset serialized in document
>    let C be the canonicalized version of D
>    let S be triples representing a signature of C using private key
>    let signed document be document plus a serialization of S,
>      so signed document serializes D union (not merge) S
>    return signed document
>
> verify(signed document)
>    let D' be the RDF dataset serialized in signed document
>    let S be the signature in D'
>    let D be D' - S
>    let C be the canonicalized version of D
>    return whether S is a valid signature for C
>
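
To make the quoted procedure concrete, here is a rough Python sketch of how
I read it.  It is a toy under loud assumptions, not the Linked Data Proofs
algorithm: a dataset is a plain set of (s, p, o) string tuples with no blank
nodes, "canonicalization" is just sorting, the predicate ex:signatureValue
is made up, and HMAC-SHA256 with a shared secret stands in for a real
public-key signature.  The last two checks anticipate points 2/ and 3/ of
the list quoted just below.

```python
import hashlib
import hmac

# Hypothetical predicate marking signature triples; not taken from any spec.
SIG_PREDICATE = "ex:signatureValue"


def canonicalize(dataset):
    """Toy canonical form: sorted triple lines (assumes no blank nodes)."""
    lines = sorted(f"{s} {p} {o} ." for s, p, o in dataset)
    return "\n".join(lines).encode("utf-8")


def sign(dataset, key, identity):
    """Return D union S, where S is one signature triple over canonical D."""
    digest = hmac.new(key, canonicalize(dataset), hashlib.sha256).hexdigest()
    return set(dataset) | {(identity, SIG_PREDICATE, f'"{digest}"')}


def verify(signed, key):
    """Split D' into S and D = D' - S, then check S against canonical D."""
    sigs = {t for t in signed if t[1] == SIG_PREDICATE}
    data = set(signed) - sigs
    expected = hmac.new(key, canonicalize(data), hashlib.sha256).hexdigest()
    return any(hmac.compare_digest(o.strip('"'), expected) for _, _, o in sigs)


key = b"demo-key"  # stand-in secret; a real scheme would use a key pair
doc = {("ex:alice", "ex:knows", "ex:bob")}

signed = sign(doc, key, "ex:alice")
round_trip_ok = verify(signed, key)

# Point 3/: a triple that merely looks like a signature is stripped along
# with the real one, so adding it does not affect verification here.
padded = signed | {("ex:mallory", SIG_PREDICATE, '"bogus"')}
padded_ok = verify(padded, key)

# Point 2/: the input already carried a signature triple, so verification
# strips it too and recomputes the digest over a different dataset.
nested = sign(doc | {("ex:carol", SIG_PREDICATE, '"older"')}, key, "ex:alice")
nested_ok = verify(nested, key)

print(round_trip_ok, padded_ok, nested_ok)  # True True False
```

Whether a real implementation behaves this way depends on how it decides
which triples count as "the signature"; the toy only shows that the
question needs an answer.
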
> To my non-expert eye there are several significant problems here.
> 1/ The signature extracted from the signed document might be different
> from the signature used to sign the original document if the original
> document has signatures in it.
> 2/ The dataset extracted during verification might not be the dataset used
> during signing if the original document has signatures in it.
> 3/ Adding extra information after signing might be possible without
> affecting verification if the extra information looks like a signature.
> 4/ The dataset extracted during verification might not be the dataset used
> during signing because the original document has relative IRIs.
> 5/ The dataset extracted during verification might not be the dataset used
> during signing because the original document is in a serialization that
> uses external resources to generate the dataset (like @context in JSON-LD)
> and this external resource may have changed.
> 6/ Only the serialized dataset is signed, so changing comments in
> serializations that allow comments, or other parts of the document that do
> not encode triples or quads, can be done without affecting the validity of
> the signature.  This is particularly problematic for RDFa.
>
> I welcome discussion of these points and am open to being proven wrong on
> them.
>
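
Points 4/ and 6/ above are easy to illustrate with a small, self-contained
Python sketch.  Everything in it is an assumption made for illustration:
the parser is a toy that reads one triple of <IRI> terms per line and
treats lines starting with '#' as comments, and the example.org and
example.net base IRIs are invented.  It only shows that the dataset, not
the bytes of the document, is what a dataset-level signature would cover.

```python
from urllib.parse import urljoin


def parse(document, base):
    """Toy parser: one '<s> <p> <o> .' triple per line, '#' starts a comment.

    Every term is resolved against `base`, as a real parser would do for
    relative IRIs.
    """
    triples = set()
    for line in document.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        terms = line.rstrip(" .").split()
        triples.add(tuple(urljoin(base, term[1:-1]) for term in terms))
    return triples


doc = "# signed by Alice\n<me> <knows> <bob> .\n"

# Point 4/: the same bytes yield different datasets under different bases.
at_origin = parse(doc, "https://example.org/alice/")
at_mirror = parse(doc, "https://mirror.example.net/copy/")
same_dataset_after_move = at_origin == at_mirror

# Point 6/: editing a comment changes the bytes but not the dataset.
recommented = doc.replace("signed by Alice", "signed by Mallory")
same_dataset_after_edit = parse(recommented, "https://example.org/alice/") == at_origin

print(same_dataset_after_move, same_dataset_after_edit)  # False True
```
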
> On 5/22/21 6:43 AM, Ivan Herman wrote:
> > Peter,
> >
> > I agree that these are issues to handle/settle in a final specification.
> > And I'll let Manu reply to the specifics.
> >
> > However, I would expect these to be handled during the lifetime of the
> > Working Group, if it gets approved; after all, making sure of these
> > required quality checks is one of the strong points of the W3C Process.
> > The Linked Data Proof draft specification is not even the FPWD of the
> > WG's deliverables; it is just a referenced document.
> >
> > Thanks
> >
> > Ivan
> >
>
>
Received on Sunday, 23 May 2021 11:22:09 UTC