- From: Orie Steele <orie@transmute.industries>
- Date: Wed, 31 Mar 2021 10:02:36 -0500
- To: David Waite <dwaite@pingidentity.com>, "W3C Credentials CG (Public List)" <public-credentials@w3.org>
- Message-ID: <CAN8C-_KtZmyrXfNe2fSkBK_AXb+dB2_FByx3PvkrTWw_Yb8-_A@mail.gmail.com>
Sorry, I meant for this reply to David to go to the whole list. In case I come across as some JOSE hater, I am not... I actually worked very hard with Mike Jones and others to add JWK support to the DID Core Spec, and have also worked to make it easy to produce JWS/JWT and LD Proofs from the same verification material: https://w3c-ccg.github.io/lds-jws2020/

OS

> On Tue, Mar 30, 2021 at 11:49 PM David Waite <dwaite@pingidentity.com> wrote:
>
>> On Tue, Mar 30, 2021 at 9:43 AM Orie Steele <orie@transmute.industries> wrote:
>>
>>> Overall I agree with a lot of David's comments.
>>
>> <snip>
>>
>>> A couple observations....
>>>
>>> base64 in JOSE is a form of canonicalizing... because header and payload objects might have different orderings, but base64url encoding makes those orderings opaque... by inflating them 33%.
>>
>> Canonicalization means to convert multiple potential representations of equivalent data into a single representation. I would define what JOSE does as straight-up processing transforms. The url-safe base64 encoding protects the data from modification in transport.
>
> Agreed, inflating data 33% is clearly not canonicalization.
>
>> You can even turn the b64 encoding step off (RFC 7797) if your payload is already URL safe, or if you are doing detached signatures.
>>
>>> canonicalize in the LD Proof could be JCS, or simple sorting of JSON keys... or RDF Data Set Normalization... each would yield a different signature...
>>
>> Not just that - each would cover a different interpretation of the data. Your signature does not prevent abuse from equivalent forms.
>
> I suppose the same problem exists with JOSE, just with no standard for how to interpret fields that are not registered.
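> To make the ordering point concrete, here is a quick sketch (Node/TypeScript; it assumes the npm "canonicalize" package, which implements JCS / RFC 8785, and Node 16+ for the "base64url" Buffer encoding; the object contents are made up):
>
> import canonicalize from "canonicalize";
>
> const b64url = (s: string) => Buffer.from(s, "utf8").toString("base64url");
>
> // Two deeply-equal objects that differ only in key order.
> const a = { id: "did:example:123", type: ["VerifiableCredential"] };
> const b = { type: ["VerifiableCredential"], id: "did:example:123" };
>
> // JSON.stringify preserves insertion order, so the base64url JWS payloads
> // (and therefore any signatures over them) come out different...
> console.log(b64url(JSON.stringify(a)) === b64url(JSON.stringify(b))); // false
>
> // ...while canonicalizing first collapses both to the same string, so the
> // encoded payloads, hashes, and signatures match as well.
> console.log(canonicalize(a) === canonicalize(b)); // true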
>> If you are using LD-Proofs, you either need to process the resulting data _as RDF_ or have additional rules for processing to further lock down any abuses that might come from misinterpreting the RDF because you are looking at it through a manipulated set of JSON-LD lenses.
>
> Here you are asserting that somehow canonicalization destroys information; if that were true it would be a problem. If you can't tell whether some JSON is equivalent to some canonical form, that would also be a problem. Luckily both are achievable, with both JCS and RDF Dataset Canonicalization.
>
> I do agree that it's more work to think about canonical information representations than it is to inflate a payload 33% and make it url safe... it's also more useful for very large datasets.
>
>>> mechanically, the fact that JCS exists hints at the problem with JOSE... if you want to sign things, you want stable hashes, and therefore need SOME form of canonicalization for complex data structures.
>>>
>>> JOSE works very well for small id tokens, like the ones that are used in OIDC / OAuth... JOSE totally doesn't scale to signatures over large data sets without another tool.
>>
>> Sure, you are talking about reducing arbitrary subsets of a potentially modified document back to some chosen canonical form and then seeing if there was a pertinent modification. This is what XML DSig was made for :-)
>
> I don't think I am old enough to know what XML DSig is... sounds like it was traumatizing : )
>
> If your general point is that schema-based languages or types are bad, I would say that they increase friction and burden, and that pays off when the code base or problem space gets very large... again, consider a generic solution to strongly typed data in an open world model.
>
>> Turns out in a lot of use-cases, that subset is usually "a well defined block of data" and pertinent modifications are usually "any modification whatsoever". Crypto in that case is being used for send-and-receive, or archive-and-restore, and not for doing a verification as part of a larger dataset.
>>
>> When that isn't the case, you have a significantly harder task, such as what is currently in progress as HTTP Message Signatures.
>
> Agreed, HTTP Signatures require canonicalization of the HTTP Request Data Structures... because they are complex, and you want to make sure everyone is signing things the same way.
>
>>> "Detached JWS with Unencoded Payload":
>>>
>>> https://tools.ietf.org/html/rfc7515#appendix-F
>>> https://tools.ietf.org/html/rfc7797
>>>
>>> This is how the JWS for LD Proofs are generated, and the "Unencoded payload part" is the result of the canonicalization algorithm....
>>>
>>> What would happen if we just decided to use "Unencoded Payload" without canonicalization?... maybe we just use JSON.stringify?
>>
>> Intermediaries may do things like convert from LF to CRLF and back, so you would want to keep people treating the data as binary, and make the data behave as binary in transit. Exchange, IIRC, used to change the line encoding of *.txt files _inside ZIP archives_. CRLF is also now considered a grapheme, and will canonicalize down in some unicode tools as well.
>
> I'm not sure I follow fully, but if you are suggesting a binary format would be better, I agree. However, having worked with COSE a little, I can say that binary formats require a significant amount of up-front tooling to offer the same level of developer experience that JOSE has... despite its limitations, JOSE is fairly trivial to implement and to debug.
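> To illustrate how trivial the JOSE side is, here is roughly what the detached, unencoded-payload construction quoted above looks like. This is only a sketch: HS256, the key, and the placeholder payload are made up to keep it self-contained; the LD proof suites use asymmetric algorithms such as EdDSA, the real payload would be the output of the canonicalization step, and the "base64url" Buffer encoding needs Node 16+.
>
> import { createHmac } from "crypto";
>
> const b64url = (b: Buffer) => b.toString("base64url");
>
> // RFC 7797 protected header: the payload is NOT base64url-encoded, and "b64"
> // is listed in "crit" so verifiers that don't understand it reject the JWS.
> const header = { alg: "HS256", b64: false, crit: ["b64"] };
> const encodedHeader = b64url(Buffer.from(JSON.stringify(header)));
>
> // Stand-in for the canonicalized document bytes.
> const payload = Buffer.from("output-of-some-canonicalization-algorithm");
>
> // JWS Signing Input = ASCII(BASE64URL(protected header)) || '.' || payload,
> // with the payload left unencoded because b64 is false.
> const signingInput = Buffer.concat([Buffer.from(encodedHeader + "."), payload]);
> const signature = createHmac("sha256", "not-a-real-key").update(signingInput).digest();
>
> // Detached compact serialization: the payload travels (or is recomputed) out of band.
> const detachedJws = `${encodedHeader}..${b64url(signature)}`;
> console.log(detachedJws);
>
> A verifier rebuilds the same signing input from the protected header plus the payload it reconstructs on its side, which is exactly why those payload bytes have to be stable.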
>>> it still works!... sorta... now I can generate a new message and signature for every ordering of data in the payload... for a really complex and very large payload, that's going to be a LOT of deeply equal objects... that each yield a different signature... this can lead to storing a massive amount of redundant but indistinguishable data... which can lead to resource exhaustion attacks.
>>>
>>> In fact, the sidetree protocol uses JCS for this exact reason... https://identity.foundation/sidetree/spec/#default-parameters
>>
>> The attacker still has to send all of that redundant data - and they could always make it non-redundant by making any canonical change (including changing the string "José" to "José").
>
> Yes, defense in depth requires validating untrusted user input... IMO part of that is asking for canonical representations from users... here is another thread on the subject:
>
> https://github.com/matrix-org/matrix-doc/issues/1013
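> The kind of check I have in mind looks roughly like this (a sketch only; the recursive key sort is a toy stand-in for a real JCS implementation, and the function names are made up):
>
> import { createHash } from "crypto";
>
> // Toy canonical serializer: recursively sort object keys, then stringify.
> // A real deployment would use a JCS (RFC 8785) library instead of this.
> const sortKeys = (v: any): any =>
>   Array.isArray(v) ? v.map(sortKeys)
>   : v && typeof v === "object"
>     ? Object.fromEntries(Object.keys(v).sort().map(k => [k, sortKeys(v[k])]))
>     : v;
> const canonical = (v: any): string => JSON.stringify(sortKeys(v));
>
> // Accept input only if it already arrived in canonical form: parse it,
> // re-serialize canonically, and require a byte-for-byte match.
> function requireCanonicalJson(raw: string): unknown {
>   const parsed = JSON.parse(raw);
>   if (canonical(parsed) !== raw) {
>     throw new Error("payload is not in canonical form");
>   }
>   return parsed;
> }
>
> // Keying storage by the hash of the canonical bytes also collapses
> // re-ordered duplicates of the same object into a single entry.
> const cacheKey = (raw: string): string =>
>   createHash("sha256").update(canonical(JSON.parse(raw)), "utf8").digest("hex");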
>> So I would consider this more a cache optimization (still important) than an attack solution.
>
>>> So in summary, in any JOSE library you can replace JSON with JCS and get better signatures, and developers will thank you because they won't be tracking down bugs related to duplicate content... and canonicalization can also lead to security issues if not handled properly... regardless of how you canonicalize things.
>>
>> I'm not quite sure about the scenario of "bugs related to duplicate content" - if you are allowing repeated changes of data, filtering out non-canonical changes is an optimization. Your policy is still apparently to allow a ton of changes to data.
>
> Canonicalization helps detect content that can lead to bugs... similar to how types and schemas help with that... obviously use case matters here, but from a tooling perspective you can use schemas and canonicalization or you can decide not to... for some use cases, that decision will yield a lot of cost for your engineering team, for others it won't.
>
>> Since you would be using detached signatures, you would necessarily break the semantics of existing deployments and tools. You would have to define the semantics for how to transfer that new data since there are no JWS+JCS formats or best practices. And this would save no data over another JWS+detached JSON transmission format.
>
> https://tools.ietf.org/html/draft-jordan-jws-ct-02
>
> Regarding JSON over the wire, I agree the only thing that would make JSON over the wire worse would be base64url encoding it... assuming it was large JSON.
>
>> I particularly think developers in languages such as Rust, Go, and C would be less than excited about the opportunity to be the first to contribute a JCS implementation to their respective platforms. Even less so if they find out they need to build new JSON tooling for strict ECMAScript and I-JSON serialization and conformance.
>
> https://github.com/search?p=2&q=JSON+Canonicalization
>
> Looks like there is support in those languages and more... I suppose those languages are already used to being forced to support JSON in order to use JOSE.

--
*ORIE STEELE*
Chief Technical Officer
www.transmute.industries

<https://www.transmute.industries>
Received on Wednesday, 31 March 2021 15:04:05 UTC