- From: Eduardo C. <e.chongkan@gmail.com>
- Date: Mon, 16 Mar 2026 17:01:26 -0600
- To: bumblefudge von CASA <bumblefudge@learningproof.xyz>
- Cc: W3C Credentials CG <public-credentials@w3.org>
- Message-ID: <CAANnk0+bUinDYaV+G__Sv2y7LLHmhQ3JLzn_Jg24T2DeJxjMTA@mail.gmail.com>
Everyone,

To be clear, what I submitted was a vibe coded Assessment/Standardization TOOL.

-- I wasn't submitting an assessment of all 225 Methods. It was a DEMO.
-- I am not submitting the vibe coded code. It was also a DEMO/POC.
-- I am not including LLM generated workflows as part of the proposal. Everyone can use LLMs at their own discretion and all processes would remain the same, where everyone still owns quality and veracity.

Regards,
-- Eduardo Chongkan

On Sun, Mar 15, 2026 at 12:03 AM bumblefudge von CASA <bumblefudge@learningproof.xyz> wrote:

> Hey Eduardo:
>
> I'm with Manu on both points:
> 1.) Don't ship un-factchecked AI slop to this (or really any W3C) mailing list plz. Concision is a virtue.
> 2.) Don't take an LLM's word on what each DID method can do as of today's date! Pay a specialist researcher or volunteer at your local FOSS non-profit.
>
> A much more _manual_ process[^1] for evaluating DID methods has been undertaken at the Decentralized Identity Foundation[^2], building on older work (in particular the DID Traits[^3] evaluative framework and the modular Universal Resolver[^4] for testing and evaluation purposes). DID:webs has been in the hotseat twice, being asked hard questions about what's in prod and what's unstable, and I'm with Manu: you gave them an unfair shake.
>
> More constructively, though, if there are additional DID methods you would like to champion through an apples-to-apples tire-kicking and fact-checking process, we are always looking for volunteers to spin up the docker containers, run the tests locally and, most helpfully of all, PR in harder tests! You'd be surprised how many `true==true?` tests Claude sneaks in there when you're not double-checking...
>
> Thanks,
> __bumblefudge
> DIF/IPFS Fdtn
>
> [^1]: https://github.com/decentralized-identity/did-methods/tree/main/dif-recommended
> [^2]: https://identity.foundation/
> [^3]: https://identity.foundation/did-traits/#abstract
> [^4]: https://github.com/decentralized-identity/universal-resolver/
>
> ---
> bumblefudge
> janitor @Chain Agnostic Standards Alliance <https://github.com/chainagnostic/CASA>
> contractable via learningProof UG <https://learningproof.xyz>
> mostly berlin-based
>
> Sent from Proton Mail <https://proton.me/mail/home> for Android.
>
> -------- Original Message --------
> On Sunday, 03/15/26 at 03:28 Eduardo C. <e.chongkan@gmail.com> wrote:
>
> Hi Manu,
>
> Thanks to you for the feedback and the opportunity.
>
> About the specific errors you pointed out: missing data in a spec was interpreted as "Not Met", which is caused by the specs not meeting a standard in terms of the data they contain that we used for the Matrix.
>
> Please note the assessments you see, and the demo itself, *were indeed vibe coded* and the intention was to show the POC. I was expecting comments on the data and, if approved, I was also expecting that each method owner would submit an updated JSON, having evaluated their spec against the tool's expected JSON. The form was initially thought of as a Newcomers Onboarding Aid, not to be used as a de-facto assessing tool for the Catalogue or the Rubric acceptance criteria, but to help devs get an idea of how everything works, how to standardize, and to have visibility of the other methods and claims: to see in an easier way what is missing for interop between 2+ methods, how other methods cover the requirements, and so on. Not as a grading tool, but a visibility one.
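>
> As a rough illustration only (the field names below are hypothetical, not the tool's actual schema), a per-method submission could be a small JSON file along these lines, which the tool would then render as-is instead of inferring values from the spec text:
>
>     {
>       "_note": "illustrative sketch only; field names are hypothetical",
>       "method": "did:example",
>       "specUrl": "https://example.org/did-example-spec",
>       "registryType": "web",
>       "lifecycle": "active",
>       "features": {
>         "keyRotation": "supported",
>         "multiSig": "not-supported",
>         "didcomm": "unknown"
>       },
>       "requirements": {
>         "R6": "met",
>         "R8": "partial",
>         "R11": "not-applicable"
>       }
>     }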
>
> Then:
>
> *1. Assessment Criteria — "What does 'met' mean?"*
>
> The 22 requirements exist, but there's no shared rubric for evaluating them. For example:
>
> - R6 (Key Rotation): Does "supported" mean the spec mentions it, the spec defines a protocol, or there's a working implementation? did:webvh has a full cryptographic log — that's clearly beyond "met." did:key is generative (no rotation by design) — is that a fail or N/A?
> - R11 (Cannot Be Administratively Denied): did:web depends on a domain — does that fail R11? Or is it "partial" because the controller owns the domain?
> - R8 (Privacy Preserving): What bar? Any DID on a public chain is correlatable. Is privacy about the DID itself or about credential presentation?
>
> *Need from W3C: A shared rubric per requirement — what constitutes met, partial, not applicable, and not met. Without this, any assessment is subjective.* (A rough sketch of one possible rubric entry is near the end of this message.)
>
> *2. Maturity Model — Levels and Signals -- stays, goes, or changes name?* The idea here was to give teams visibility into the overall method development and implementation process.
>
> Our current L0-L4 model uses signals like "has a spec," "has implementations," "has tests." But:
>
> - What signals actually matter? GitHub stars? Number of implementations? Spec completeness? Community size? Deployment in production?
> - Is maturity linear (L0→L4) or multi-dimensional (spec maturity vs. ecosystem maturity vs. security maturity)?
> - Should "Legacy" be a maturity level or a separate lifecycle status? A method can be mature AND legacy (like did:sov).
>
> *Need from W3C: Agreement on maturity dimensions and what signals map to which levels. Or — scrap prescriptive levels entirely and show raw signals, letting readers draw their own conclusions.*
>
> *3. Taxonomy — Categories and Lifecycle Status*
>
> - Registry types: "blockchain", "web", "peer", "ledger" — are these the right buckets? did:webvh is web-based but has a cryptographic log — is it "web" or something new? Or both?
> - Lifecycle: "Active", "Legacy", "Experimental", "Deprecated" — who decides? The method community? The WG? Observable signals?
> - Feature taxonomy: The 14 features we track (CRUD, key rotation, multi-sig, DIDComm, etc.) — are these the right ones? Are we missing capabilities the WG considers important?
>
> *Need from W3C: A blessed taxonomy for categories, lifecycle statuses, and the feature set to evaluate.*
>
> *4. Data Authority — Who Submits, Who Validates?*
>
> - *Self-assessment:* Method communities submit their own claims. Fast, scales, but fox-guarding-the-henhouse risk.
> - *Peer review:* Claims require validation by at least one independent party. Slower but more credible.
> - *WG-verified:* The DID WG blesses the data. Most authoritative but doesn't scale.
> - *Hybrid:* Self-assessment with a "verified" badge for WG-reviewed claims. Unverified claims shown with a disclaimer.
>
> *Need from W3C: Which model, and what the review process looks like in practice.*
>
> *5. Scope — What Should v1 Cover?*
>
> The current tool tries to do everything: grades, use case mapping, maturity, overlap analysis, gap detection, self-assessment. The feedback suggests starting narrower:
>
> - *Option A:* Feature matrix only *(no grades)*. Show what each method claims to support, let readers compare.
> - *Option B:* Feature matrix + community-sourced maturity signals. No letter grades, just data.
> - *Option C:* Full analysis but only for WG-verified methods. Everything else shows "Not Assessed."
>
> *Need from W3C: What's the right scope for something they'd co-publish vs. what stays as an independent community tool? I think maybe A, then B later on.*
>
> *6. Use Case to Requirement Mapping*
>
> I mapped 22 requirements to 18 W3C use cases. But:
>
> - The mapping itself is interpretive. Does "Prescriptions" really require R15 (Cryptographic Future-Proof)? Does "Digital Executor" require R20 (Registry Agnostic)?
> - Should the mapping be normative (WG-blessed) or informative (editorial opinion)?
>
> *Need from W3C: Review and sign-off on the use case → requirement mapping, or agreement that it's informative-only.*
>
> Finally, should I split these 6 decision threads into 6 Issues to discuss them separately on the repo, under one parent issue? I would create those once the WG agrees.
>
> In general:
>
> 1. Where should it reside? Same repo?
> 2. What features do we keep, or add?
> 3. I personally prefer using Tailwind + Vue.js for my projects; can we use that? The current version was plain HTML, vibe coded with no structural guidance.
> 4. I would include a .codex.md file so others can tell their LLM to refer to it when working on the repo.
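>
> Also, to make decision thread 1 a bit more concrete before splitting issues: below is a rough sketch of what a single per-requirement rubric entry could look like. The wording and status names are placeholders for discussion only, not a proposal.
>
>     {
>       "_note": "illustrative sketch only; status definitions are placeholders",
>       "requirement": "R6",
>       "name": "Key Rotation",
>       "met": "spec defines a rotation protocol and at least one implementation exercises it",
>       "partial": "spec mentions rotation but does not define how it works",
>       "notApplicable": "rotation is excluded by design (e.g. generative methods like did:key)",
>       "notMet": "spec is silent on rotation and none is possible"
>     }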
>
> Regards,
>
> Eduardo Chongkan
>
> On Sat, Mar 14, 2026 at 11:04 AM Manu Sporny <msporny@digitalbazaar.com> wrote:
>
>> On Sat, Mar 14, 2026 at 1:19 AM Eduardo C. <e.chongkan@gmail.com> wrote:
>> > There are more screenshots here: https://github.com/w3c/did-extensions/pull/677,
>> > You can check the tool here: https://chongkan.github.io/did-extensions/explorer.html, it was fed from the repo as it was, and the idea is that all new methods use the new JSON so that the tool can stay up to date and run during CI -- (if it gets approved)
>>
>> Hey Eduardo, I really like your initiative and the concept behind the tooling! Thank you for putting an example together so we can get an idea of the changes/tooling you'd like to see in the ecosystem. I agree that tools like these are going to be helpful to provide to the larger ecosystem.
>>
>> All that said, I think some of what you have done is dangerous and crosses "no-go" lines that the DID WG established a while ago. You might not be aware of this, but the DID WG (and the CCG) have been trying to establish a useful set of information to expose to people that are interested in DIDs. We haven't done some of the things you did in your PR because they were identified as things that DID Method authors would not like -- that is, there are places where your tool is making judgement calls on DID Methods that it could not possibly have reviewed, and is publishing information that is grossly inaccurate.
>>
>> For example, your tool has given a rating of "F" to did:webs, did:btcr, did:btco, and 44+ other DID Methods on features that I know many of them have. IOW, the tool is giving "F" grades to DID Methods for not having features that they definitely have.
>>
>> The tool has marked did:webvh as a Legacy DID Method at L0 maturity (the lowest maturity)... even though it is highly mature and that community is one of the most active DID Method communities right now.
>>
>> The tool has marked did:key as failing to meet every DID Use Case requirement (when it passes most of them).
>>
>> The tool suggests a maturity path for DIDs which is inaccurate... and so on.
>>
>> Are you interested in working with the DID Working Group to establish the things that we think we can safely publish, and go from there? Your vision is a good one (as far as educating the masses and letting them filter/pick); however, it feels like you might have vibe-coded this thing together and the LLM made some really questionable choices.
>>
>> So, +1 to the general direction, but there are details here that matter, and we might want to start with something that is more focused and less controversial. What do you think, Eduardo?
>>
>> -- manu
>>
>> --
>> Manu Sporny - https://www.linkedin.com/in/manusporny/
>> Founder/CEO - Digital Bazaar, Inc.
>> https://www.digitalbazaar.com/
Received on Monday, 16 March 2026 23:01:44 UTC