Re: FW: VC-if-eye a plain old .JPG file [was: Binding credentials to publicly accessible repositories] from Bob Wyman on 2026-04-01 (public-credentials@w3.org from April 2026)

From: Bob Wyman <bob@wyman.us>
Date: Wed, 1 Apr 2026 15:52:02 -0400
To: Daniel Hardman <daniel.hardman@gmail.com>
Cc: "Michael Herman (Trusted Digital Web)" <mwherman@parallelspace.net>, "public-credentials (public-credentials@w3.org)" <public-credentials@w3.org>
Message-ID: <CAA1s49WLo=qP0JNjDHks7Sw-y2_EiZhHs2E2AGJDz1zMaa_PQQ@mail.gmail.com>
Daniel,
Thank you for the pointer to BES
<https://dhh1128.github.io/papers/bes.html> and CFA
<https://dhh1128.github.io/cfa/>. I had not seen these and they are well
worth reading. The bytewise SAID algorithm is an elegant solution to a real
problem, and the observation about Dublin Core interoperability (§3.3
<https://dhh1128.github.io/papers/bes.html#:~:text=3.3%20Inserting%20the%20SAID>)
is a genuine bonus.

I think you are right on both your main points for the general case, but my
use case departs from it in interesting ways.

On issuer-holder-verifier fit:
I agree that generic digital files are a poor fit. The use case I am
pursuing is more specific: code provenance at the level of individual
functions, where there is genuine holder standing. The LLM or developer
that generated the function has standing as a holder. They made a specific
claim about the function's derivation. The CI pipeline or human reviewer
has standing as a verifier. They make an enforcement decision based on that
claim. The issuer-holder-verifier model fits this specific use case better
than the generic document case, even if it fits poorly for a DICOM image or
a CAD drawing.

On hashing as the right primitive for tamper evidence:
I agree fully. The ni: URI in the credentialSubject.id is the hash-based
tamper evidence layer. The VC is not doing tamper detection; the AST
content hash is. They serve distinct purposes: the hash binds the
credential to a specific implementation; the VC attests the provenance of
that implementation.
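
As a minimal sketch of the hash layer described above, the following builds an RFC 6920 ni: URI from a byte string using SHA-256. The function name `ni_uri` is hypothetical, and which bytes get hashed (here just a literal string) is an assumption; the canonical form of a function is a separate question.

```python
import base64
import hashlib


def ni_uri(content: bytes) -> str:
    """Build an RFC 6920 ni: URI (empty authority) over `content` using SHA-256."""
    digest = hashlib.sha256(content).digest()
    # RFC 6920 encodes the digest as base64url with the "=" padding removed.
    value = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"ni:///sha-256;{value}"


# "Hello World!" is the input used in the RFC 6920 examples.
print(ni_uri(b"Hello World!"))
```

A URI built this way would be the value placed in credentialSubject.id; any change to the hashed bytes yields a different URI, which is what makes the binding tamper-evident.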

On granularity, where BES and this use case diverge:
BES is designed for file-level identity, and by design it treats file
internals as opaque. Your §3.5
<https://dhh1128.github.io/papers/bes.html#:~:text=3.5%20Multiple%20SAID%20references%20in%20a%20single%20file>
states this explicitly: "Neither the bytewise algorithm nor the
externalized algorithm can deal with the concept of nesting, since the
internal structure of the file MUST be treated as opaque."

The source code use case requires sub-file granularity. A single Python
module may contain dozens of functions with different provenance
categories. Some will be verified against a specification, some synthesized
from a named pattern, some WAGs (LLM-generated Wild-Ass-Guesses). A
file-level SAID certifies the file as a whole; it says nothing about
whether any individual function within it is a WAG. That distinction is
precisely what matters for the enforcement question: "no WAGs in shipping
code" is a function-level policy, not a file-level policy.
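
To illustrate why the policy is function-level, here is a minimal sketch of such a gate. The record shape, the category labels ("verified", "pattern", "wag"), and all names are hypothetical; nothing in this thread defines them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionProvenance:
    """Hypothetical per-function provenance record."""
    content_id: str  # e.g. a content hash identifying one function
    name: str        # human-readable label only, not the identifier
    category: str    # hypothetical labels: "verified", "pattern", "wag"


def wag_violations(records: list[FunctionProvenance]) -> list[str]:
    """Return names of functions that would violate "no WAGs in shipping code"."""
    return [r.name for r in records if r.category == "wag"]


module = [
    FunctionProvenance("hash-a", "parse_header", "verified"),
    FunctionProvenance("hash-b", "retry_with_backoff", "pattern"),
    FunctionProvenance("hash-c", "guess_encoding", "wag"),
]

# A single file-level identifier would cover all three functions at once;
# only per-function records let a CI gate single out the WAG.
print(wag_violations(module))
```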

The source file is simultaneously a unit (a deployable artifact) and a
container (of individually provenance-tagged computational contracts). BES
handles the unit case well. The container case requires a different
mechanism that can identify individual functions within the file by their
semantic content rather than their location or the file's entire byte
stream.

The AST content hash is the right primitive for sub-file granularity for
the same reason BES uses a hash at file level: it is content-addressed,
stable under cosmetic changes (whitespace, comments), and sensitive to
semantic changes (logic, structure). The question of what URI scheme to use
for a function-level content hash remains open, and I am curious whether
KERI's SAID encoding would be preferable to ni: (RFC 6920
<https://www.rfc-editor.org/rfc/rfc6920.html>) for this purpose. Both are
content-addressed URI schemes; SAID has the self-describing digest code
property that ni: lacks.
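
A minimal sketch of an AST content hash for a single Python function, using only the standard library. The function name `function_ast_digest` is hypothetical, and two choices here are assumptions rather than anything settled in this thread: the docstring is dropped before hashing (so a provenance note stored there does not change the hash), and `ast.dump` output is used as the canonical form.

```python
import ast
import hashlib


def function_ast_digest(source: str, name: str) -> str:
    """Return a SHA-256 hex digest over the AST of the function called `name`."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            break
    else:
        raise KeyError(f"no function named {name!r}")
    # Assumption: exclude the docstring so provenance text kept there is not hashed.
    if ast.get_docstring(node, clean=False) is not None:
        node.body = node.body[1:] or [ast.Pass()]
    # Line and column attributes are omitted; comments and whitespace never reach the AST.
    canonical = ast.dump(node, include_attributes=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


original = "def add_one(x):\n    return x + 1\n"
cosmetic = 'def add_one( x ):\n    "Adds one."\n    # a comment\n    return x+1\n'
semantic = "def add_one(x):\n    return x + 2\n"

# Compare the original against a cosmetic variant and a semantic variant.
print(function_ast_digest(original, "add_one") == function_ast_digest(cosmetic, "add_one"))
print(function_ast_digest(original, "add_one") == function_ast_digest(semantic, "add_one"))
```

Two caveats on this sketch: the function name is part of the dumped node, so a rename changes the digest, and the text produced by `ast.dump` can differ between Python versions, so a real scheme would need to pin or specify the normalization.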

On Michael's point about location-based DIDs:
I appreciate the auto-resolvability insight. A verifier who can follow the
URI to the source is useful. But location-based identifiers are unstable
under refactoring (line numbers, method names, and repository paths all
change) and stable under semantic changes (if you change the logic but keep
the name, the ID doesn't change). This inverts the stability properties we
need. A hybrid seems right: content hash as the primary stable identifier,
resolvable location URI as a secondary sameAs link.
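
A minimal sketch of that hybrid as a credential subject. The `id` and `sameAs` properties come from the discussion above; the helper names and the example values are hypothetical.

```python
import json


def hybrid_subject(content_uri: str, location_uri: str) -> dict:
    """Credential subject keyed by content hash, with a resolvable location hint."""
    return {
        "id": content_uri,       # primary identifier: tracks the hashed content
        "sameAs": location_uri,  # secondary: convenient to resolve, may go stale
    }


def relocate(subject: dict, new_location: str) -> dict:
    """After a refactor, only the location hint is updated; the id is untouched."""
    return {**subject, "sameAs": new_location}


subject = hybrid_subject(
    "ni:///sha-256;f4OxZX_x_FO5LcGBSKHWXfwtSx-j1ncoSt3SABJtkGk",  # placeholder digest
    "https://example.com/repo/blob/main/pkg/module.py#L10-L24",   # hypothetical location
)
moved = relocate(subject, "https://example.com/repo/blob/main/pkg/util.py#L3-L17")
print(json.dumps(moved, indent=2))
```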

bob wyman


On Wed, Apr 1, 2026 at 2:14 PM Daniel Hardman <daniel.hardman@gmail.com>
wrote:

> It seems to me that the mechanics of how to do this can be worked out, but
> that there may be some conceptual mismatches that are more interesting.
>
> In the generic case, a digital file with arbitrary content (a PDF, a
> spreadsheet, a DICOM xray/ultrasound image, a genetic testing report, a CAD
> drawing, a 3D printer recipe) doesn't feel to me like it fits the
> issuer-holder-verifier model very well. In many cases, a document creator
> could play the role of issuer. But if there are multiple creators? The
> purpose of verifiable assertions can be very broad, but for *credentials*
> specifically, the reason a holder role exists is because the holder is
> intended to have some special standing with respect to the credential
> (typically, the recipient of extra trust because of what the credential
> says, and who said it). Generically, digital files don't have that purpose,
> although there are certainly specific kinds of digital files in specific
> use cases that do.
>
> If we're trying to assert authorship/ownership of content, I think that
> the content authenticity work addresses it, at least for some doc types.
>
> If we're trying to achieve tamper evidence, it seems to me that hashing is
> the right primitive, not adapting the doc to the VC credential model.
> Although chains can be constructed to link VCs to one another, chaining is
> not a VC feature; it's a non-standard extension. The SAID mechanism in KERI
> is an elegant approach that has numerous things to recommend it (Including
> incredibly powerful and nuanced chaining), and it can be adapted from JSON
> to almost any arbitrary document or file type. This allows all the doc
> types I listed above, and many others, to participate in verifiable data
> graphs, with no adaptation of the AST and with only a single metadata
> field. See https://dhh1128.github.io/papers/bes.html. You can even stitch
> together collections of files and make the collections self-describing and
> tamper-evident and coherent+verifiable as a unit. See
> https://dhh1128.github.io/cfa.
>
> On Wed, Apr 1, 2026 at 9:08 AM Michael Herman (Trusted Digital Web) <
> mwherman@parallelspace.net> wrote:
>
>> Bob, does the VC data need to become part of the AST or is it sufficient
>> for the VC data to appear as a comment block (e.g. at the end of the code
>> file)?
>>
>> ------------------------------
>> *From:* Bob Wyman <bob@wyman.us>
>> *Sent:* Tuesday, March 31, 2026 11:27:36 PM
>> *To:* Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>
>> *Cc:* public-credentials (public-credentials@w3.org) <
>> public-credentials@w3.org>
>> *Subject:* Re: FW: VC-if-eye a plain old .JPG file [was: Binding
>> credentials to publicly accessible repositories]
>>
>>
>> Michael,
>> Thank you — useful prior art, and it confirms that embedding VCs in
>> non-document artifacts is tractable. The XMP/EXIF approach works well for
>> file-level attestation of binary media.
>>
>> The source code case has two properties that make it worth asking the
>> question more carefully:
>>
>>
>>    - First, the granularity is sub-file. The provenance claim attaches
>>    to a specific function, not to the file as a whole. A file may contain
>>    dozens of functions with different provenance categories — some verified,
>>    some pattern-derived, some WAGs. File-level attestation doesn't help here.
>>    - Second, source code undergoes transformations that binary media
>>    does not — reformatting, refactoring, minification, transpilation. A
>>    metadata field stripped by a formatter is silently lost. The docstring
>>    survives most of these transformations because it is semantically part of
>>    the code. An external or header-based embedding may not.
>>
>> So the question I'm really asking is not "can a VC be embedded in a
>> source file" — clearly it can, your 2021 example shows one approach and
>> mine shows another. The question is: given the sub-file granularity
>> requirement and the transformation-survival requirement, is the docstring
>> Provenance section the right mechanism? Are there alternatives that handle
>> these requirements better? And are there fragilities in my approach that I
>> haven't considered? The AST normalization question is the obvious one.
>> While Python has a single reference parser with a stable AST grammar, the
>> same approach faces significant complications in other languages:
>> JavaScript has multiple competing parsers; C++ preprocessing happens
>> before parsing, so semantically identical functions may have different
>> ASTs depending on macro expansion; Erlang's pattern matching and guard
>> syntax have no straightforward cross-language AST equivalent. A
>> language-agnostic solution to the subject identification problem may need a
>> different approach entirely, or a per-language normalization specification
>> as part of the content type convention.
>>
>> bob wyman
>>
>> On Tue, Mar 31, 2026 at 11:39 PM Michael Herman (Trusted Digital Web) <
>> mwherman@parallelspace.net> wrote:
>>
>> Here’s something from 2021…what do you see as the challenge with embedding
>> a VC in any document? E.g. code, Word doc, XML purchase order, a photo?
>>
>>
>>
>> *From:* Michael Herman (Trusted Digital Web)
>> *Sent:* Sunday, August 8, 2021 11:48 PM
>> *To:* Leonard Rosenthol <lrosenth@adobe.com>; public-credentials@w3.org
>> *Subject:* VC-if-eye a plain old .JPG file [was: Binding credentials to
>> publicly accessible repositories]
>>
>>
>>
>> RE: There is no mechanism in XMP nor in most standard asset formats for
>> establishing a model for tamper evidence, such as Digital Signatures,
>> (H)MAC, etc
>>
>>
>>
>> Leonard here’s a counterexample.
>>
>>
>>
>> I’ve applied the principles and data model for Structured Credentials (
>> https://www.youtube.com/watch?v=FFv4WZ0p3aY&list=PLU-rWqHm5p45dzXF2LJZjuNVJrOUR6DaD&index=1)
>> to VC-if-eye a plain old .JPG file (a photo I took with my Pixel 4a phone).
>>
>>
>>
>>    - Test1_original.jpg is the original, unmodified test copy of the
>>    photo
>>    - Test1_ps1.txt is a script that uses exiftool.exe to add the
>>    Structured Credential structures to the original test copy of the photo
>>    …including the proof elements stored in the EnvelopeSeal structure.  They
>>    are stored as elements in the XMP Keyword Info property of the photo.
>>    - Test1_xmp.png is a screen shot of the Structured Credential
>>    structures embedded into the XMP Keyword Info properties of the photo.
>>
>>
>>
>> We now have a mechanism for a .JPG file (or any XMP compatible media
>> file) to serve as both a photo and a Verifiable Credential.  We are now
>> able to VC-if-eye any XMP compatible media file.
>>
>>
>>
>> Any holes?
>>
>>
>>
>> Best regards,
>>
>> Michael Herman
>>
>> Far Left Self-Sovereignist
>>
>>
>>
>> Self-Sovereign Blockchain Architect
>>
>> Trusted Digital Web
>>
>> Hyperonomy Digital Identity Lab
>>
>> Parallelspace Corporation
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Leonard Rosenthol <lrosenth@adobe.com>
>> *Sent:* August 3, 2021 3:02 PM
>> *To:* Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>;
>> public-credentials@w3.org
>> *Subject:* Re: Binding credentials to publicly accessible repositories
>>
>>
>>
>> There is no mechanism in XMP nor in most standard asset formats for
>> establishing a model for tamper evidence, such as Digital Signatures,
>> (H)MAC, etc.
>>
>>
>>
>> Leonard
>>
>>
>>
>> *From: *Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>
>> *Date: *Tuesday, August 3, 2021 at 2:49 PM
>> *To: *public-credentials@w3.org <public-credentials@w3.org>, Leonard
>> Rosenthol <lrosenth@adobe.com>
>> *Subject: *Re: Binding credentials to publicly accessible repositories
>>
>> Leonard, how do you define "native tamper-evident system"?
>>
>>
>>
>> ------------------------------
>>
>> *From:* Leonard Rosenthol <lrosenth@adobe.com>
>> *Sent:* Tuesday, August 3, 2021 10:53:47 AM
>> *To:* Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>;
>> public-credentials@w3.org <public-credentials@w3.org>
>> *Subject:* Re: Binding credentials to publicly accessible repositories
>>
>>
>>
>> Michael, thanks for the reference to XMP…but you are probably not aware
>> that I am the chair of ISO TC 171/SC 2/WG 12 where XMP is standardized
>> *and* the project leader for XMP itself. (Oh, and I am also the XMP
>> Architect internally to Adobe 😉 ).
>>
>>
>>
>> So yes, leveraging existing open standards such as XMP is indeed a key to
>> delivering on the promises mentioned below – but it can’t be the only
>> solution due to it being a text-based serialization (thus not lending
>> itself well to binary data structures) and not having a native
>> tamper-evident system.  Additionally, while it is supported by most common
>> asset formats, it is not supported by all.
>>
>>
>>
>> Leonard
>>
>>
>>
>> *From: *Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>
>> *Date: *Tuesday, August 3, 2021 at 10:43 AM
>> *To: *Leonard Rosenthol <lrosenth@adobe.com>, public-credentials@w3.org <
>> public-credentials@w3.org>
>> *Subject: *Re: Binding credentials to publicly accessible repositories
>>
>> Check out https://en.wikipedia.org/wiki/Extensible_Metadata_Platform
>>
>>
>>
>> And here's a data model to consider for use in a custom XMP profile:
>> https://youtu.be/FFv4WZ0p3aY
>>
>>
>>
>> ------------------------------
>>
>> *From:* Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>
>> *Sent:* Friday, July 30, 2021 1:25:18 PM
>> *To:* Leonard Rosenthol <lrosenth@adobe.com>; public-credentials@w3.org <
>> public-credentials@w3.org>
>> *Subject:* RE: Binding credentials to publicly accessible repositories
>>
>>
>>
>> So an alternate strategy, to avoid embedding an actual VC or otherwise trying
>> to attach a VC to an asset, is to use the metadata capabilities of each of
>> these formats to store the credential id, @context, vc type list,
>> credentialSubject id, the individual claims (name-value pairs), and the
>> proof elements
>>
>>
>>
>> …vc-if-eye each format using each format’s native metadata capabilities.
>>
>>
>>
>> *From:* Leonard Rosenthol <lrosenth@adobe.com>
>> *Sent:* July 30, 2021 1:03 PM
>> *To:* Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>;
>> public-credentials@w3.org
>> *Subject:* Re: Binding credentials to publicly accessible repositories
>>
>>
>>
>> Michael – not sure you understand the scenario here.
>>
>>
>>
>> We aren’t building a specific system/solution for our own needs and those
>> of our customers – we are developing an open standard that associates
>> provenance with existing assets (eg. JPEG, PNG, MP4, PDF, etc.).  Since
>> those are the formats that are recognized by systems (and regulatory
>> solutions) today, it would make no sense to start wrapping them in some
>> other format (be it JSON, XML, or whatever).  JPEG files (for example) need
>> to work everywhere they do today – BUT contain tamper-evident provenance.
>>
>>
>>
>> Leonard
>>
>>
>>
>> *From: *Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>
>> *Date: *Friday, July 30, 2021 at 2:46 PM
>> *To: *Leonard Rosenthol <lrosenth@adobe.com>, public-credentials@w3.org <
>> public-credentials@w3.org>
>> *Subject: *RE: Binding credentials to publicly accessible repositories
>>
>> It’s a SMOP (small matter of programming).  Once upon a time, browsers
>> weren’t capable of displaying a lot of different kinds of resources (e.g.
>> XML).
>>
>>
>>
>> Why not render your VCs as XML?
>>
>> …or consider using server-side rendering?
>>
>> …or write an in-browser renderer using WASM?
>>
>>
>>
>> *“The difficult we can do, the impossible takes us a little bit longer…” *
>> 😊
>>
>>
>>
>> *From:* Leonard Rosenthol <lrosenth@adobe.com>
>> *Sent:* July 30, 2021 12:35 PM
>> *To:* Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>;
>> public-credentials@w3.org
>> *Subject:* Re: Binding credentials to publicly accessible repositories
>>
>>
>>
>> Given that putting a “.vc” file on a website or in a Twitter feed or
>> YouTube channel isn’t going to have it properly displayed – that’s not an
>> option, unfortunately, Michael.
>>
>>
>>
>> Leonard
>>
>>
>>
>> *From: *Michael Herman (Trusted Digital Web) <mwherman@parallelspace.net>
>> *Date: *Friday, July 30, 2021 at 1:05 PM
>> *To: *Leonard Rosenthol <lrosenth@adobe.com>, public-credentials@w3.org <
>> public-credentials@w3.org>
>> *Subject: *RE: Binding credentials to publicly accessible repositories
>>
>> I suggest storing the “original version” of the artwork as a claim within
>> a signed credential …the credential wraps the artwork like a container or a
>> “frame”.
>>
>>
>>
>> I believe this is much better than trying to attach a credential to the
>> artwork.
>>
>>
>>
>> Best regards,
>>
>> Michael Herman
>>
>> Far Left Self-Sovereignist
>>
>>
>>
>> Self-Sovereign Blockchain Architect
>>
>> Trusted Digital Web
>>
>> Hyperonomy Digital Identity Lab
>>
>> Parallelspace Corporation
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *From:* Leonard Rosenthol <lrosenth@adobe.com>
>> *Sent:* July 30, 2021 10:31 AM
>> *To:* public-credentials@w3.org
>> *Subject:* Binding credentials to publicly accessible repositories
>>
>>
>>
>> I realize that I might be out on the bleeding edge a bit, though not
>> completely as I think it is very similar to what OpenBadges will face as
>> they move to VC’s…
>>
>>
>>
>> In the Trust Model section of the VC Data Model spec, it states that one
>> of the aspects of that model is:
>>
>> The holder trusts the repository to store credentials securely, to not
>> release them to anyone other than the holder, and to not corrupt or lose
>> them while they are in its care.
>>
>> This is certainly true when the repository in question is something like
>> a wallet that is designed to be kept private or local (not shared).  But
>> what happens when the repository is designed to be out in the public… such
>> as an image or PDF with the VC embedded?
>>
>>
>>
>>
>>
>> As part of the C2PA’s (https://c2pa.org) work on establishing provenance
>> for digital assets, we will be using VC’s as a way for persons and
>> organizations to establish their relationships to the asset.  Specifically
>> in this instance, we’re extending schema.org’s Person and Organization
>> schemas, as used by their CreativeWork schema, to support referencing a
>> VC.  This then allows the author or publisher (or any of the other roles in
>> CW) to provide their credentials in that role, which (a) adds useful trust
>> signal(s) to the end consumer and (b) helps establish reputation.
>>
>>
>>
>> These VC’s (etc.) will be embedded into the assets (e.g., video, images,
>> documents, etc.) in a tamper-evident manner, so that in addition to the
>> individual VC’s “proof”, any attempt to change the CreativeWork
>> relationships, etc. can also be detected.   This all works great.
>>
>>
>>
>> However, in doing some threat modelling, we recognized that we have no
>> protection against a malicious actor simply copying the VC from one asset
>> and dropping it into another (and then signing the new setup), because
>> there is nothing that binds the credential to the asset in our case.
>>
>>
>>
>> Has anyone run into this scenario before and has some guidance to offer?
>> Am I doing something that I shouldn’t be doing – and if so, what does that
>> mean for OpenBadges?
>>
>>
>>
>> All thoughts and suggestions welcome!
>>
>>
>>
>> Thanks,
>>
>> Leonard
>>
>>
>>
>>
Attachments

image/jpeg attachment: image003.jpg
image/jpeg attachment: image004.jpg
Received on Wednesday, 1 April 2026 19:52:22 UTC