draft-koch-openpgp-webkey-service-21 early Httpdir review from Martin Thomson via Datatracker on 2026-02-18 (ietf-http-wg@w3.org from January to March 2026)

From: Martin Thomson via Datatracker <noreply@ietf.org>
Date: Tue, 17 Feb 2026 17:48:35 -0800
To: <ietf-http-wg@w3.org>
Cc: draft-koch-openpgp-webkey-service.all@ietf.org
Message-ID: <177137931513.1151378.16577022032103443499@dt-datatracker-6ff7c68975-7k42g>
Document: draft-koch-openpgp-webkey-service
Title: OpenPGP Web Key Directory
Reviewer: Martin Thomson
Review result: Not Ready

This is a preliminary review only.  I have tried to limit the feedback to
HTTP matters, though I have not entirely managed to do that.

This document is three things:

* A means of accessing a directory of public keys over HTTP
* A predominantly SMTP-based protocol for populating those directories
* Security-relevant, because getting the wrong key will allow the wrong person
to read mail

This review is aimed at the first, though it covers the small role that HTTP
plays in the second.  I have questions about the third, but that's not my focus
here.

A lot of this review is a specialized repackaging of the content of RFC 9205. 
I encourage the authors of this proposal to review that document.

# The domain label

The protocol defines a special domain name label. 
https://datatracker.ietf.org/doc/html/rfc8820#name-uri-authorities explains why
taking a name out of the namespace controlled by hundreds of millions of domain
owners is unwise.  I strongly recommend against this practice.  All the more
so because the "direct" version seems to be the more practical path anyway. 
The /.well-known prefix was reserved for exactly this sort of thing.

BTW, the fact that the domain is not part of the /.well-known path for the
"direct" mode seems backwards to me.  A domain that is set up to handle this
protocol exclusively is more likely to be hosting keys for multiple entities;
the current design forces that service to always include "openpgpkey" in its
name.  Overall, having the domain in the path always would be more robust.

# Redundant local part

It's not clear to me why the local part of the email address is included
multiple times.  The role of SHA-1 here is also pretty questionable (more
below).  This is something that percent encoding is pretty good at handling. 
The half-hearted attempt at canonicalization used here is a constraint on mail
operators (who might want t@ and T@ to go to different inboxes) and
insufficient to capture the range of equivalence practices (like
a.b.c@gmail.com == abc@gmail.com or the common foo+bar@ == foo@).  Better to
leave that to servers to manage.
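
A minimal sketch of the percent-encoding alternative, for illustration only.
The path layout (domain segment followed by the local part) and the function
name key_path are assumptions of this sketch, not something the draft defines.
The local part is passed through unchanged apart from encoding, so any
equivalence rules stay on the server.

```python
from urllib.parse import quote

# Hypothetical layout: /.well-known/openpgpkey/<domain>/<local-part>
PREFIX = "/.well-known/openpgpkey/"


def key_path(address: str) -> str:
    """Build a lookup path by percent-encoding the address parts as-is."""
    # Split on the last "@" so that unusual local parts survive intact.
    local, _, domain = address.rpartition("@")
    if not local or not domain:
        raise ValueError("not a mail address: %r" % address)
    # safe="" also encodes "/" so a local part cannot add path segments.
    # No case folding or other canonicalization happens on the client.
    return PREFIX + quote(domain, safe="") + "/" + quote(local, safe="")
```

With this, key_path("foo+bar@example.org") yields a path ending in
"foo%2Bbar", and "T@example.org" and "t@example.org" remain distinct requests
that the server is free to treat as equivalent or not.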

# Case sensitivity

Domain name casing is weird.  Not in this specification, but more generally.  I
suggest that you reconsider the way this works presently.  Rather than push the
requirement to handle casing on clients, which makes it an interoperability
hazard, you can have the server handle it. 
/.well-known/openpgpkey/example.org/... and 
/.well-known/openpgpkey/Example.org/... can map to the same resource on the
server.  (Redirects and Content-Location exist to handle this sort of
equivalence.)

If you really want to tangle with this particular problem, which includes
internationalized domain names, the current design doesn't really hold up well.
You can maybe mandate the use of A-labels, but you need to be careful about
how use of those is specified.
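
Server-side handling of case could be sketched as follows.  This assumes the
domain appears as the first path segment after the prefix, that it is plain
ASCII (A-labels are out of scope here), and that a 308 redirect is the chosen
mechanism; canonical_redirect is a hypothetical helper name.

```python
PREFIX = "/.well-known/openpgpkey/"


def canonical_redirect(path: str):
    """Return (308, location) if the domain segment is not lowercase.

    Returns None when the path is already canonical or is not under
    PREFIX, in which case the server handles the request normally.
    """
    if not path.startswith(PREFIX):
        return None
    # Only the domain segment is folded; the rest of the path is untouched.
    domain, sep, rest = path[len(PREFIX):].partition("/")
    canonical = domain.lower()
    if canonical == domain:
        return None
    return 308, PREFIX + canonical + sep + rest
```

A client that requests the Example.org variant is then pointed at the
example.org resource and needs no casing rules of its own.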

# Redirects

This doesn't say anything about redirects.  As a practical matter, if you are
using HTTP, you want to note that following redirects is expected.  This has
many benefits:

* An operator can delegate responsibility for running the API to another entity.
* A server can direct requests to equivalent URLs to a single canonical URL.
* You get consistent implementation of this important feature.  Otherwise, if
some clients follow redirects and others don't, you can end up with no
meaningful interoperability.
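
As a rough sketch of the client behaviour this implies: follow redirects up
to a fixed hop limit.  The fetch callable, the limit of 5, and the function
name are assumptions made so the example runs without a network; a real
client would normally rely on its HTTP library's redirect handling.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_following_redirects(url, fetch, max_hops=5):
    """Fetch url, following redirects for at most max_hops hops.

    fetch(url) is a caller-supplied function (an assumption of this
    sketch) returning (status, headers, body), where headers is a dict
    keyed by exact header name.
    """
    for _ in range(max_hops + 1):
        status, headers, body = fetch(url)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or location is None:
            return url, status, body
        # Location may be relative; resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("too many redirects")
```

The returned URL is the one that finally answered, which is what lets an
operator delegate the service to another entity.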

# Media types

> The HTTP GET method MUST return the binary representation of the OpenPGP key
for the given mail address.

Where is this representation defined?

I see that the SMTP interactions are careful to define media types so that mail
processing can confidently handle different content.  The same courtesy is not
extended to HTTP resources.  The specification even recommends the use of
"application/octet-stream" for this.  As a general rule, that media type is not
a good idea when defining protocols, where having certainty about formats is
useful for managing version migration.

This applies equally to the published keys and to the resources at
$WELLKNOWN/submission-address and $WELLKNOWN/policy.
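
To illustrate why a specific media type helps, here is a sketch of a client
that checks the Content-Type before parsing.  The type name
"application/example-openpgp-key" is a placeholder invented for this example;
it is not registered and the draft defines nothing like it.

```python
# Placeholder media type, hypothetical and for illustration only.
EXPECTED_KEY_TYPE = "application/example-openpgp-key"


def media_type(content_type: str) -> str:
    """Return the type/subtype of a Content-Type value, lowercased."""
    # Drop any parameters after the first ";" and normalize case.
    return content_type.split(";", 1)[0].strip().lower()


def is_expected_key(content_type: str) -> bool:
    """True only if the response declares the expected key format."""
    return media_type(content_type) == EXPECTED_KEY_TYPE
```

With "application/octet-stream" the client learns nothing about the format or
its version, so a check like this cannot tell one format from another.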

# Policy flags

This format seems under-specified.  Note that while I consider RFC 9309
insufficiently specified, it is far better defined than this.

# Security

And because I can't help myself:

* This protocol does a proof of possession for the key.  It's not clear that
this is a necessary function.  It also appears to do a routability check for
the address being claimed, which I believe is necessary.  Is there a security
analysis, akin to those done for the ACME protocol, that confirms that this
protocol is not vulnerable to the panoply of attacks that such protocols tend
to be vulnerable to?  ACME had some serious problems with its design, which
demonstrates the value of such analysis.

* I don't see any effort made to ensure that the operator of the HTTP server is
authorized to operate as an authority for information about the mail
infrastructure.  Other efforts in this domain use DNS records for this purpose.
I realize that the problem statement said that this can be effectively "too
hard", but I don't find that persuasive.

* SHA-1 appears to play a significant role here.  Given that it has been shown
to have collision attacks and this protocol (as specified) would appear to
depend on collision resistance, that seems like a genuine problem.  If I
request keys for an address with a collision, how can I be sure that I'm
getting the keys I intended?  (Above, I suggested that hashing might not even
be needed; please consider that.)

* The other reason to use a hash function might be to discourage enumeration of
addresses.  (To be clear, it's not a particularly /good/ protection given the
low entropy of local parts.)  What measures, if any, can be put in place to
protect the privacy of addresses from crawling and scraping?

* I'm not sure that you should talk about failures in previous versions of a
document like this: "The use of DNS SRV records as specified in former
revisions of this document reduces the certainty that a mail address belongs to
a domain."  Besides, the opposite might be true, depending on the answer to the
authority question I ask above.
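
On the enumeration point above, a small sketch of why hashing low-entropy
local parts offers little protection: anyone with a list of likely names can
hash each guess and compare.  The lowercasing step and the hex encoding are
simplifying assumptions of this sketch and may not match the draft's exact
encoding.

```python
import hashlib


def hashed(local_part: str) -> str:
    """SHA-1 of the lowercased local part, hex-encoded (an assumption)."""
    return hashlib.sha1(local_part.lower().encode("utf-8")).hexdigest()


def enumerate_local_parts(published, wordlist):
    """Return the guesses whose hash appears in the published set.

    published is a set of hashes observed on a server; wordlist is any
    list of candidate local parts, such as common first names.
    """
    return sorted(guess for guess in wordlist if hashed(guess) in published)
```

The cost is one hash per guess, so the size of the wordlist is the only
barrier, not the hash function.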

# Nits

What is "hu"?

I see a few places where references are in the form "as specified below". 
Please add section references instead.
Received on Wednesday, 18 February 2026 01:48:39 UTC