WebPKI and You
There’s been a push over the last twelve years to move web traffic off unencrypted HTTP to encrypted HTTPS, to protect the general public from dragnet surveillance, gaping assholes on public wifi1 Ironically this site has an expired cert, so I’ve linked the non-HTTPS version. There are incredibly not-safe-for-work shock images in some of the links there so you do not want to browse past the first page if “gaping asshole” feels like an odd phrasing and doesn’t evoke a specific image: http://gbppr.net/defcon/evilscheme/index.html , backhauls over unencrypted satellites, that kinda thing. HTTPS relies on a public key infrastructure to make sure only authorized servers have keys for specific websites.
This public key infrastructure isn’t just a bunch of servers and vaults in datacenter cages around the world. It’s a social and political system operated and regulated by several parties with conflicting goals.
Table of Contents
- The Basics and What We Expect from WebPKI
- A Brief History of HTTPS and WebPKI
- Why Expire or Revoke At All?
- How Do You Revoke A Hundred Million Certificates?
- Mitigating This Mess
- What Is To Be Done?
- Credits Roll
The Basics and What We Expect from WebPKI
The public key infrastructure of the web, commonly referred to as WebPKI, has to work in some difficult scenarios. Someone who’s never touched a trackpad relies on their ability to buy a new computer at the store, put it on the wifi at DEF CON, and connect to their bank’s website to kick off a wire transfer to buy a house with it.2 it was actually the airport on the way to DEF CON and a computer I’ve had for some years, but still felt sketch making a non-reversible transfer of a shitload of money The way this user’s bank, the First Example Bank of Money, proves that they’re the bank is complicated.
- When provisioning their server, the bank generates a private key and derives a public key from it
- The bank sends their public key and some proof that they’re the legitimate operator of bank.example to a Certificate Authority (CA)
- The CA validates this proof and issues a certificate containing:
  - server addresses this certificate is valid for, bank.example and www.bank.example
  - the public key from step 1
  - how the validation happened
  - who the CA thinks requested the certificate, “First Example Bank of Money” of “Example Locality” in “Example Country”
  - information about the CA’s certificate process
  - how to check if the cert’s valid
  - the CA’s certificate (or a chain of certs) leading back to a root certificate
  - a signature from the CA’s private key
  - some surprise tools that will help us later!
- The bank loads this certificate into the bank.example server next to the private key
- When the user connects to bank.example, the server sends this certificate to the client as part of establishing a secure connection.
- The user’s browser validates the certificate matches both who they’re connecting to (i.e. that the certificate lists bank.example as one of the subjects of it) and that the certificate is chained back to a root CA the browser recognizes
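The first half of that last browser check (does the cert actually list the name I connected to?) can be sketched in a few lines. This is a simplified illustration, not what any real browser runs: the function names are made up, it only handles a single leftmost wildcard label, and it skips the chain-building half of the check entirely.

```python
def name_matches(pattern: str, hostname: str) -> bool:
    """Match one certificate name against a hostname, allowing a single
    leftmost wildcard label such as *.bank.example."""
    pattern = pattern.lower().rstrip(".")
    hostname = hostname.lower().rstrip(".")
    if pattern.startswith("*."):
        # The wildcard stands in for exactly one label, so it covers
        # www.bank.example but not bank.example or a.b.bank.example.
        first, _, rest = hostname.partition(".")
        return first != "" and rest == pattern[2:]
    return pattern == hostname


def cert_covers(cert_names: list[str], hostname: str) -> bool:
    """True if any name listed in the certificate covers the hostname."""
    return any(name_matches(name, hostname) for name in cert_names)


# The names from the example certificate above.
names = ["bank.example", "www.bank.example"]
print(cert_covers(names, "bank.example"))
print(cert_covers(names, "login.bank.example"))
```

Real clients also have to deal with IP addresses, internationalized names, and rules about where wildcards are allowed, none of which this sketch attempts.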
There’re a bunch of participants in this process, in different organizational domains because security is complicated.
In the organizational, regulatory, human domain:
- the user is a relying party, they rely on all this shit working so they can live in a society
- The First Example Bank of Money is a subscriber to certificate services;
- they subscribe to certificates from a certificate authority or CA
- CAs are generally a root that own one or more CA certificates
- the browser or operating system has CA certificates for some roots loaded into it by the organization distributing it; these are root programs
In the hell of x.509 certificates:
- the user is represented by their client, a web browser or mobile app or that kinda thing
- a subscriber is going to have a server that’s the subject of a certificate
- their certificate’s issuer is the CA
- confusingly, the CA is the subject of their cert, which has an issuer of a root CA
- browsers/OSes keep root CA certs in a certificate store
This is all held together by mathematical rules that get changed every 5-10 years as new weaknesses in any of the layers (political, organizational, mathematical) are discovered or publicized, but it kind of comes down to prime numbers being sufficiently magic that CAs can extract rent for doing some math with them.
A Brief History of HTTPS and WebPKI
When HTTPS started to be a thing in the mid-‘90s, it was really just a thing for the very nascent e-commerce industry and banks and such; and even then sites wouldn’t do their general traffic on the HTTPS site, instead kicking you to the secure server once you’d finished filling your cart with actual books written by human authors3 and no SIXLTR garbage that breaks after six months yet on that hot new online bookstore. Browsers started to put locks and such in the UI4 old version of MS Internet Explorer would pop up a scary warning when you went to an HTTPS site, which is ridiculous , and they just kind of made up rules for who could be in their root store in ways to not have too many CAs but not also make too many governments mad.
Eventually, after about a decade of this, a bunch of CAs had entered the market, resulting in a situation where “any fraudster with $20 in his or her pocket could buy an SSL certificate”5 Abdulhayoglu, Melih “Extended validation and online security: EV SSL gets the green light” INSECURE Magazine, Issue 19, Page 40, December 2008 , leading Melih Abdulhayoglu, founder of the Comodo CA, to introduce two new things. One was Extended Validation (EV) certificates, that require a more stringent validation process and more importantly let CAs charge more money. The other was the CA/Browser Forum (CA/BF or CABF), a place for CAs and Browsers to come to an agreement that CAs issuing EV certificates should follow that more stringent process, and that EV certificates should get special UI in the browser to upsell subscribers to them on the basis that relying parties will look for the EV UI.
Certificate Types
The difference between Domain Validated (DV), Organization Validated (OV), and Extended Validation (EV) certs is worth discussing.
Domain Validated certs require some kind of process to make sure they’re being issued to the operator of a given domain. With Let’s Encrypt and other CAs that use the Automatic Certificate Management Environment (ACME), a subscriber/subject demonstrates control of a domain by either setting a “challenge” DNS entry on it or responding to a request for the challenge from a running web server on it. DV certs just say the domains they’re validated for, nothing about the subject/subscriber/server’s locality, country, or identity.
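For the curious, here’s a rough sketch of what that “challenge” value looks like under ACME as I understand RFC 8555 and RFC 7638: the CA hands the subscriber a token, and the subscriber combines it with a hash of their ACME account’s public key. The JWK values and the token below are placeholders I made up, and the function names are mine, not from any ACME client.

```python
import base64
import hashlib
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, as ACME uses everywhere."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def jwk_thumbprint(jwk: dict) -> str:
    # RFC 7638: hash only the required members, sorted, with no whitespace.
    required = {"EC": ("crv", "kty", "x", "y"), "RSA": ("e", "kty", "n")}[jwk["kty"]]
    canonical = json.dumps(
        {k: jwk[k] for k in required}, sort_keys=True, separators=(",", ":")
    )
    return b64url(hashlib.sha256(canonical.encode("utf-8")).digest())


def key_authorization(token: str, account_jwk: dict) -> str:
    """Body served by the web server for the HTTP challenge."""
    return f"{token}.{jwk_thumbprint(account_jwk)}"


def dns01_txt_value(token: str, account_jwk: dict) -> str:
    """Value placed in the challenge DNS TXT record."""
    digest = hashlib.sha256(key_authorization(token, account_jwk).encode("ascii"))
    return b64url(digest.digest())


# Placeholder account key and token, for illustration only.
account_jwk = {"kty": "EC", "crv": "P-256", "x": "placeholder-x", "y": "placeholder-y"}
print(key_authorization("example-token", account_jwk))
print(dns01_txt_value("example-token", account_jwk))
```

Either way the CA is only learning that whoever holds the account key can also change DNS or serve files for the domain, which is exactly the “control of a domain” claim a DV cert makes.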
Organization Validated (OV) certs come with more attachment to the subscriber they’re issued to. This has to be verified similar to how you’d validate your identity when getting a library card or driver’s license.
Extended Validation (EV) certs have their own special set of requirements, with more validation of the subscriber’s identity. To justify this, the CA/BF got browsers to add special UI for EV certs, usually something green with a padlock and the organization name from the server’s cert. The theory for browsers was that everyone was supposed to be looking at the browser chrome to validate the site they thought they were on, EV certs came with enough information to trust that, and that people would definitely notice the absence of that additional UI element and not get bamboozled by a picture of a lock on the web page. This was wrong btw, nobody6 ok statistically very few people, if you ever noticed that a site had presented a valid cert that wasn’t EV when previously it had an EV cert here’s your gold star: ⭐️ . ever notices a UI element that isn’t there, and browsers eventually got rid of it in their unending quest to make the address bar more and more worthless.
In 2026, you can see what validation a cert uses deep in your browser’s UI. In Safari on Mac OS Tahoe, you have to pop open the web inspector, go to the Network tab, reload the page, pick the request for the page, and switch to the Security tab. Show the full certificate, look for the “Certificate Policies” extension with OID7 an OID or “Object IDentifier” is intended as a unique identifier for a notional object, in a big hierarchy. If you think this is nonsense for sickos, congratulations! IMO it’s not like SQL where it’s one of those things you can build a career on. All the CABF OIDs are at https://cabforum.org/resources/object-registry/ , and you can probably look them up in the Baseline Requirements (BRs). 2.5.29.32, and then look up the OID listed there, 2.23.140.1.2.1 in this case:
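If you’d rather not do that last lookup by hand, it’s a small table. A minimal sketch: the four CA/BF policy OIDs below are the ones I’m confident are in the CABF object registry, while the function name and the CA-specific OID in the usage line are made up for illustration.

```python
# CA/Browser Forum certificate policy OIDs and the validation type they signal.
CABF_POLICY_OIDS = {
    "2.23.140.1.1": "Extended Validation (EV)",
    "2.23.140.1.2.1": "Domain Validated (DV)",
    "2.23.140.1.2.2": "Organization Validated (OV)",
    "2.23.140.1.2.3": "Individual Validated (IV)",
}


def validation_type(policy_oids: list[str]) -> str:
    """Return the validation type for the first CA/BF policy OID found.

    Certs often carry a CA-specific policy OID as well, so unknown
    entries are skipped rather than treated as errors.
    """
    for oid in policy_oids:
        if oid in CABF_POLICY_OIDS:
            return CABF_POLICY_OIDS[oid]
    return "unknown"


# "1.3.6.1.4.1.99999.1" is a hypothetical CA-specific OID.
print(validation_type(["1.3.6.1.4.1.99999.1", "2.23.140.1.2.1"]))
```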

Certificate Transparency
HTTPS certificates are an incredibly powerful tool to safeguard the communication between a relying party and a subscriber, a user and a company, a person and a website. This protection has been carefully designed to be difficult to violate for any existing organization, including powerful Nation-State Adversaries. This relies on CAs doing their job and operating honestly, but this requires CAs to follow rules and not succumb to various temptations or impositions.
In July of 2011, the DigiNotar CA issued a fraudulent cert for Google, which was like the website at the time. It got caught the following month because Google Chrome shipped with “pinned” certificates for Google and a bunch of other websites to prevent this kind of thing from happening. This wasn’t workable for the vast majority of people running websites, and eventually the “Certificate Transparency” (CT) system was developed.
When a CA is issuing a cert now, they send most of the cert as a “precertificate” to some public Certificate Transparency logs, get a Signed Certificate Timestamp (SCT) back, put that SCT in the real certificate, and issue that. Those log entries can be observed by the general public, site operators that might be concerned about fraudulent certs being issued, CTF players trying to figure out what future challenges might be unlocked, attackers trying to map an internal network that uses public certs for resources, whoever. Modern browsers don’t accept public certs that don’t come with a SCT, mitigating the risk of a misbehaving CA covertly issuing unauthorized certs.
Also lol @ DigiNotar losing their ability to rent-seek a prime number over this.
Expiration and Revocation
Sometimes bad certs are in the wild. There are three ways to handle getting these out of circulation.
The original way is a Certificate Revocation List, or CRL. Certs normally have “CRL Distribution Points” listed as part of their content, you can download them, and check a given cert for presence in that list. The problem is, in a mass revocation event, these lists can be obnoxiously huge. A CA can figure out the best way to balance the number of certs in a CRL vs. how many CRLs they want to maintain, but regardless, it’s not like browsers can afford to pull the whole list on every hit.
Instead, there’s an Online Certificate Status Protocol (OCSP) that used to be supported. The original idea was that instead of fetching a list that could be kilobytes, megabytes, or larger, a browser could just ask the CA if a given cert was valid.
There’re problems here though. What if the browser can’t hit the CA? Does making a CA unreachable become a cheap and easy way for an attacker to DoS a site, or a way for an attacker to get away with a revoked cert for a given target?8 Jeff Johnson’s Apple Developer ID OCSP post documents a case outside of WebPKI when Apple’s OCSP system, which was being used to allow them to disable malicious Mac apps, failed in a way that kept any app from launching. Also, this turns an OCSP responder into a very valuable watering hole to surveil users’ behavior en masse.
One solution is “OCSP Stapling,” where the subscriber’s server periodically checks on its own certs, and “staples” the (signed, untamperable, expiring) OCSP response to its cert when establishing a connection. This requires a “Must-staple” attribute on the certificate to be useful, since otherwise an attacker with a revoked cert can just not staple the “yo this is revoked” OCSP response.
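The Must-staple logic is easier to see as a decision table. Here’s a toy sketch of the client side, assuming the stapled response’s signature and freshness have already been checked elsewhere; the enum and function names are mine, not from any TLS library.

```python
from enum import Enum


class Staple(Enum):
    GOOD = "good"
    REVOKED = "revoked"
    MISSING = "missing"  # the server sent no stapled response at all


def accept_connection(must_staple: bool, staple: Staple) -> bool:
    """Decide whether a client should proceed, given the staple it received."""
    if staple is Staple.REVOKED:
        return False
    if staple is Staple.MISSING:
        # Without Must-staple the client can't tell "this server never
        # staples" apart from "an attacker withheld the bad news", so a
        # missing staple is only fatal when the cert demands one.
        return not must_staple
    return True


# An attacker holding a revoked cert simply omits the staple:
print(accept_connection(must_staple=False, staple=Staple.MISSING))
print(accept_connection(must_staple=True, staple=Staple.MISSING))
```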
An option that’s still in use is for the browser vendors to pre-process the fat and standardized CRLs into a more compressed form that’s cheaper to update and check against. Chrome has had CRLSets for like a decade but they’re kinda fat, Firefox has had CRLite9 these use cascading bloom filters which are a cool data structure https://mislove.org/publications/CRLite-Oakland.pdf for a few years and it’s really slick and efficient.
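For a feel of the building block involved, here’s a minimal single-level Bloom filter over revoked serial numbers. To be clear about what this isn’t: CRLite layers several filters into a cascade so that false positives get cancelled out, and this sketch doesn’t attempt that. The sizes, the SHA-256-based hashing scheme, and the serials are arbitrary choices of mine.

```python
import hashlib


class BloomFilter:
    """Set membership with no false negatives and some false positives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: bytes):
        # Derive several bit positions by salting one hash with a counter.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(i.to_bytes(4, "big") + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: bytes) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item)
        )


revoked = BloomFilter(num_bits=8192, num_hashes=4)
for serial in (b"serial-0001", b"serial-0002"):  # made-up serial numbers
    revoked.add(serial)
print(revoked.might_contain(b"serial-0001"))
```

A “no” from the filter is definitive; a “yes” only means “maybe,” which is the gap the cascade in CRLite exists to close.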
In a world of automated cert management, the best method to get bad certs out of circulation is just making every cert short-lived. The Baseline Requirements10 the CA/BF baseline requirements define a Short-lived Subscriber Certificate as 864,000 seconds (ten days) before March 15, 2026, and 604,800s (seven days) after that. straight up don’t require revocation of short-lived certs.
Why is it the best? CAs love to push back on revocation because some subscribers (that pay the CA) claim they can’t just replace certs easily due to reasons. Short-lived certs expire often enough that automating their issuance is the only practical way to operate with them. Instead of a CA struggling to get their subscribers to accept the exceptional case of replacing their one-year certs due to revocation, subscribers are just always replacing their certs as a matter of course.
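Using the two thresholds quoted from the Baseline Requirements above, the short-lived test is a one-liner. A hedged sketch: I’m approximating the issuance date with the cert’s notBefore, treating the cutover as midnight UTC on March 15, 2026, and measuring validity as a plain subtraction, all of which are simplifying assumptions rather than the BRs’ exact wording.

```python
from datetime import datetime, timedelta, timezone

# Assumed cutover instant for the shorter limit.
CUTOVER = datetime(2026, 3, 15, tzinfo=timezone.utc)


def is_short_lived(not_before: datetime, not_after: datetime) -> bool:
    """True if the validity period fits the short-lived threshold."""
    limit_seconds = 864_000 if not_before < CUTOVER else 604_800
    return not_after - not_before <= timedelta(seconds=limit_seconds)


issued = datetime(2026, 4, 1, tzinfo=timezone.utc)
print(is_short_lived(issued, issued + timedelta(days=7)))
print(is_short_lived(issued, issued + timedelta(days=10)))
```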
Why Expire or Revoke At All?
As a subscriber (website operator) there’s a lot of PKI stuff that’s not intuitive or easy. The way I learned to do this stuff involved lots of openssl command line tool stuff, and files with names like server.key, server.csr, and server.crt. This worked for me, but there’re a bunch of different ways it can be a problem.
Some of these problems result in a certificate becoming invalid, which requires that it be revoked and re-issued.
Trustico Private Key Disclosure
One of the reasons you might want to unalive a certificate is because the private key for it becomes compromised.
While there are technical means to make this real difficult11 Amazon Web Services (AWS) and other cloud VM providers have “secure enclave” features that can provision a private key, get it into a cert, and use it with a web server to sign HTTPS requests all while holding on to the key. Barring novel attacks of course. , often times it’s expedient for business to make it real easy.
“Trustico” was basically a middleman between subscribers that needed HTTPS certs for their websites and certificate authorities that might enjoy this kind of reseller scavenging a bit more business for them to rent-seek a prime number.
As near as I can tell12 i mean, i remember from when it happened because it was really fucking funny, but now i need to source my quotes :
- Trustico sold a bunch of Symantec certs for years. Subscribers would give them money and (probably) get a cert, a private key (that Trustico generated for them), instructions for Apache, and some kind of available support.
- Google said they were gonna remove Symantec from Chrome by October 2018. The quick version is it looked like any asshole at Symantec could issue real certs for whatever, including example.com (which is basically not a valid domain name) and www.test.com (which is a valid domain name!) and *.test11.com (definitely valid!) which, lmao, seems like they could also get a new cert for microsoft.com as a joke. It’s good they lost their business for this.
- Symantec sold their cert business to DigiCert (a CA that’s been around forever) so that subscribers could continue to get certs.
- Trustico decided they needed to reissue thousands and thousands of certs as DigiCert because they said “Symantec” and all the normal users that click the padlock and closely inspect the cert chain would freak out and Trustico’s customers (subscribers) would be mad.
- DigiCert said “they’re fine, they’re not compromised.”
- Trustico emailed the 23,000 private keys they had, unencrypted, to DigiCert, which meant “they themselves compromised the keys of their customers, and revocation is both correct and necessary.”
- DigiCert basically emailed the subscribers (that DigiCert would issue certs for but Trustico scraped some profit off of) saying “hey we’re revoking your cert, you might want to talk to Trustico” and that maybe felt a bit sales-y when it got back to Trustico.
- Trustico’s website went down after someone posted a vulnerability it had on twitter
To me, this is a situation where a revocation absolutely had to happen. Maybe not because Symantec sucked at being a CA, but for sure because Trustico sucked at being a parasite13 You may have realized that I don’t think being a landlord for a number is a “business” as much as it is a public service. Being a middleman in this setup that induces friction for both customers and the general public in a way that implodes the company is something beyond incompetent. .
Entrust’s Certification Practice Statement
On March 6, 2024, Entrust, a Certificate Authority, posted an issue about some certs missing a link to their Certification Practice Statement (CPS). This is kind of a brown M&Ms type of issue, in that it’s an important opportunity for a CA to demonstrate competent handling of an issue.
A few days later, on March 13: someone comments that they didn’t actually fix the problem, and that they should’ve fixed the problem and revoked by now. Entrust’s representative says they won’t fix the problem and don’t need to revoke because they don’t want to.
We have not stopped issuance and we are not planning to stop issuance or to revoke certificates issued, we do think that this miss alignment between baseline requirements and the EV guidelines was an unintended oversight of SC-62v2 as explained in the root cause analysis. Revoking these certificates would have unnecessary big impact to our customer and the WebPKI ecosystem overall.
On March 18 a member of the Google Chrome Root Program comes back from some time off, says that Entrust’s response “fails to meet our expectations” (citing a laundry list of reasons and asking a laundry list of questions), and suddenly Entrust is stopping issuance and making revocation plans. On March 20, after some struggling to identify the certs that need to be revoked, Entrust starts revoking the certs. This delayed revocation (“delrev”) gets tracked in a separate issue.
As of April 19, 2024, Entrust has revoked 10,013 of the 26,668 certs; fewer than half, a month after deciding to start. On May 15, 2024, Entrust confirms that 95% of customers have completed revocation, but still have 9,906 certs remaining; just over 62% have been revoked. Entrust reveals that they’re providing “limited exceptions” to financial institutions, government agencies, information technology shops, and airlines.
In their May 29 update, Entrust only claims 74.7% revocation, and that “our subscribers are generally describing critical infrastructure,” while their CPS prohibits issuing to “any application in which failure could lead to death, personal injury or severe physical or property damage.”
Wayne, an independent Relying Party (i.e. a member of the public), explains:
You cannot have it both ways. Either a subscriber is critical infrastructure and thus forbidden to even have a certificate issued by Entrust, or they are not and should have been revoked already. Please provide a per-subscriber breakdown of what exactly you mean by critical infrastructure and the subscriber’s risk.
To me, this is the moment where we need to revisit the role of WebPKI, and alternatives to it.
TLS is not The Web
“The Web” in 2026 means roughly what it did in 1995; you use a web browser to connect to and use a site over the public internet, using a derivative of the original HyperText Transfer Protocol (HTTP) called HyperText Transfer Protocol Secure (HTTPS). However, lots of other stuff speaks HTTPS, or the Transport Layer Security (TLS) security protocols that make HTTPS different from plaintext HTTP.14 I’m not going to hit you with a seven layer OSI burrito here but: HTTP is what web servers and browsers speak over their connection, TLS is spray-on security for any random connection between computers, HTTPS is HTTP+TLS, and this is subtly wrong because it’s complicated because doing it the simple way had problems. The OSI model is also subtly wrong for the same reason my explanation is lol.
A phone app that talks to a bank is almost certainly using HTTPS, but it’s probably hitting parts of the bank’s servers that you can’t from a normal browser. The bank’s frontend server is probably talking HTTPS to the backends it balances the load across, but you probably can’t hit the backends from the public internet. The bank’s backends are hopefully using TLS to interact with databases and other internal services at the bank, but they’re probably not speaking HTTP, and probably not in a way that’s reachable from the public internet.
WebPKI is, fundamentally, a responsibility from certificate authorities and subscribers (who have resources!) to the general public (who rarely know what an “Euler totient” is). If you can shirk responsibility, you should15 in every part of your life! . This means that, for non-web uses, you don’t need WebPKI for TLS.
In the above cases, it might be expedient for the phone app’s endpoints to be on WebPKI, but it would also be possible for the phone app to bring its own certificate for a private CA controlled by the bank that, instead of needing to uphold the trust of the public at large as administered by web browsers’ root programs, it’s only responsible to the bank and its customers. The frontend and backend servers are probably administered by the same organization within the bank, so they definitely don’t need that connection to be on a public CA, and they could stand up a private CA and probably improve their security posture. The backend talking to other services within the bank over non-HTTP extremely doesn’t benefit from a public CA, and super doesn’t benefit from having every backend service show up on a Certificate Transparency log, so the same thing applies.
And it’s not just banking! Many kinds of critical infrastructure shouldn’t be on the public internet, and could trivially be on a private CA that doesn’t have to deal with the CA/BF, the CA Program part of Bugzilla, or browser CA root programs.
These private Certificate Authorities can be a huge production with expensive hardware security modules and issuing ceremonies and that kinda thing, or they can be a set of scripts someone runs on their workstation, depending on organizational and compliance requirements. It’s a bit more work to administer, and if it’s something workers will need to interact with a bunch it might be a pain to get the CA’s root cert installed across the right parts of the organization, but it’s better than the alternative.
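On the client side, opting out of WebPKI for an internal connection can be pretty small. Here’s a sketch using Python’s standard ssl module: build a context that trusts nothing, then load only the private root. The file path in the usage comment is hypothetical, and the function names are mine.

```python
import ssl


def empty_trust_context() -> ssl.SSLContext:
    # PROTOCOL_TLS_CLIENT enables hostname checking and CERT_REQUIRED,
    # and unlike ssl.create_default_context() it loads no system roots.
    return ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)


def private_ca_context(ca_file: str) -> ssl.SSLContext:
    """Trust only the root(s) in ca_file, not the public WebPKI roots."""
    ctx = empty_trust_context()
    ctx.load_verify_locations(cafile=ca_file)
    return ctx


ctx = empty_trust_context()
print(ctx.verify_mode == ssl.CERT_REQUIRED, ctx.check_hostname)
print(ctx.cert_store_stats())

# Usage, assuming a private root at a hypothetical path:
# ctx = private_ca_context("/etc/bank/private-root.pem")
```

A connection made with that context fails against any server whose cert chains to a public CA, which is the point: the backend only answers to the organization that runs it.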
Entrust is not WebPKI
Entrust’s June 11 update (three months after the inciting incident) basically argues that, while subscribers aren’t allowed to get certificates for “uses prohibited in [their] CPS” (Certification Practice Statement), Entrust will go to bat for “subscribers who operate in critical infrastructure sectors” that don’t feel like they can handle replacing one file on their servers. They did finish though!
On June 24, “Zacharias,” presumably another relying party, notes that while many of the certs were part of the whole delayed revocation incident because “the certificates are used in critical infrastructure and cannot be safely replaced prior to the revocation deadline,” they were for internal development and test servers that weren’t on the public internet, didn’t have public DNS names, and never got new certs after revocation. In a reply on June 30, we learn that this is because subscribers would just say, when told that they had 1684 fucked up certs, “we’re an airline you’ve heard of and we think it’s not safe to replace the cert file in five days” whether the certs were for a private test server or their main dot com site.
Among some of the other posts is a feeling that Entrust failed to implement operational changes in cert issuance that they committed to years prior. At times in their answers to relying parties, (again: the general public, people like you and me that rely on CAs to make the internet safe to use) Entrust felt obtuse and at best maliciously compliant.
On June 27 Google Chrome said that Entrust would be out of their cert program at the end of October 2024 (later amended to a few days later in November 2024 to line up better with a larger Chrome release).
On February 19, 2025, 352 days after receiving a report about their busted certs, Entrust posts their final report on the resulting delayed revocation issue:
Incident Report Closure Summary
- Incident Description: Delayed revocation of EV TLS certificates with missing cPSuri.
- Incident Root Cause(s): Certificate revocation was delayed to minimize harm to the Subscribers and Relying Parties.
- Remediation Description: Our revocation position changed and it was decided to revoke all non-expired certificates.
- Commitment Summary: Entrust has sold its public certificate business to Sectigo and is in the process of developing a transition plan with Sectigo. In the interim, Entrust remains committed to compliance with the Baseline Requirements, including strict adherence to 24-hour and 5-day revocation requirements in the event of a mis-issuance.
How Do You Revoke A Hundred Million Certificates?
In April 2025 Microsoft discovered that their sloppy copy-paste work had made an invalid Certification Practice Statement (CPS) in a prior revision, and they’d issued a hundred million or so certs from July 2024 to April 2025 under that statement. It’s a difficult number of certs to deal with. The process for revoking the 75M or so that hadn’t expired already would involve making a bunch of 650MB Certificate Revocation Lists (CRL) that simply wouldn’t work, because your phone shouldn’t be downloading entire CDs worth of shit when you’re just browsing the web. In May 2025 Microsoft had to come up with a plan to address how they would handle delaying this revocation.
For certs that take less than a week to expire (“Short-Lived Subscriber Certificates”), issuers barely have to think about revocation. The BRs (Baseline Requirements) call that out specifically. Beyond that, issuers can either revoke individual certificates through a CRL, or revoke the intermediate certificate with a CRL.
The former is precise for when some subscribers need to revoke for key disclosure or Cessation of Operations or whatever. Since the CRL address for a cert is included as part of the cert, at time of cert issuance a responsible issuer will “shard” the CRL. At time of writing, the CRL for https://blog.brycekerley.net is at https://r12.c.lencr.org/119.crl , and hosts 5908 revocations in 217kb. (Presumably there’re also CRLs 118, 120, etc.) This trades off some locality for CRL checking (since checking the CRL for a new cert is less likely to hit a CRL you’ve seen recently) but makes it possible to revoke 17,321,684 certs without making a 650,000,000 byte file.
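The numbers in that paragraph hang together if you do the arithmetic. A back-of-the-envelope sketch, assuming “217kb” means 217 KiB and ignoring the CRL’s fixed overhead (header, signature), so the per-entry figure is a slight overestimate:

```python
import math

observed_entries = 5908           # revocations in the one shard quoted above
observed_bytes = 217 * 1024       # assumption: "217kb" means KiB
affected_certs = 17_321_684

# Rough cost of one revocation entry in a CRL.
bytes_per_entry = observed_bytes / observed_entries

# Size of a single unsharded CRL holding every affected cert.
single_crl_bytes = affected_certs * bytes_per_entry

# How many shards of the observed size it would take instead.
shards_needed = math.ceil(affected_certs / observed_entries)

print(f"{bytes_per_entry:.1f} bytes per entry")
print(f"{single_crl_bytes / 1e6:.0f} MB as one CRL")
print(f"{shards_needed} shards of about 217 KiB each")
```

So the choice is between one file in the 650 MB range or a few thousand files that are each small enough to fetch without noticing.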
It also trades off some privacy. If the certs for example.invalid, cdn.example.invalid, and api.example.invalid are all in the same CRL as ten thousand other domains, it doesn’t give away much. If the certs for those sites are in three different CRLs out of a few dozen, a client requesting them at the same time would allow a CA to see what sites a client hits16 Is this a bloom filter shaped problem? Normally you hash entries and set those bits hot in the filter, and then use that to check for membership later. Checking to see what bits hit an entry isn’t really feasible if there’re a zillion entries, but going through your logs and user by user seeing which ones hit a specific site should be doable? Coming soon to a programming test during an interview near you… .
Revoking an intermediate cert invalidates every subscriber cert it signed. This is probably the right thing to do in this case, but similar to Entrust, Microsoft didn’t want to inconvenience their customers by representing the public interest.
When we all forgot how to count to one hundred three million, four hundred sixty-nine thousand, seven hundred sixty-four
The initial count given to the public was 103,469,764. In the delrev incident it was revised down to 100,322,979.
We see numerical oddities continuing down the delrev incident. The first cut of the June 20 numbers had the trademark sloppy copy-paste work, with the previous week’s “the number of certificates that have not yet been revoked” number. After Wayne (relying party, mentioned above in the Entrust section) asks about the numbers on June 27, Microsoft corrects the figure, and starts describing the numbers differently, possibly with some effect on the semantics of the numbers.
It persists though; on November 8, Wayne again asks for clarification on the last couple months of numbers and the final batch size. Microsoft’s answer uses “estimate” and “variance” enough that Zacharias, presumably another relying party, immediately asks “How did you underestimate the size of the final batch?” Microsoft answers:
We underestimated the size of the final batch because the CRL drop-off rates were estimations rather than exact calculations (drop off rates are particularly difficult to precisely project given the volume of certificates and frequency of revocations). The Certificate Revocation List reached the 15 MB operational limit mid-week, and did not drop as quickly as projected, which prevented additional revocations until entries age out and space becomes available.
I’m baffled by all of this. There just aren’t so many certificates that a computer can’t count them. There’re a hundred million certs affected by this event, too many for an Excel file sure but I wouldn’t think twice about spinning up a cloud VM or database with my employer’s money to analyze these data.
For this kind of delayed revocation event, CAs should probably be expected to provide a standardized set of cert counts, possibly in a table that just gets longer weekly instead of an error-prone post that gets copy-pasted?
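To make the idea concrete, here’s one hypothetical shape for a row in that table: every affected cert lands in exactly one bucket as of the report date, so the buckets always add up to the total and there’s nothing to copy-paste wrong. The field names, bucket names, and sample data are all invented for illustration; no root program requires this format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional

BUCKETS = ("revoked", "expired_unrevoked", "outstanding")


@dataclass
class Cert:
    serial: str
    not_after: date
    revoked_on: Optional[date] = None


def status(cert: Cert, as_of: date) -> str:
    """Put a cert in exactly one bucket as of the report date."""
    if cert.revoked_on is not None and cert.revoked_on <= as_of:
        return "revoked"
    if cert.not_after < as_of:
        return "expired_unrevoked"
    return "outstanding"


def weekly_row(certs: list[Cert], as_of: date) -> dict:
    """One table row; the three buckets sum to the total by construction."""
    counts = Counter(status(c, as_of) for c in certs)
    row = {"as_of": as_of.isoformat(), "total": len(certs)}
    row.update({bucket: counts.get(bucket, 0) for bucket in BUCKETS})
    return row


sample = [
    Cert("01", date(2025, 12, 1), revoked_on=date(2025, 7, 1)),
    Cert("02", date(2025, 6, 1)),
    Cert("03", date(2025, 12, 1)),
]
print(weekly_row(sample, date(2025, 8, 1)))
```

Run against the real issuance database once a week, this is a query, not an estimate.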
Slow Rolling
On August 1, 2025, Microsoft confirmed that they were prioritizing the revocation of certs that would expire sooner, which would cycle them out of the CRLs faster, but also mean that invalid certificates linger longer.
August 6, 2025, JR Moir (who I believe to be a relying party) comments:
At what point do we realize Microsoft have no intent to meet revocation timeline and just slow-walk revocation. More are expiring than revoked. CRL size is simply excuses.
Microsoft continues to spend several months making weekly reports of revocations and expirations, but on November 14, 2025, one day before their planned completion date, they suddenly decide that actually they need four more months to revoke the remaining 85,620 certificates.
Did they fill up their CRLs in a way that 85,620 invalid certificates would be left out in the cold for four months after successfully revoking 15,136,769?
Wayne probes further; by this point in my understanding of the situation it really seems like Microsoft doesn’t have a handle on the certificates they issued and that they’re planning to revoke.
At no point in any step of this process should estimates be involved, all of the information is at your disposal. The most generous read I can give is that 90% of the CRL list was set aside for this revocation and the other 10% was pushing past projections. Again, something that could be communicated in this incident in a clear and transparent manner, but I am instead stuck with guessing as to what the CA has been doing these past 7 months.
When Will it End?
At time of writing, it’s February 9, 2026. (Footnote 17: At time of publishing, it’s March 8, 2026 and this situation is dumber and more baffling than it was a month ago. Since this issue is still going (and I would argue going badly for everyone including the public) I’m going to keep this section as a snapshot of early February.) Microsoft’s delrev incident still has 8000 certs live, with a month planned to revoke them. Not only can Microsoft not count their certificates, they’ve also demonstrably been unable to count the eight action items in their weekly reports. Maybe their final report in March 2026 or maybe July 2026 or whenever the fuck whatever will have some answers, or maybe it’ll just be AI slop like what they’ve been accused of posting over the course of the incident.
And as relying parties, what does any of this mean for us? Chrome seemingly hasn’t poked at the incident since September (when they asked about the gap between planned and actual revocations, and Microsoft eagerly responded by raising their CRL limit from 10MB to 15MB), and they’re basically the only party that effects positive change in the WebPKI world.
Mitigating This Mess
What happens when a root program decides to distrust a CA? The current state of the art used by Chrome in the Entrust incident is to, for a given CA, set an upper bound on the Signed Certificate Timestamp (SCT) that certs have from Certificate Transparency (CT) logs. This value comes from third parties outside the CA’s trust boundary, which means a CA can’t just backdate a cert.
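As a rough sketch of that check (not Chrome’s actual code; the CA identifier, the cutoff date, and the function name are all made up for illustration), the decision could look something like this in Python. It assumes SCT timestamps arrive as milliseconds since the Unix epoch, which is how RFC 6962 encodes them, that the earliest SCT on a cert is the one that counts, and that a cert with no SCTs at all gets rejected:

```python
from datetime import datetime, timezone

# Hypothetical cutoff table: certs from this CA are only accepted if a CT log
# saw them on or before the cutoff. The CA name and date are invented.
SCT_CUTOFFS = {
    "example-distrusted-ca": datetime(2025, 1, 1, tzinfo=timezone.utc),
}


def is_trusted(ca_id: str, sct_timestamps_ms: list) -> bool:
    """Decide whether a cert survives an SCT-based distrust of its CA."""
    cutoff = SCT_CUTOFFS.get(ca_id)
    if cutoff is None:
        return True  # this CA has no SCT restriction
    if not sct_timestamps_ms:
        return False  # assumption: no SCTs means no proof of issuance time
    # The timestamps come from CT logs, not from the CA, so the CA can't
    # backdate them the way it could backdate notBefore.
    earliest = datetime.fromtimestamp(min(sct_timestamps_ms) / 1000, tz=timezone.utc)
    return earliest <= cutoff
```

Existing certs logged before the cutoff keep working until they expire, while anything the CA issues afterward is dead on arrival.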
I think there’re more opportunities for the system to act on behalf of relying parties.
More CA Restrictions
Different places have different rules about how business works online, and that’s okay. Country A might require a CA to issue the government a cert for a subscriber that’s outside their remit, in order to spy on user traffic. CT does mitigate this to an extent (as long as someone’s looking?), but root programs could also restrict a CA in Country A to subscribers in that country.
CAs that are demonstrably incompetent at revoking certs could stand to be in a short term penalty box too.
I would argue that CAs shouldn’t operate root programs. That’s how you get a conflict of interest that lets a too-big-to-fail root program defend an incompetent CA.
Future Certificate Restriction
HTTPS has a Strict Transport Security feature (abbreviated “HSTS” for “HTTP Strict Transport Security”) that allows a site to declare that it’ll be HTTPS only for the next x seconds. This prevents browsers from falling back to HTTP even if an attacker blocks HTTPS connections. (Footnote 18: This is generally a hardcoded value, often 31,536,000 seconds/one year or more, with the effect that every single request resets the clock to a year in the future.)
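To make the “resets the clock” behavior concrete, here’s a minimal sketch of how a client might turn a Strict-Transport-Security header value into an expiry time. The function name is made up, and real browsers handle more than this does (includeSubDomains, preload lists):

```python
from datetime import datetime, timedelta, timezone


def hsts_expiry(header_value: str, received_at: datetime):
    """Return when the HTTPS-only promise lapses, or None if no max-age is given."""
    for directive in header_value.split(";"):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age":
            # max-age is a count of seconds from when the response was received
            return received_at + timedelta(seconds=int(value.strip('"')))
    return None
```

Every HTTPS response carrying the header gets a fresh `received_at`, so with a one-year `max-age` the expiry always lands a year past the most recent visit.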
There might be value in a website operator providing a declaration that certificates for the next x seconds will be OV or EV, from specific CAs, that kinda thing. This would prevent an attacker that can compromise the website’s DNS from getting a useful cert through the much simpler Domain Validation.
Subscriber CAs
Let’s say I have a bunch of ephemeral cattle servers running example.com, each named something like safdjkl67.web.example.com, that get set up and torn down pretty often. (Footnote 19: “cattle” is when there’s a collection of bulk-managed computers with random names that you don’t care about, as opposed to “pets” that you name after Star Trek ships.)
As it is today, they either have to:
- share the same cert and private key that get passed to them on initialization
- do a dance where they each get a different cert and private key but also each take up space in a centralized CT log
The secret passing and CT bloat could be solved with a “Subscriber CA.” Through the normal ACME process, instead of just a cert for “web server authentication,” a coordinator node for the herd of cattle could get a CA cert for example.com. With that, the coordinator could give each cattle a valid example.com leaf cert without any private keys having to go over the network.
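How that CA cert gets scoped to example.com is an open question here. One plausible mechanism, and this is an assumption rather than part of the proposal, is an X.509 name constraint, where a DNS constraint of example.com covers that name and everything under it. A minimal sketch of that matching rule, with a made-up function name:

```python
def within_scope(leaf_name: str, ca_domain: str) -> bool:
    """True if leaf_name is ca_domain itself or a subdomain of it."""
    leaf = leaf_name.lower().rstrip(".")
    base = ca_domain.lower().rstrip(".")
    # Match on whole labels so "badexample.com" doesn't pass for "example.com".
    return leaf == base or leaf.endswith("." + base)
```

A relying party would apply a check like this to every leaf the coordinator signs, so a compromised coordinator still couldn’t mint a cert for someone else’s domain.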
ACME Renewal Information
ACME Renewal Information (ARI) exists! It’s good!
The normal ways for an ACME client to get a certificate are:
- the admin schedules a recurring task (a “cron job” in UNIX world) to run the ACME client; monthly, weekly, daily
- the admin runs it once and then runs it again when users complain
In the first case, a sudden revocation or switch to a lifetime shorter than the check interval might fall through the cracks. The second case is all crack.
ARI enables an ACME client that gets up-to-date certs on behalf of a subscriber to ask if it should go through the process of getting a new cert yet. It’s designed to be cacheable and cheap for a CA to offer. It’s an interesting way to bridge the gap between longer-lived certs that have the risk of missing a revocation, and short-lived certs that require frequent re-issuance (which adds CT log load).
The ideal way to implement a client for this seems to be putting it somewhere it always runs, like embedded into the subscriber’s web server software. (Footnote 20: I checked Caddy, the front-end server I like to use because it does all the HTTPS stuff for you without having to think about it, and it doesn’t document ARI support.)
- At server start, check ARI for all the certs you have lying around. If you’re currently in or after the suggestedWindow for renewal, renew them immediately.
- Hang on to the suggestedWindow and Retry-After header, and let your ARI thread sleep until it needs to check something again based on Retry-After or suggestedWindow, whichever is sooner.
- On your next check, if the suggestedWindow is now, renew. (Footnote 21: there may be value in trying to renew before your suggested window. This LE forum thread has mholt, a Caddy developer, describing production deployments that require cert renewal scheduling days out to avoid running into LE rate limits.)
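The scheduling decision in those steps can be sketched in a few lines of Python. This assumes the ARI response body has already been fetched and decoded into a dict whose suggestedWindow holds RFC 3339 start and end timestamps, and that the Retry-After header has already been parsed into seconds. The name next_ari_action and its return shape are made up for illustration, and the sketch skips refinements like picking a random moment inside the window:

```python
from datetime import datetime, timedelta, timezone


def parse_rfc3339(ts: str) -> datetime:
    """Parse a timestamp such as '2026-02-10T00:00:00Z'."""
    # Older Python versions reject a trailing 'Z', so swap it for an offset.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def next_ari_action(renewal_info: dict, retry_after_s: int, now: datetime):
    """Return ("renew", 0.0) or ("sleep", seconds_until_next_check)."""
    start = parse_rfc3339(renewal_info["suggestedWindow"]["start"])
    if now >= start:
        # In or after the suggested window: renew immediately.
        return ("renew", 0.0)
    # Otherwise wake up at Retry-After or the window start, whichever is sooner.
    until_window = (start - now).total_seconds()
    return ("sleep", min(float(retry_after_s), until_window))
```

A long-running server would call this at startup for every cert it holds, then again each time the sleep expires, re-fetching the renewal info first so a CA that moves the window up (say, ahead of a mass revocation) gets noticed.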
ARI seems like a nice compromise between the always-renewing short-lived world and checking monthly that leaves you vulnerable to a mass revocation that happens right after you get a fresh new cert.
Auditing
Certificate Authorities are required to be audited. There are a couple different auditing schemes but they both require CAs to hire their own auditors, introducing a very standard conflict of interest with very standard problems. Additionally, audits are a lagging indicator that won’t stop a problem from occurring as it happens, but merely identify problems that happened over the year since the last audit.
How many auditors are going to say “thanks for the money, but also you suck and shouldn’t be in this line of business because of something you messed up a year ago?”
What Is To Be Done?
The burning question here is what can we, as a bunch of random disorganized relying parties, do to make certificate authorities do their job?
I’m not entirely sure. The separate realms of Bugzilla (which is where CAs are required to confess their sins), the CABF mailing list, and Mozilla’s dev-security-policy list are hard to find and get into. As someone who only started posting on any of these shockingly recently, it doesn’t seem like they’re particularly effective for relying parties unless the Chrome root program happens to cast their Eye of Sauron in the right direction.
It also feels bad to blow up those spots with a thousand Hacker News dipshits and redditors asking the slop machines to make their too-wordy screeds 5% more indignant.
Maybe it just requires more posts summarizing CAs’ shortcomings to the aggregators?
Maybe all the CAs will learn how to do their jobs?
(audience laughter)
That really is it though. As an outsider, it really seems like the system has historically put the financial security of CAs first, which demonstrably led to CAs putting the convenience of subscribers above the safety of relying parties.
WebPKI subscribers may need to make some infrastructure changes. My recommendations are:
- Quit using world-facing web servers that can’t integrate with ACME. If it’s an appliance that can’t, Office Space that mf and find someone that can configure a web server.
- Move internal stuff to a private CA that you provision your internal machines with. Pushing a CA file into the right spot on a Linux box isn’t difficult at all with Ansible, and I think Windows AD stuff can do that too.
Root programs should probably yank more CAs, or at least build new tools to incentivize the ones that suck to fix themselves.
If you’re a technically sophisticated relying party that starts removing CAs from certificate stores on your own devices, all you’ll do is get inconvenienced. All you can really do is complain.
If you’re not a technically sophisticated relying party, you made it to the bottom of this post and you kind of are now?
Thanks for reading!
Credits Roll
Thanks to the following people for a ton of useful advice and editing notes:
- Ian Spence
- Josef Diago
- Josh Yotty
- Lily ‘zap!’ Baker
- Mike Shaver
- r4v5
- all those anonymous cyberpunks who fight against injustice and corruption every day of their lives