https://letsencrypt.status.io/pages/incident/55957a99e800baa4470002da/5a55777ed9a9c1024c00b241
-
#1 jaas:
Josh from Let's Encrypt here. I'm not able to give many more details yet, but here's what I can add now:
1) This isn't a relatively simple issue like a bug in our CA code would be. It's an interaction between the protocol and provider services.
2) Disabling TLS-SNI is a complete mitigation for us, meaning it's no longer possible to get an illegitimate certificate from Let's Encrypt by exploiting this issue.
3) We have not yet reached a conclusion as to whether or not the TLS-SNI challenge will need to remain disabled permanently.
4) At this point we have no reason to believe that the vulnerability has been exploited by anyone other than the researcher who figured it out and reported it to us.
Our focus now is on sharing information with relevant parties and looking for less drastic mitigations that might allow us to restore the TLS-SNI challenge option to people who rely on it.
We will, of course, share more information as soon as we can. That might be as soon as the next few hours; things are moving quickly.
-
#2 pfg:
Interesting, definitely looking forward to the details, and great to see Let's Encrypt react this quickly even though this might cause a small amount of disruption to users.
The latest ACME draft - mostly referred to as what will become ACME v2, which Let's Encrypt supports on the staging environment as of a few days ago - has a slightly revamped version of the TLS challenge (tls-sni-02). The TLS-SNI challenge works roughly like this: The validation (CA) server sends a "fake" SNI hostname, generated by the CA server, to the IP behind the domain the CA is trying to validate. Domain control is assumed to be given if the server responds with a certificate that contains the CA-generated hostname in its SAN extension (where certificates store the domains and other identifiers they're valid for).
One of the concerns people had with tls-sni-01 is that it made it possible for a TLS server to "solve" such a challenge by effectively echoing back the requested SNI value blindly. This was changed in tls-sni-02 - just taking the SNI value and putting it in the SAN field is no longer enough to pass such a challenge. Until now, there was no reason to believe anyone was running TLS servers that showed this behaviour, so there was no real rush to deprecate tls-sni-01 right away (as opposed to just rolling out ACME v2, which only has tls-sni-02). I wonder if someone's found a lot of TLS servers that turned out to do this, or if there's some other vulnerability in the design or implementation.
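To make the echo concern concrete, here is a minimal Python sketch. It is a simplified model, not the CA's code: no real TLS is involved, a "server" is just a function from an SNI name to the list of SAN names in the certificate it would present, and all function names are hypothetical. The name derivations follow my reading of the ACME drafts (SHA-256 hex digests split into two labels under acme.invalid) and should be treated as an assumption.

```python
import hashlib


def _split_hex(digest_hex):
    # A SHA-256 hex digest is 64 characters; split it into two 32-char labels.
    return digest_hex[:32] + "." + digest_hex[32:]


def sni01_name(key_authorization):
    # tls-sni-01 style: one name derived from the key authorization (assumed form).
    z = hashlib.sha256(key_authorization.encode()).hexdigest()
    return _split_hex(z) + ".acme.invalid"


def sni02_names(token, key_authorization):
    # tls-sni-02 style: one name from the token, a second from the key authorization.
    a = hashlib.sha256(token.encode()).hexdigest()
    b = hashlib.sha256(key_authorization.encode()).hexdigest()
    return (_split_hex(a) + ".token.acme.invalid",
            _split_hex(b) + ".ka.acme.invalid")


def echo_server(sni):
    # Hypothetical misbehaving server: presents a cert whose only SAN is the SNI it was sent.
    return [sni]


def validate_sni01(server, key_authorization):
    # The CA sends the derived name as SNI and expects it back in the SAN list.
    name = sni01_name(key_authorization)
    return name in server(name)


def validate_sni02(server, token, key_authorization):
    # The CA sends only name A as SNI but requires both A and B in the SAN list.
    san_a, san_b = sni02_names(token, key_authorization)
    sans = server(san_a)
    return san_a in sans and san_b in sans


print(validate_sni01(echo_server, "token.thumbprint"))
print(validate_sni02(echo_server, "token", "token.thumbprint"))
```

In this model the echoing server passes the 01-style check but fails the 02-style one, because the second name is never sent over the wire and so cannot be echoed.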
-
#3 mholt:
Just wanted to jump on this for Caddy users [1]:
> Until further notice, when starting Caddy, we recommend using the '-disable-tls-sni-challenge' flag. This will require either HTTP or DNS challenges to be functional in order to renew your certificates.
By default, Caddy randomly chooses either the HTTP or TLS-SNI challenge to obtain and renew certificates. Your sites will likely not go offline even if you do not use this flag because Caddy tries up to 2 times per day, 30 days out, to renew an expiring certificate, as long as you keep it running. The chance that it would choose TLS-SNI sixty times in a row is extremely low. (We -- meaning myself and many people who contributed their feedback and code -- thought about these kinds of scenarios and Caddy is prepared to handle them.) However, since the TLS-SNI challenge will fail 100% of the time while it is disabled on the server end, you might as well have the client not even try it.
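For a sense of scale, the sixty-in-a-row figure can be worked out directly. This is a back-of-the-envelope sketch that assumes each attempt picks between the two challenges independently and with equal probability; the comment says "randomly" but does not state the distribution, so the 1/2 is an assumption.

```python
from fractions import Fraction

ATTEMPTS = 60               # 2 tries per day over 30 days, as described above
P_TLS_SNI = Fraction(1, 2)  # assumption: uniform, independent choice per attempt

# Probability that every single attempt picks TLS-SNI.
p_all_tls_sni = P_TLS_SNI ** ATTEMPTS

print(p_all_tls_sni)         # exact fraction
print(float(p_all_tls_sni))  # a little under 1e-18
```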
Also note that all certificate maintenance routines are logged to the process log, so be sure you always run Caddy with the '-log' flag in production so you can see what's going on.
Since this outage may be temporary, check back later about re-enabling it. I recommend having more than one way to perform verifications when possible. (For Go programmers, the xenolf/lego library [2] supports all verification methods -- and is being upgraded for ACMEv2 currently; Sebastian is doing an awesome job! It also supports numerous DNS providers for easy setup of the DNS challenge.)
One more thing: wait for a full report from Let's Encrypt rather than speculating. Most questions can't be answered until there's more information. I don't think there's anything you need to do, no alarms to raise... just use another verification method until we get more info.
[1]: https://twitter.com/caddyserver/status/950926718004428800
-
#4 jaas:
We've now posted more details about the issue and our plans.
https://community.letsencrypt.org/t/2018-01-09-issue-with-tl...
-
#5 sk5t:
For certbot-nginx plugin users, I've had success with --webroot authentication:
For nginx: location ^~ /.well-known/acme-challenge/ { default_type "text/plain"; root /home/www/letsencrypt; }
Then, for SELinux: chcon -Rt httpd_sys_content_t /home/www
Reload nginx and 'certbot renew --webroot -w /home/www/letsencrypt' has a fighting chance.
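Before running the renewal, it can help to confirm that nginx really serves files from the challenge directory. The following is a small Python sketch for that check, not part of certbot; the helper names and the probe file name are made up for illustration, and the webroot path is the one from the commands above.

```python
import os
import urllib.request

CHALLENGE_DIR = ".well-known/acme-challenge"


def write_probe(webroot, name="probe.txt", body="ok"):
    # Create <webroot>/.well-known/acme-challenge/<name> with a known body.
    directory = os.path.join(webroot, CHALLENGE_DIR)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write(body)
    return path


def probe_url(domain, name="probe.txt"):
    # Plain-HTTP URL under which the probe file should be reachable.
    return "http://%s/%s/%s" % (domain, CHALLENGE_DIR, name)


def fetch_probe(domain, name="probe.txt", timeout=10):
    # Fetch the probe over HTTP; raises on connection errors or non-2xx status.
    with urllib.request.urlopen(probe_url(domain, name), timeout=timeout) as resp:
        return resp.read().decode()
```

Calling write_probe("/home/www/letsencrypt") and then fetch_probe with your own domain should return "ok" if the location block and the SELinux context are in effect; remove the probe file afterwards.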
-
#6 tialaramex:
Background history that might be helpful here:
The http-01 proof of control as originally defined allowed you to use HTTPS instead for the URL. This was never enabled in production because many bulk web hosts had a configuration where, if anyone (say Let's Encrypt) asked for https://not-ssl-enabled.customer1.example/blah, the bulk host's server would send over the answer for https://aaaa.ssl-enabled.customer2.example/blah, because it just picked the alphabetically first SSL-enabled name as the default instead of giving an error for no match.
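The fallback behaviour described here can be modelled in a few lines of Python. This is a toy illustration, not any real web server's logic: the function name and the second hostname (shop.customer3.example) are hypothetical, and the other names are the ones used above.

```python
def pick_vhost(requested_name, ssl_vhosts):
    # Serve the exact match if the name is SSL-enabled.
    if requested_name in ssl_vhosts:
        return requested_name
    # Otherwise fall back to the alphabetically first SSL-enabled name
    # instead of returning an error for "no match".
    return min(ssl_vhosts)


ssl_vhosts = {
    "aaaa.ssl-enabled.customer2.example": "customer2 content",
    "shop.customer3.example": "customer3 content",
}

# A validation request for customer1's name is answered by customer2's site.
answering = pick_vhost("not-ssl-enabled.customer1.example", ssl_vhosts)
print(answering, "->", ssl_vhosts[answering])
```

Whoever controls the default name on such a host could therefore answer HTTPS validation requests for other customers' names.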
The Ten Blessed Methods don't say not to do this, but Let's Encrypt did not want a service which can be trivially exploited on common bulk hosts so they disabled it as they've now done for tls-sni-01.
I suspect a researcher has found a configuration that similarly causes an attacker to be able to pass tls-sni-01 for names using some shared infrastructure such as the same CDN or same web hosting as the attacker.
[Let's Encrypt posted a follow-up to their Discuss outlining exactly the above scenario but without the historical digression a few minutes after I wrote this]
-
#7 icing:
To the badass people who get their LE certificates with Apache mod_md: you have chosen well!
mod_md checks the challenge list from the ACME server and chooses one that it supports. So, if your server listens on port 80, everything will continue to work. You do not need to change anything.
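The selection step can be pictured roughly like this. It is a Python sketch of the general idea only, not mod_md's actual code (mod_md is written in C); the function name and the preference order are assumptions.

```python
def choose_challenge(offered, supported):
    # Walk the client's supported types in preference order and take the
    # first one the ACME server is currently offering.
    for challenge_type in supported:
        if challenge_type in offered:
            return challenge_type
    return None  # nothing usable was offered


# With tls-sni-01 no longer offered, a client that also supports http-01 falls back to it.
print(choose_challenge(["http-01", "dns-01"], ["tls-sni-01", "http-01"]))
```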
If your server is only reachable via port 443, there currently seems to be no way to sign up with Let's Encrypt. You will need to open port 80 for certificate renewal/signup to work. Some advice:
* Port 80 needs to be available only during a renewal/signup. Once you have your certificates, you may close it again. You then need to keep track of renewal periods and should check your server logs more frequently.
* You can safely redirect your port 80 to 443 with the 'MDRequireHttps' configuration directive. This redirection automatically takes care that challenges from an ACME server are still answered while all other requests are redirected.
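The effect of redirecting everything except ACME challenges can be sketched as a routing rule. This Python snippet only illustrates the behaviour described in the last bullet; it is not how mod_md implements it, and the function name is hypothetical.

```python
ACME_PREFIX = "/.well-known/acme-challenge/"


def route_port80(host, path):
    # Challenge requests are answered directly over plain HTTP.
    if path.startswith(ACME_PREFIX):
        return ("serve", path)
    # Everything else is sent to the HTTPS version of the same URL.
    return ("redirect", "https://%s%s" % (host, path))


print(route_port80("example.org", "/.well-known/acme-challenge/abc123"))
print(route_port80("example.org", "/index.html"))
```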
In case you find issues or have additional questions, visit the github repository at https://github.com/icing/mod_md and file an issue.
-
#8 lunaru:
The shutdown of tls-sni-01 doesn't affect the http-01 challenge, so the workaround is to switch your code over to the latter if this is affecting you.
We're using Greenlock (https://github.com/Daplie/node-greenlock, previously node-letsencrypt via npm) for our app (https://Clearalias.com) and this library supports switching challenges fairly easily. It's even easier if you're just using an Express server, since you can use a Node library like Greenlock-express (https://github.com/Daplie/greenlock-express, previously known as letsencrypt-express), which makes it dead simple to use http-01.
Best of luck to anyone who's scrambling to fix their cert layer right now. It seems like there's a chance the TLS-SNI challenge stays disabled, so it's best not to hold your breath and instead quickly switch to a different challenge mode if you get a chance.
-
#9 lawl:
I'm actually glad they openly admit there's an issue when there's an issue. Waiting for the full report.
-
#10 jchw:
That's a pretty bad blow... A lot of Go software relies on the TLS-SNI-01 challenge, I believe. Will TLS-SNI-02 be a viable replacement? What should be done about servers currently using TLS-SNI-01?
-
#11 rconti:
I'm pretty annoyed, because when I first started using Let's Encrypt I was hamstrung by their restrictions on the various "automated" methods of deploying and renewing certs. I went with tls-sni because it was the least-bad method for my use case.
I listened with an open mind to their justification of the extremely short 90 day max cert length period in the "automate all the things world". The sysadmin in me was skeptical, even though I have also fought the "crap, we haven't renewed this cert in years and nobody knows how to do it anymore!" emergencies over the years, and understand how renewing frequently could, at least in theory, replace that problem with a lesser problem.
But, turns out my skepticism was justified. Thankfully I'm not using it in production yet, but too often these new projects and paradigms suffer too much from "what could possibly go wrong?" thinking, and you have to follow all the right forums and keep all your configurations in mind to know when a problem like this will bite your infrastructure. I only stumbled across this randomly while checking my Let's Encrypt community account for something else.
Now, granted, this just happened yesterday, but I missed the HN thread on it when it happened, which means I could well have missed it until it's too late and a bunch of certs expire. Then it's a scramble to fix your certs AND fix your automation all at once.
-
#12 ams6110:
Looks like Ted Unangst was right?
-
#13 SureshG:
Nice, acme4j v2 (Java client) already disabled this - https://github.com/shred/acme4j/blob/master/README.md#known-...
-
#14 komuW:
Last year I made a letsencrypt client[1] that only supports DNS mode of validation.
It currently only supports cloudflare and AuroraDNS, but it is very easy to use any other DNS provider[2]
1. https://github.com/komuw/sewer
2. https://github.com/komuw/sewer#how-to-use-a-customunsupporte...
-
#15 jcassee:
Unfortunately, this means that Traefik's default Let's Encrypt integration (without setting a DNS provider) does not work anymore. Although the logs now say "could not find solver for: http-01", they actually use tls-sni-01.
-
#16 Buge:
Are there any servers that automatically generate self signed certs for any SNI they receive? I think servers like that (if they exist) would also be vulnerable.