
The Life And Times Of HTTPS


Today I gave a talk titled "The Life And Times Of HTTPS" to HackSoc, the Computer Science society of the University of York. Below is the video of the talk, as well as an (almost, apologies for any inaccuracies!) full transcript.

In late October 2010, a Firefox plug-in called Firesheep was released. It had a very simple premise: walk into a cafe, connect to the (unsecured) Wi-Fi network, start it up, and watch. If anyone else connected to that Wi-Fi network and logged into their Facebook, Google, or Twitter account, or did anything over an unencrypted connection, as most connections were in those days, Firesheep would sniff out the credentials sent over the air, and with just a click of a button you were logged into Facebook as the random passer-by whose only crime was to connect to a public Wi-Fi network.

But when was the last time you worried about typing your login details on a public Wi-Fi network? And should you be? The answers to those questions are “not in the past few years probably” and “not really”. And that’s because of HTTPS. Today, ten years after Firesheep, Firefox’s telemetry reports that over 80% of all connections it makes are encrypted, and most of the websites you use on a daily basis have completely switched off unencrypted connections.

The question I want to answer in this talk is “how did we get here?”. And I think the best way of doing that is to start from the very beginning, the early 1990s.

The core principle of HTTP, the protocol that underpins most of the internet, is simple. You open a connection to a website, send a request, it sends back the data for a page, and closes the connection. When you go to another page, you repeat this dance. The problem is that your requests and the server’s responses are just bytes routed across a network. Anyone else on that network can see those bytes, and anyone between you and the server can intercept them and tamper with them - both your request as well as the page data. This is not a great state of affairs. Let’s fix that.
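
To make that concrete, here is a minimal Python sketch of what a plain HTTP exchange looks like on the wire - every byte of it, request and response, visible to anyone along the path:

```python
import socket

# Open a TCP connection and send a bare HTTP/1.1 request as plain bytes.
with socket.create_connection(("example.com", 80)) as sock:
    sock.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")
    response = b""
    while chunk := sock.recv(4096):
        response += chunk

# Anyone on the network path sees exactly these bytes - and could have changed them.
print(response[:300].decode(errors="replace"))
```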

The solution is simple: wrap it all in an encrypted tunnel. Encrypt the connection to the server, and only exchange data over this encrypted connection. But when you encrypt, you need a key - how do the client and server know each other’s keys? This would only work if the client and server knew each other’s keys beforehand, which might work in a small computer lab, but is not scalable to a worldwide network.

A possible solution here is to use asymmetric encryption - basically there are two keys, a public key and a private key - and something encrypted with the public key can only be decrypted with the private key. So when you open a connection, the client asks the server for its public key, and encrypts all of its messages to the server with that public key - which only the server can decrypt. Let’s also have the client generate its own keypair, and send its public key to the server, to ensure the server’s responses can’t be snooped on either. Eagle-eyed viewers may notice that there’s nothing to stop an attacker from tampering with the public key - but that won’t do them much good, as all that’ll do is ensure the client and server can’t talk at all, which is a bit useless.
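
As a rough illustration of the asymmetric idea (a sketch using the PyCA cryptography library - this is not how TLS actually frames its messages, just the core concept):

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

# The server's keypair: the public half can be handed out to anyone.
server_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
server_public = server_key.public_key()

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# The client encrypts its request with the server's *public* key...
ciphertext = server_public.encrypt(b"GET /secret-page HTTP/1.1", oaep)

# ...and only the holder of the matching *private* key can read it.
assert server_key.decrypt(ciphertext, oaep) == b"GET /secret-page HTTP/1.1"
```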

But there’s still a problem here, which is that if an attacker records the traffic to a server, and later steals its private key, it can decrypt all historical traffic, which is quite bad. This is known as Forward Secrecy - or in this case, the lack thereof.

What is used nowadays is actually symmetric encryption, but with a key that’s generated anew for every connection, and thrown away once that connection is done. That way, even if the server’s private key is stolen, the hacker can’t decrypt the prior connections.

So that’s how the keys are derived, and our connection is secured. There’s a fatal flaw here, though - in fact, there are two. One I’ll come onto in just a second, but the other, arguably bigger one, is that we forgot to define a threat model.

A good security engineer always comes up with a threat model before designing a security system - it’s basically an understanding of what types of attacks the system has to defend against. Our threat model in this case is a very broad one - we’re assuming an attacker with complete control of everything between us and the server. In fact, the specification for TLS, the protocol used by HTTPS, states: “The TLS protocol is designed to establish a secure connection between a client and a server communicating over an insecure channel. This document makes several traditional assumptions, including that attackers have substantial computational resources and cannot obtain secret information from sources outside the protocol. Attackers are assumed to have the ability to capture, modify, delete, replay, and otherwise tamper with messages sent over the communication channel.” (RFC 5246, Appendix E)

(A note on terminology - the protocol that web encryption uses used to be called SSL, but got renamed to TLS due to trademark issues. HTTPS is just HTTP over SSL/TLS.)

If we look at that threat model, the fatal flaw that pops out is authenticity. While we’ve ensured that connections between us and the server are encrypted and cannot be tampered with, there’s nothing to stop an attacker with control of the network from redirecting our requests to a server pretending to be the website we want.

To fix this, we essentially need someone else to vouch for the identity of the website. Who might that be? The web browser vendor? No. The point of the World Wide Web was decentralisation, and putting the browser vendor at the centre of deciding who’s legit isn’t the best of ideas. Instead, we introduce another player, the Certificate Authority. So instead of having the browser vendor decide who to trust, it delegates that to a number of companies, who then issue certificates to trustworthy* websites. (Trustworthy has an asterisk next to it - I’ll get onto that.)

It works a little bit like this: the server’s administrator generates a private key, which they keep secret, and a public key. They send that public key to the Certificate Authority, or CA. The CA verifies that the admin really controls the site they say they run, then attaches to the website’s public key a bit of information about that site, about itself, and an expiration date, and signs all of that with its own private key. This signed bundle is called a certificate, and the CA gives that certificate to the website admin. (The CA also has a certificate of its own, called a root certificate, tied to its private key.)
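
Here is a toy, self-contained sketch of that issuance step using the PyCA cryptography library - the names (“Example CA”, example.com) and the 90-day lifetime are just placeholders, and a real CA of course signs inside an HSM from an intermediate rather than in a script:

```python
from datetime import datetime, timedelta
from cryptography import x509
from cryptography.x509.oid import NameOID
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa

ca_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)    # the CA's secret
site_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)  # the admin's secret

certificate = (
    x509.CertificateBuilder()
    .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "example.com")]))
    .issuer_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Example CA")]))
    .public_key(site_key.public_key())          # the website's public key being vouched for
    .serial_number(x509.random_serial_number())
    .not_valid_before(datetime.utcnow())        # the validity window...
    .not_valid_after(datetime.utcnow() + timedelta(days=90))
    .add_extension(x509.SubjectAlternativeName([x509.DNSName("example.com")]), critical=False)
    .sign(ca_key, hashes.SHA256())              # ...all signed with the CA's private key
)
```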

Then, when anyone connects to that website, the server sends them this certificate. The browser checks that the key that signed it is from the root certificate of a CA that it trusts, the certificate is not expired, and it’s valid for the website the browser thinks it’s connecting to. If anything seems amiss, it stops trying to connect and warns the user.
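
Those checks are what every TLS client library performs for you under the hood. A minimal sketch with Python’s standard ssl module, connecting to a real site:

```python
import socket
import ssl

# create_default_context() loads the root certificates your OS or browser vendor ships.
ctx = ssl.create_default_context()

with socket.create_connection(("example.com", 443)) as sock:
    # The handshake here checks that the certificate chains to a trusted root,
    # hasn't expired, and is valid for the hostname we asked for - any failure
    # raises an ssl.SSLCertVerificationError instead of giving us a connection.
    with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
        print(tls.version())                 # e.g. "TLSv1.3"
        print(tls.getpeercert()["subject"])  # who the certificate says we're talking to
```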

The server and browser then somehow agree on a key to use to encrypt the connection - there are a number of ways of doing this, the most common these days being what’s known as Ephemeral Diffie-Hellman, which I don’t have time to explain here, but look it up - it’s insanely clever. Finally, once the browser has authenticated the server, and both sides have negotiated a key to use, the connection becomes encrypted and you get the green padlock in your browser - or at least, you used to. More on that later.
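
For the curious, here is a heavily simplified sketch of an ephemeral exchange using X25519 (one of the curves modern TLS supports), via the PyCA cryptography library - real TLS binds in the handshake transcript and much more, so treat this purely as an illustration of the idea:

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import x25519
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each side generates a throwaway ("ephemeral") keypair for this connection only.
client_eph = x25519.X25519PrivateKey.generate()
server_eph = x25519.X25519PrivateKey.generate()

# They swap public keys; each combines its own private key with the other's public
# key and arrives at the same shared secret - an eavesdropper who only sees the
# public keys can't reconstruct it.
client_shared = client_eph.exchange(server_eph.public_key())
server_shared = server_eph.exchange(client_eph.public_key())
assert client_shared == server_shared

# The shared secret is fed through a key-derivation function to get the symmetric
# session key; the ephemeral keys are discarded once the connection ends, which is
# what gives you forward secrecy.
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"illustrative session key").derive(client_shared)
```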

The lynchpin of this protocol is the private key: it must be kept secret, or the jig is up. In case the private key is compromised (or the CA later finds out that it mis-issued the certificate), the certificate can be revoked by the CA, telling browsers that that certificate should not be trusted. At least, in theory. Again, more on that later.

The CA’s root key must be kept even more secret, because if an attacker controls it, they can issue a certificate for any website, ever, trusted by all major browsers - and if it were to get revoked, it could take months or even years to get all browsers to trust the CA again. To keep them safe, the root keys are kept in Hardware Security Modules, or HSMs, which are purpose-built devices that let you sign things with a private key without having access to the private key - and the HSMs containing the root keys are kept offline, in a safe, in a vault, with access subject to strict auditing rules.

But, we haven’t quite solved the problem yet. We’ve punted the trust problem over to the CAs, but how do we know which CAs to trust? At the time of writing, Firefox trusts 148 root certificates from 50 different CAs - how do Mozilla know they’re all doing things right, and how can new CAs get trusted?

This is where an organisation known as the CA/Browser Forum comes in. As the name suggests, the CA/B Forum is a meeting of the main browser vendors and the CAs they trust. Together they produce a document known as the Baseline Requirements, which are the rules that all CAs have to follow if they want to be trusted by a browser.

The BRs are 96 pages long and specify, in precise detail, exactly how the certificate lifecycle is meant to work, how CAs are allowed to validate domain control, how key material must be stored, when certs need to get revoked, and so on - and to ensure that CAs are still complying with those rules, they must undergo a yearly audit by a licensed auditor. If a CA is found to be misbehaving, or misissuing certificates, it can be distrusted, meaning all websites using its certificates show a red warning message.

Distrusting isn’t an option that’s taken lightly, but it has been exercised in the past - the two most notable cases being WoSign and Symantec. I’m not going to go into the details of why they were distrusted - you can read all about that on the Mozilla wiki - but the long and short of it is that there were a great many issues, both technical and managerial, which led Mozilla’s and Google’s CA policy leads to no longer trust WoSign or Symantec, and gradually remove Firefox’s and Chrome’s trust in their newly issued certificates.

There’s still a fundamental flaw here, but to fully understand the problem we need to look at two case studies. Recall that a CA’s private key is stored offline in an HSM, so someone who hacks the CA can’t get the private key - but that won’t stop them from issuing fraudulent certificates.

Enter DigiNotar. DigiNotar was a Dutch CA - note the tense. In July 2011 they “noticed an intrusion”, which resulted in the fraudulent issuance of a number of certificates, including ones for google.com. DigiNotar revoked all the fraudulent certificates - except one, which went unnoticed until August 28th 2011, over a month later, when a Google Chrome user in Iran noticed an alert for an insecure connection to google.com. This was possible not because there was something wrong with the certificate per se - it was an entirely valid certificate, trusted by the system - but because Chrome implements something called certificate pinning, where it has a hardcoded list of root certs that it trusts for Google domains. DigiNotar was very much not on that list, so Chrome raised an alert. The certificate was revoked a day later, but the damage was done - and, to make matters worse, DigiNotar had not told anyone about the initial hack, and now that the cat was out of the bag they revealed that anywhere between 247 and 531 certificates had been issued by the attackers - for domains like Yahoo, Mozilla, WordPress, and The Tor Project.

The browsers could now no longer be sure whether any DigiNotar certificate was trustworthy, because it might have been issued during the hack, so they decided to revoke trust in DigiNotar’s entire hierarchy. To make matters even worse, DigiNotar was one of the five CAs in the Dutch government’s public key infrastructure programme, and revoking trust in DigiNotar meant that many Dutch government websites became inaccessible. As you might imagine, when you can no longer trust a company whose business model is built entirely on trust, that company can’t exist for much longer.

Let’s take a look at another case study, but before that, another bit of explanation. For a CA, keeping the root key secure is extremely important, as if it gets compromised your entire hierarchy is gone. So issuing certificates directly off of the root is far too risky. Instead, CAs create what are known as intermediate certificates, which have the power to issue other certs, known as end-entity certificates, which are the ones that are installed on websites. These intermediates are also kept secure, because losing one means an attacker can issue any certificates they want and have them be trusted by browsers, but the standards are slightly less strict than for the root, because the loss of the root is an extinction-level event for a CA - so, still in HSMs, but not in a physical safe this time.

So, back to the case study. On December 24 2012, Chrome detected, again through its pinning mechanism, an unauthorised certificate issued for google.com. Chrome alerted Google’s security team, who discovered that this cert was issued by an intermediate linked to a Turkish CA called TurkTrust. It turns out that back in August 2011 TurkTrust had mistakenly issued two intermediate certificates, when they should have been end-entity certificates. The owner of one of these certificates noticed that it was an intermediate rather than an end-entity certificate and notified TurkTrust, who revoked it. The other was issued to the municipal government of Ankara, where the sysadmin installed it on a CheckPoint firewall.

Remember that part of the principle behind HTTPS is that nothing between you and the server you’re connecting to can decrypt your connection - which includes firewalls and proxies, which some organisations may not like - think banking or government, for example. CheckPoint firewalls allow you to get around this by giving them a certificate and private key, which they will then use to intercept all HTTPS connections and re-encrypt them using that certificate, while being able to analyse the traffic. This is normally meant to be done using a private root, trusted only by the people who work at that organisation - but it seems the Ankara admin used the mistakenly issued TurkTrust intermediate, which had the power to generate certificates without needing to be installed on end user computers. So this firewall generated a certificate for google.com, which Chrome’s pinning picked up.

Both of these case studies highlight what is both a huge benefit and a fatal flaw of the CA ecosystem: any CA can issue a certificate for any domain. This means that you aren’t locked in to a particular CA for your domain, but it also means that there’s no way for you to control, or even know, which certificates are issued for your domain.

Both of these case studies were picked up by Chrome’s pinning mechanism, so you might be saying “why doesn’t everyone use pinning?” And indeed, there is a standard for this, called HTTP Public Key Pinning, or HPKP. The only issue is that HPKP is considered deprecated, and most CAs explicitly warn against using pinning. The reason for this is that it makes replacing a certificate difficult, if not outright impossible. Suppose your CA detects that it made a mistake in its processes and needs to revoke your certificate, which is rare but not at all unheard of. The trouble is that, to the browser, there’s no way to tell the difference between you legitimately replacing your site’s certificate, and an attacker impersonating your site with their own certificate - and, since pins are valid for up to a year, some people will be unable to access your site unless they manually reset the pin. Chrome doesn’t have this problem, because it can change the set of valid pins through a software update, but this doesn’t scale at all.

But let’s take a step back. Do we really need to control which certificates are valid for our site? Maybe just knowing about misissued certificates is enough, so we can get them revoked? This is where Certificate Transparency comes in.

Certificate Transparency, or CT, is a system, designed by Google security engineers, to make it near-impossible to issue a certificate for any site without at least someone knowing about it. It works like this: a number of key players in the Web ecosystem run CT logs - Google runs several, Cloudflare runs a few, DigiCert and Sectigo (two major CAs) run their own. Then, when a CA issues a certificate, it adds it to one or more of these logs, effectively telling the world that the certificate exists. Any certificate trusted by the major browsers needs to be added to these logs; for example, Chrome’s policy is that it’ll only accept a certificate that’s in at least one Google and one non-Google log.

These logs use a data structure known as a Merkle tree - no, not that kind of Merkel - which, without getting into too much cryptographic detail (you can read about it if you’re interested) prevents modifying or removing an entry in the tree once it’s been added without invalidating the rest of the tree - a bit like a blockchain, just without all the scams.
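
For a feel of why tampering is detectable, here is a toy Merkle-root calculation in Python - a simplification, since the real Certificate Transparency tree hash (RFC 6962) uses distinct leaf/node prefixes and a different rule for odd nodes:

```python
import hashlib

def merkle_root(leaves):
    """Hash every leaf, then repeatedly hash adjacent pairs until one root remains."""
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:          # odd leftover node: duplicate it (one common convention)
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]

entries = [b"cert for example.com", b"cert for another.org", b"cert for a-third.net"]
root = merkle_root(entries)

# Change (or drop) any entry and the root no longer matches - which is what lets
# auditors catch a log quietly rewriting history.
assert merkle_root([b"tampered"] + entries[1:]) != root
```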

To avoid having to scan every single log to check that a certificate is valid, when a certificate is added to a log, the log returns what’s known as a Signed Certificate Timestamp - essentially, a signed promise to include the cert in the log within some period of time. These SCTs are attached to the certificate, and used by browsers to validate the cert. This is a potential problem, in that a log can give out an SCT but not actually add the cert to the log, but in practice this gets spotted quickly enough, and the standards to get a log trusted in browsers are very high, almost as high as for root certs themselves. The final players in the ecosystem are auditors, servers which verify that the logs are behaving properly in every way. Essentially, if a certificate is in a CT log, there’s no way to not know about it, and since about 2018 inclusion in logs has been a prerequisite for trust in all the major browsers.

There’s still two more problems I want to address though.

First, let’s go back to revocation. When a CA revokes a certificate, it declares to the world that the certificate should no longer be trusted, for one reason or another, and that browsers should refuse to accept it.

The only problem is that revocation checking, as a concept, is fundamentally broken. That’s a strong statement, so let me explain.

There are two ways for a CA to tell a browser that a given certificate is revoked. The first is certificate revocation lists, or CRLs. They’re exactly what they sound like - a list of every cert issued by a given intermediate that’s since been revoked. Most certs have a URL to a CRL embedded in them, and periodically the browser downloads it to check. The problem with CRLs is that, as the number of certificates in existence grows, so will the size of the CRL. After a particularly damaging vulnerability called Heartbleed was disclosed (the details are outside the scope of this talk, but well worth reading about), tens of thousands of certificates needed to be revoked. Once they were all added to CRLs, one CA, GlobalSign, saw its CRLs grow by 4.5MB - which doesn’t sound like much, but, one, consider networks in developing countries, and two, the revocation generated 40Gbps of traffic to GlobalSign, which, if the CRL were stored on Amazon S3 (just as a benchmark), would cost $900,000.

The other way is OCSP, the Online Certificate Status Protocol. OCSP works the other way round - instead of the CA giving the browser a list of revoked certificates, the browser asks the CA each time it connects to a website whether its certificate is revoked. This solves the CRL size problem. However, both CRLs and OCSP have a fatal flaw - and to understand it we need to go back to the TLS threat model.
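
Both mechanisms are advertised inside the certificate itself. A small sketch using the PyCA cryptography library (whether a given site’s certificate actually carries a CRL URL, an OCSP responder, or both varies by CA):

```python
import ssl
from cryptography import x509
from cryptography.x509.oid import AuthorityInformationAccessOID, ExtensionOID

# Fetch a live site's certificate and see where its CA publishes revocation information.
pem = ssl.get_server_certificate(("example.com", 443))
cert = x509.load_pem_x509_certificate(pem.encode())

try:
    crl = cert.extensions.get_extension_for_oid(ExtensionOID.CRL_DISTRIBUTION_POINTS)
    for point in crl.value:
        for name in point.full_name or []:
            print("CRL:", name.value)
except x509.ExtensionNotFound:
    print("No CRL distribution point in this certificate")

try:
    aia = cert.extensions.get_extension_for_oid(ExtensionOID.AUTHORITY_INFORMATION_ACCESS)
    for desc in aia.value:
        if desc.access_method == AuthorityInformationAccessOID.OCSP:
            print("OCSP responder:", desc.access_location.value)
except x509.ExtensionNotFound:
    print("No Authority Information Access extension in this certificate")
```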

Recall that, at the start of the talk, I said that HTTPS is meant to work on an untrusted network, where an attacker can tamper with anything. Now suppose they tamper with the route to the CA’s CRL or OCSP server, and kill all connections to it, so your browser can’t check if a certificate is still valid. It has two options. It can assume that the cert is invalid, and refuse to connect - which is bad, because the attacker has just killed your internet. Or it can assume that the cert is valid until proven otherwise - which is also bad, because now the attacker can use revoked certs without a care in the world. And this isn’t a theoretical problem - I bet you’ve been on a network that does exactly this hundreds of times. Think back to every time you connect to a public Wi-Fi hotspot and need to sign in before you can surf the web - that’s an example of a network that does exactly what I’ve described.

Revocation checking is so fundamentally broken that Chrome hasn’t bothered with it since as early as 2012. So what’s the alternative? Well, for high-value compromised certs, both Chrome and Firefox have a way to deploy a blocklist via software updates - Google calls it CRLSets, while Mozilla calls theirs OneCRL. But this isn’t remotely scalable. There’s another approach known as OCSP Stapling, where the web server returns an OCSP response along with its certificate - but, due to backwards compatibility, it’ll never reach the levels of adoption it needs to be effective. It turns out that revocation checking is one of the big unsolved problems of this whole ecosystem - but there is a way around it, which I’ll get to in a minute.

We still have one major problem to discuss, though.

Historically, SSL certs have cost money, which has been a major blocker to adoption - until around 2008 only banking and e-commerce sites had encryption, because it wasn’t seen as necessary for everyone else, and it was only in 2010 that Gmail switched to HTTPS by default. Firesheep, the extension I mentioned way back at the beginning, was a major wake-up call to businesses, but Twitter only went HTTPS by default in February 2012, and SSL certs were still out of reach of many smaller businesses, and those that could afford them didn’t see it as being worth the effort. There was a company called StartCom that issued free SSL certificates, but their website was an utter pain in the neck to use (I remember using a StartSSL cert to secure my early personal website), and they used to charge $25 to revoke a cert, which is a major security yikes, especially when the Heartbleed issue came to light and tens of thousands of certificates had to be revoked. Furthermore, StartCom was acquired by another CA called WoSign - which, due to a spate of management issues, was distrusted by all the major browsers. So, for the longest time, SSL was out of reach for most small websites.

And then came Let’s Encrypt. Let’s Encrypt is a CA that went out of beta in April 2016, that had a completely different business model to all other CAs to date. Let’s Encrypt is a non-profit, supported by Mozilla and the Linux Foundation, that issues certs valid for 90 days, automatically, for free. To say this was a seismic shift is an understatement. Suddenly, any administrator anywhere could get a certificate for free, by just running a command on their server. Let’s Encrypt issued its first 10 million certificates in September 2016, and in February of this year that number hit a billion. Meanwhile, the percentage of page loads using encryption in 2014, according to Mozilla telemetry, was around 25% - in October 2020 that number is closer to 80%.

And, since replacing certificates is completely automated, Let’s Encrypt can get away with reducing the lifetime of certificates to just three months - which is a way around the revocation checking problem - because even though you can’t check if a certificate is revoked, if it’s only valid for three months you’ve massively reduced the blast radius of a compromise. It doesn’t solve the problem, but it does mitigate it.

As you can imagine, Let’s Encrypt has faced quite a bit of opposition - the majority of it from CAs, whose business model is very much being threatened here. Comodo, one of the bigger CAs out there, tried some rather shady tactics back in 2016, filing a trademark for “Let’s Encrypt” and “Let’s Encrypt with Comodo” - that didn’t go too well for them. Quite a number of other CAs have been accused of sowing fear, uncertainty, and doubt about Let’s Encrypt’s practices. They do raise one criticism that’s worthy of discussion, though - which is, surely all these green padlocks for any website in the world are going to be abused by phishers and scammers?

In my view, that’s a false premise. The green padlock has never meant that the website you’re accessing is not scamming you, only that that site’s operator really does control the domain. I couldn’t get a certificate for paypal.com, for example, but there’s nothing stopping me from getting a cert for paypal-totally-not-a-scam-site-give-me-your-money.com. And that is how it should be - it should be up to the browser to decide whether a site is safe, not the CA. Trying to set a “scam site / not scam site” threshold is a losing battle, and quite rightly Let’s Encrypt haven’t bothered - they do have a list of high value targets, for which they will not issue certificates, but that list is relatively small.

What does need to be done is fighting the perception that “padlock = legit”, and Google’s taking a somewhat unconventional approach to this - by getting rid of the padlock altogether.

It’s been doing this slowly - first by removing the explicit green “Secure” text, then making unencrypted sites red and “Not Secure”, then making the green padlock gray - and eventually, removing it altogether, finally marking the era of HTTPS By Default. But let’s be honest, how many of us check that padlock before typing in our passwords? Research shows, not that many.

There are so many things I’ve simplified and glossed over in this talk. I haven’t even mentioned why Extended Validation really isn’t as useful as it sounds, the work being done to prevent snoopers from learning which websites you visit, and how Apple has strong-armed the industry into reducing certificate lifetimes from ten years to just one. I really do encourage you to read up on all of these, because there’s so much to learn about this ecosystem that we all use on a daily basis and never stop to think about.

If you’re watching this talk live, we’ll have some time at the end for questions on Discord - if you’re watching the recording, feel free to chuck questions or hate mail to marks@markspolakovs.me, or @markspolakovs on Twitter. This has been The Life And Times Of HTTPS, I’ve been Marks Polakovs, thank you very much.