In this article, I am going to explain how HTTPS works, what it protects and what it does not protect. Then, in a future article, I will show you how to set up HTTPS on your website in only 5 minutes.
A small ‘S’ can change everything…
Two years ago, I noticed that I was receiving notifications for comments I had not posted on the lefigaro.fr website. After finding the comments, reading them and asking myself when I had written them, I realized that someone had hacked into my account! At that time, I had no idea how my nemesis did it… so I just changed my password and moved on. How naive I was!
A year ago, Google released a feature in its browser to show a “Not secure” label on all websites using HTTP instead of HTTPS. Even today, lefigaro.fr is still flagged as “Not secure”.
Thank you for the notice Google, but it is too late for me.
However, until I recently learned the differences between HTTP and HTTPS, I was unaware of all the dangers of using an insecure website.
Today, more than 45% of the one million most visited websites are still using HTTP, and that puts you at risk!
Everything my nemesis could have done
A question that is almost never asked is “Why do hackers hack?”. My answer is that they have something to gain from it. In my case, my nemesis got inside my account and posted a few “funny” comments. He must have had a laugh or two while doing it.
However, he could have done much worse! He had access to my account which means he had all of the data that was accessible within the site: name, email address, postal address, age, etc. The more a hacker knows about you the easier it is for him to discover your passwords through social engineering.
When you log in on an HTTP website, a hacker can see your login and password. Since 52% of people reuse their passwords, the hacker gains access not only to the small forum site that uses HTTP but possibly also to your email, social media or even bank account.
Moreover, when you use an HTTP website, everything the server sends back to you is also readable and can be modified on the way. A hacker can then easily alter the pages you receive. For example, a company could inject ads into a website for its own gain, as shown in this article.
Nowadays, if someone has access to all of that, they could control your entire life. Stories about people that had their life destroyed in a second by a hacker are not hard to find. That is why you should always be cautious about what you do, specifically on insecure websites, because the effect that it can have on you is enormous.
Now that we have a small idea of what my nemesis could have done after hacking into my account, I wanted to understand how he did it so that it never happens again.
How my nemesis hacked me
If a website uses HTTP instead of HTTPS, any information transmitted through the network can be seen and used against you. If you take a look at the image above, you can see that when you log in to your account on a website, the hacker sees your password when you use HTTP, whereas with HTTPS he cannot even tell that you are sending a password because everything is encrypted.
That said, you may wonder why he has access to the data that you are transmitting. Well, when you are using public Wi-Fi, everyone connected to it can receive the same packets as you because the packets are broadcast through the air. Your computer works out which packets are meant for it and which are not, and it does this using the MAC address. If you want to know more, check out this article.
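To make this concrete, here is a minimal sketch of what a login request looks like on the wire when a site uses plain HTTP. The host `forum.example`, the `/login` path and the form field names are hypothetical, chosen only for illustration; a real site will use its own.

```python
from urllib.parse import urlencode


def build_login_request(host: str, path: str, username: str, password: str) -> bytes:
    """Build the raw bytes of a form-based HTTP login request."""
    # The form fields are only URL-encoded, not encrypted.
    body = urlencode({"username": username, "password": password})
    lines = [
        f"POST {path} HTTP/1.1",
        f"Host: {host}",
        "Content-Type: application/x-www-form-urlencoded",
        f"Content-Length: {len(body)}",
        "Connection: close",
        "",  # blank line separating headers from body
        body,
    ]
    return "\r\n".join(lines).encode("ascii")


# Hypothetical credentials, for illustration only.
raw = build_login_request("forum.example", "/login", "alice", "hunter2")
print(raw.decode("ascii"))
```

Printing `raw` shows the password as readable text. These are exactly the bytes that travel over the network with HTTP, so anyone who can observe the traffic can read them.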
In the diagram above, you can see that computer 1 and computer 2 receive the information that is meant for computer 3. So if the hacker is computer 1, he can see the packets. He can collect your cookie, your session id, your token and much more. When you log in, you broadcast a packet to the router with your username and password. The hacker only has to listen to see your credentials.
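The filtering by MAC address described above can be sketched as follows. This is a simplified model, not how a real network card is programmed: the function name and the `promiscuous` flag are assumptions used for illustration, and the MAC addresses are made up.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"


def accepts_frame(destination_mac: str, own_mac: str, promiscuous: bool = False) -> bool:
    """Decide whether a network card keeps a frame it has received."""
    if promiscuous:
        # Filtering disabled: every frame that was received is kept.
        return True
    destination = destination_mac.lower()
    # Normal behaviour: keep only frames addressed to us or to everybody.
    return destination == own_mac.lower() or destination == BROADCAST


# Hypothetical addresses: computer 1 receives a frame meant for computer 3.
computer_1 = "aa:aa:aa:aa:aa:01"
computer_3 = "aa:aa:aa:aa:aa:03"
print(accepts_frame(computer_3, computer_1))                    # normally ignored
print(accepts_frame(computer_3, computer_1, promiscuous=True))  # kept when filtering is off
```

In other words, the packets physically reach every machine; it is only a filtering decision on each machine that keeps them private.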
However, when you use your computer you don’t get a popup every time your roommate goes on Facebook.
So, how did the hacker actually see the information?
An application allows you to do exactly that. Wireshark is an open-source packet analyzer often used for network troubleshooting. In our case, it is used to analyze the packets that were sent from the router: we call this sniffing.
Above is an example of what a hacker can see using Wireshark. The interface is not modern, but with the documentation it is an easy tool to use. However, knowing what to do with all of the information it gives you requires some knowledge of computer networks.
My nemesis must have used this tool to retrieve all of the information about my account. He only had to set an HTTP filter in the Wireshark interface to see only the packets sent with the HTTP protocol and get an easy look at everything I was doing.
To sum up, we now know what my nemesis was looking for and also how he did it. The only thing left is to make sure he never hacks me again. To get started on that, we first need to understand the basic concepts of HTTPS.
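To illustrate why such a filter is so effective, here is a small sketch that works on hard-coded payloads instead of a real capture. It is not how Wireshark is implemented; the function names, the sample payloads and the form field names are all hypothetical.

```python
from urllib.parse import parse_qs

HTTP_METHODS = (b"GET ", b"POST ", b"PUT ", b"DELETE ", b"HEAD ")


def looks_like_http_request(payload: bytes) -> bool:
    """Rough equivalent of an 'http' filter: keep payloads that start like a request."""
    return payload.startswith(HTTP_METHODS)


def form_fields(payload: bytes) -> dict:
    """Extract URL-encoded form fields from the body of a plain HTTP request."""
    _headers, _, body = payload.partition(b"\r\n\r\n")
    fields = parse_qs(body.decode("ascii", errors="replace"))
    return {name: values[0] for name, values in fields.items()}


# Two made-up payloads: the first stands for encrypted traffic, the second is plain HTTP.
captured = [
    b"\x16\x03\x01\x02\x00\x01\x00\x01\xfc\x03\x03",
    b"POST /login HTTP/1.1\r\nHost: forum.example\r\n\r\nusername=alice&password=hunter2",
]
http_only = [payload for payload in captured if looks_like_http_request(payload)]
for payload in http_only:
    print(form_fields(payload))
```

The encrypted payload is skipped because nothing in it is recognizable, while the HTTP payload gives up its fields with a few lines of parsing.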
How does HTTPS work?
The first question I asked myself was: what is HTTP, the HyperText Transfer Protocol? It is used to communicate information between a client and a server. It operates at the application layer of the OSI model. The OSI model is a standard that describes how computers communicate with each other. It is used to characterize protocols and better understand how they interact.
When using HTTPS, HyperText Transfer Protocol Secure, the data being transmitted in the packet is encrypted.
How is the data encrypted and how do computers decrypt the data received?
On top of HTTP, there is a protocol that secures the data being transmitted: TLS, Transport Layer Security. You will also see SSL, Secure Sockets Layer, when looking up the HTTPS protocol; it is simply the older predecessor of TLS. SSL was created by Netscape, a private company, and its successor is now maintained by the IETF, the Internet Engineering Task Force, as an open standard. This explains why the name changed from SSL to TLS.
This protocol is versatile: it is used not only for websites (HTTP) but also with some mail and file exchange protocols (IMAP, FTP, SMTP, etc.). For our purpose, the only thing we need to know is that the most popular browsers (Chrome, Safari, IE, Firefox, etc.) have an implementation of TLS which allows them to use HTTPS.
However, when you need to use HTTPS on your server to connect to APIs, you have to set up the connection yourself. Thankfully, there is a wide selection of libraries: HttpsURLConnection in Java, requests in Python, etc.
What do these implementations actually do?
The TLS protocol uses what we call in networks a “handshake”. First, the two parties agree on the encryption methods. Then, they securely exchange a secret known only to them. Finally, they use that secret to generate the keys that encrypt the data. I’ll try and explain those 3 parts below.
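As an illustration, here is a minimal sketch of an HTTPS client in Python. It uses only the standard library (`ssl` and `http.client`) rather than requests, so that it runs without installing anything. The helper names `make_tls_context` and `fetch_status` are my own, and requiring at least TLS 1.2 is an assumption about what a reasonable configuration looks like.

```python
import http.client
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Create a TLS configuration that verifies the server certificate and hostname."""
    context = ssl.create_default_context()
    # Assumption: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def fetch_status(host: str, path: str = "/") -> int:
    """Send a GET request over HTTPS and return the HTTP status code."""
    connection = http.client.HTTPSConnection(
        host, 443, timeout=10, context=make_tls_context()
    )
    try:
        connection.request("GET", path)
        return connection.getresponse().status
    finally:
        connection.close()
```

Calling `fetch_status` with the hostname of an HTTPS site performs the whole handshake described in the rest of this article before sending the request. If the certificate cannot be verified, the call raises an error instead of silently continuing.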
The “Agreement” phase (steps 1–2)
Step 1: In the schema above, you can see there are 8 steps to the handshake phase. The first one is the ClientHello message. The client sends a message to the server with a random number, a list of the versions of the TLS protocol it accepts and a list of the ciphers and compression algorithms it can use.
Step 2: After the server receives all of this information, it sends back the ServerHello message. It uses the data received to respond with another independent random number, the protocol version, and the cipher and compression algorithm chosen from the client’s preferences. It also initiates the session between the two parties and responds with the session id created.
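The agreement phase can be sketched as a small negotiation between two data structures. This is a simplified model and not the real wire format: the class and function names are hypothetical, compression is left out, and the selection rule (first client preference that the server also supports) is an assumption, since a real server may apply its own preference order.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class ClientHello:
    versions: list       # protocol versions the client accepts, preferred first
    cipher_suites: list  # cipher suites the client can use, preferred first
    random: bytes = field(default_factory=lambda: secrets.token_bytes(32))


@dataclass
class ServerHello:
    version: str
    cipher_suite: str
    session_id: bytes
    random: bytes


def respond(hello: ClientHello, server_versions: set, server_ciphers: set) -> ServerHello:
    """Pick the first client preference the server also supports."""
    version = next((v for v in hello.versions if v in server_versions), None)
    cipher = next((c for c in hello.cipher_suites if c in server_ciphers), None)
    if version is None or cipher is None:
        raise ValueError("no protocol version or cipher suite in common")
    # The server adds its own independent random number and a fresh session id.
    return ServerHello(version, cipher, secrets.token_bytes(32), secrets.token_bytes(32))


hello = ClientHello(
    versions=["TLS 1.2", "TLS 1.1"],
    cipher_suites=["TLS_RSA_WITH_AES_256_CBC_SHA", "TLS_RSA_WITH_AES_128_GCM_SHA256"],
)
reply = respond(hello, {"TLS 1.2"}, {"TLS_RSA_WITH_AES_128_GCM_SHA256"})
print(reply.version, reply.cipher_suite)
```

After these two messages, both sides know which protocol version and cipher to use, and each holds the other's random number. Those two random numbers come back later when the keys are generated.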
Exchange of secret (steps 3–6)
Step 3: Immediately after sending the ServerHello message, the server sends its Certificate. A certificate can be compared to a passport. It includes various pieces of information: the name of the owner, the domain, its public key, its validity dates, etc.
Step 4: With all of this data, the client can now authenticate the server. It checks whether the certificate is implicitly trusted or whether it was signed by a Certificate Authority (CA) that the client trusts. For example, YouTube uses Google Trust Services to sign its certificate. The client uses this certificate to make sure that the server is identified by an authority that it trusts, as shown in the diagram above.
For the next part you will need to understand the basics of asymmetric encryption. The primary notion of this encryption method is that you have a pair of keys: the public key, which can be communicated to anybody, and the private key, which is known only by you. Data encrypted with your public key can only be decrypted with your private key and vice versa. This article explains in detail how the keys are generated and why this method is secure.
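The idea of following a certificate up to a trusted authority can be sketched as below. Be aware that this is a toy model: a real client verifies a cryptographic signature at every link of the chain, which this sketch replaces with a simple comparison of names. The `Certificate` class, the `is_trusted` function and all the names in the example are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Certificate:
    subject: str     # who the certificate identifies
    issuer: str      # who vouches for the subject
    not_after: date  # last day of validity


def is_trusted(chain: list, trusted_roots: set, hostname: str, today: date) -> bool:
    """Toy check: names only, no signature verification."""
    # The first certificate must be the one for the site we are visiting.
    if not chain or chain[0].subject != hostname:
        return False
    # Each certificate must be issued by the next one in the chain.
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject:
            return False
    # No certificate in the chain may be expired.
    if any(cert.not_after < today for cert in chain):
        return False
    # The end of the chain must lead to an authority the client already trusts.
    return chain[-1].issuer in trusted_roots


chain = [
    Certificate("www.example.org", "Example Intermediate CA", date(2030, 1, 1)),
    Certificate("Example Intermediate CA", "Example Root CA", date(2035, 1, 1)),
]
print(is_trusted(chain, {"Example Root CA"}, "www.example.org", date(2024, 1, 1)))
```

The list of trusted roots is what ships with your browser or operating system, which is why the client can trust a site it has never visited before.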
Step 5: If the server cannot be identified, then an error is returned. However, if the server is authenticated, the client generates another random number and encrypts it using the public key sent by the server. This number is called the pre-master secret.
Step 6: Once created, the encrypted pre-master secret is sent to the server. Since the server is the only one with the private key, it is also the only one who can read the secret. This secret exchange uses RSA encryption. As you can see, this encryption method is quite useful for sending information between two parties that have not yet communicated with each other.
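Steps 5 and 6 can be illustrated with textbook RSA on tiny numbers. This is only a sketch of the principle: real keys are thousands of bits long and real TLS adds padding to the secret, both of which are left out here. The primes and the function names are chosen purely for illustration.

```python
import secrets

# Server side: a deliberately tiny key pair (never use numbers this small).
P, Q = 61, 53
N = P * Q                  # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                     # public exponent, sent inside the certificate
D = pow(E, -1, PHI)        # private exponent, never leaves the server


def encrypt_with_public_key(message: int) -> int:
    """What the client does: anyone who knows (E, N) can do this."""
    return pow(message, E, N)


def decrypt_with_private_key(ciphertext: int) -> int:
    """What the server does: only the holder of D can do this."""
    return pow(ciphertext, D, N)


# Client side: pick a random secret smaller than N and encrypt it.
pre_master_secret = secrets.randbelow(N - 2) + 2
sent_over_network = encrypt_with_public_key(pre_master_secret)

# Server side: recover the secret with the private key.
recovered_by_server = decrypt_with_private_key(sent_over_network)
print(recovered_by_server == pre_master_secret)
```

An eavesdropper sees only `sent_over_network`. Without the private exponent `D`, recovering the secret means factoring `N`, which is easy for 3233 but not feasible for real key sizes.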
Generation of the keys (steps 7–8)
However, encrypting and decrypting large amounts of data with RSA takes a long time. Thus, RSA encryption is only used to communicate the pre-master secret and not the full request payload.
How is the rest of the communication encrypted?
This is where the cipher algorithm chosen earlier is used. The encryption used is not going to be asymmetric but symmetric. Most browsers use AES encryption for the next part. It is a faster way to secure your data, used worldwide by many applications (WhatsApp, Signal, VeraCrypt, WinZip, etc.). The problem with this method is that both parties need to have the same secret key. Thanks to the RSA exchange of the pre-master secret, this problem is already solved.
Step 7: Both the server and the client now generate the session keys. Each side combines the pre-master secret with the two random numbers exchanged in steps 1 and 2, using the algorithm agreed on earlier. Both sides end up with the same session keys because they start from the same three values and apply the same algorithm.
Step 8: After that, the two parties each send a Finished message. This message is encrypted using the session keys to validate that the communication between the two has been correctly established.
Finally, the only step left is to send the data encrypted using the session keys. Only the client and the server are then able to read what is exchanged. Anyone listening will only see random bytes being communicated between the two parties.
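To show what "symmetric" means in practice, here is a toy cipher in which the same key both encrypts and decrypts. It is not AES and it is not secure: it is a stand-in built from SHA-256 because the Python standard library does not include AES. The function names are my own.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes (toy construction)."""
    stream = b""
    counter = 0
    while len(stream) < length:
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return stream[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts and decrypts: applying it twice with the same key returns the input."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


shared_key = b"known to both client and server"
ciphertext = xor_cipher(shared_key, b"username=alice&password=hunter2")
print(ciphertext)
print(xor_cipher(shared_key, ciphertext))
```

The operation is cheap, which is why symmetric encryption is preferred for the bulk of the traffic. The catch, as the text says, is that both sides must already hold `shared_key`.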
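Here is a sketch of how both sides can arrive at the same session keys. It follows the key derivation defined for TLS 1.2 with SHA-256, where the inputs are the pre-master secret and the two random numbers from the Hello messages. The key layout assumes an AES-128-GCM cipher suite and leaves out the initialization vectors; the `derive_keys` helper and its return format are my own.

```python
import hashlib
import hmac


def p_sha256(secret: bytes, seed: bytes, length: int) -> bytes:
    """Expand a secret into `length` bytes by chaining HMAC-SHA256."""
    output = b""
    a = seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        output += hmac.new(secret, a + seed, hashlib.sha256).digest()
    return output[:length]


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """Pseudo-random function of TLS 1.2: the label is prepended to the seed."""
    return p_sha256(secret, label + seed, length)


def derive_keys(pre_master_secret: bytes, client_random: bytes, server_random: bytes) -> dict:
    """Run by the client and by the server, each on its own machine."""
    master_secret = prf(
        pre_master_secret, b"master secret", client_random + server_random, 48
    )
    key_block = prf(master_secret, b"key expansion", server_random + client_random, 32)
    return {
        "master_secret": master_secret,
        "client_write_key": key_block[:16],    # encrypts client -> server traffic
        "server_write_key": key_block[16:32],  # encrypts server -> client traffic
    }
```

Nothing secret is sent during this step. Each side runs `derive_keys` with the values it already has, and since the function is deterministic, the results match.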
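The content of the Finished message can be sketched as well. In TLS 1.2 it carries a short value computed from the master secret and a hash of all the handshake messages seen so far, so each side can check that the other saw the same handshake and holds the same secret. The sketch below computes that value with SHA-256 but leaves out the encryption of the message itself; the helper names and the sample messages are hypothetical.

```python
import hashlib
import hmac


def prf(secret: bytes, label: bytes, seed: bytes, length: int) -> bytes:
    """Pseudo-random function of TLS 1.2 (chained HMAC-SHA256)."""
    output = b""
    a = label + seed
    while len(output) < length:
        a = hmac.new(secret, a, hashlib.sha256).digest()
        output += hmac.new(secret, a + label + seed, hashlib.sha256).digest()
    return output[:length]


def finished_verify_data(master_secret: bytes, label: bytes, handshake_messages: list) -> bytes:
    """12 bytes that depend on the secret and on every handshake message."""
    transcript_hash = hashlib.sha256(b"".join(handshake_messages)).digest()
    return prf(master_secret, label, transcript_hash, 12)


def check_finished(master_secret: bytes, label: bytes, handshake_messages: list, received: bytes) -> bool:
    """Recompute the value locally and compare it with what the peer sent."""
    expected = finished_verify_data(master_secret, label, handshake_messages)
    return hmac.compare_digest(expected, received)


# Placeholder values standing in for a real handshake.
master_secret = bytes(48)
messages = [b"ClientHello", b"ServerHello", b"Certificate", b"ClientKeyExchange"]
sent_by_client = finished_verify_data(master_secret, b"client finished", messages)
print(check_finished(master_secret, b"client finished", messages, sent_by_client))
```

If an attacker had altered any handshake message in transit, for example to force a weaker cipher, the two transcripts would differ and the check would fail.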
All my nemesis can do now is weep.
We can now put ourselves in the head of my nemesis. We know how he hacked my account and what he could have done afterwards. We also understand how HTTPS works. In the next article, coming soon, we will see how to set up HTTPS using OpenSSL in 5 minutes to stop my nemesis from hacking the users of my website. Finally, we will learn everything my nemesis could still do even though we use HTTPS.