Validate your passwords using Elixir and haveibeenpwned.com's API
Unless you’ve been living under a rock for the last couple of years, you probably know what two-factor authentication (2FA) is. It’s quite a neat trick actually - you have a password that you have to (obviously) enter correctly (first factor), but you also have to receive a second (random) code through a different medium, sometimes on a different device, that you have to enter to log in (second factor).
Now, obviously this adds quite a bit of overhead to logging in, but it adds a disproportionate value when it comes to security. If you work in any sort of organisation it was probably no surprise when you were asked to turn on 2FA for all your accounts. If you haven’t been asked to (or haven’t done it), it’s time to act ;)
But what about factor No. 1, the password? Did we give up on it?
Not really. But, for sure we had to become more vigilant and smarter when setting our passwords. Why? Allow me to explain.
Reader, meet haveibeenpwned.com #
Let me introduce you to haveibeenpwned.com. It’s a free resource for anyone to quickly assess if they may have been put at risk due to an online account of theirs having been compromised or “pwned” in a data breach. As you can imagine, to fulfil its purpose, this service also contains quite a long list of pwned passwords (about 500 million of them to be more precise), which are open for querying through a REST API.
If you want to learn more about the project, or its author, I suggest checking out the About page of the project.
Using the pwned passwords API #
This API allows us to check if any password is present in the haveibeenpwned database. This means that if you send an already pwned password, it will tell you that this password has been pwned and that it’s suggested to choose another one.
Imagine you have a website where people can set their passwords. Once the user has finished typing their new password, you can ping this service and check if the password they chose has been pwned before.
Now, if you are thinking along the lines of “are you telling me to send a plain-text password across the wire to some random API?” then you’re a step ahead, well done!
Sorry to disappoint, but no, actually I am not saying that. Instead of sending the whole password in plain-text, this API only requires the first 5 characters of the SHA-1 hash of the actual password.
In Elixir terms, that would look like:
:crypto.hash(:sha, "password")
|> Base.encode16
|> String.slice(0..4)
Interestingly, what it sends back is the remainder of every hashed password that starts with the 5 characters that you sent. Basically, this means if we take a SHA-1 of “password”:
iex(1)> :crypto.hash(:sha, "password") |> Base.encode16
"5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8"
We will send only 5BAA6 to the API, while in the response body we will receive a big list of strings that will represent the rest of the SHA-1, or in our example that would be 1E4C9B93F3F0682250B6CF8331B7EE68FD8.
Troy Hunt, who’s the author of haveibeenpwned, has written quite an extensive explanation on how this works - you can read it here.
Pinging the API #
For the purpose of this exercise, we will create a small Mix package that will encapsulate all of the behaviours. If you’re not familiar with how to create new packages using Mix, I suggest reading my article Write and publish your first Elixir library.
We will call our package Pwnex because somehow my brain always thinks that I have to mix up the main word (pwned) with Elixir to come up with a name.
Anyway, let’s create it:
› mix new pwnex
* creating README.md
* creating .formatter.exs
* creating .gitignore
* creating mix.exs
* creating config
* creating config/config.exs
* creating lib
* creating lib/pwnex.ex
* creating test
* creating test/test_helper.exs
* creating test/pwnex_test.exs
Your Mix project was created successfully.
You can use "mix" to compile it, test it, and more:
cd pwnex
mix test
Run "mix help" for more commands.
Now that we have the package bootstrapped locally, let’s open lib/pwnex.ex and add some documentation:
defmodule Pwnex do
  @moduledoc """
  Consults haveibeenpwned.com's API for pwned passwords.
  """

  @doc """
  Checks if a given password is already pwned.

  ## Examples

      iex> Pwnex.pwned?("password")
      {:pwned, 3_000_000}

      iex> Pwnex.pwned?("m4Z2fJJ]r3fxQ*o27")
      {:ok, 0}

  """
  def pwned?(password) do
  end
end
The moduledoc briefly explains the purpose of the package, while the doc explains the purpose of the pwned?/1 function and has two examples that we could use in the doctests.
Our little algorithm #
Let’s see what would be the steps to implement the Pwnex.pwned?/1 function:
def pwned?(password) do
  {hash_head, hash_tail} =
    password
    |> sanitize
    |> hash
    |> split_password

  hash_head
  |> fetch_pwns
  |> handle_response
  |> find_pwns(hash_tail)
  |> return_result
end
Once more - the pipeline operator in Elixir makes this function so clear and procedure-like that explaining feels a tad redundant. Still, here it is:
sanitize - we want to remove all leading and trailing whitespace from the password
hash - we want to convert the password to a hex-encoded SHA-1 hash
split_password - we want to split the hash into the head (the first 5 characters, which we will send to the API) and the tail (the rest of the SHA-1 hash)
fetch_pwns - we will send an API request to haveibeenpwned to get all (if any) pwns of the password
handle_response - depending on the response we will either get the body, or the reason for failure returned
find_pwns - we will take the response body, and because haveibeenpwned uses a k-Anonymity model we will need to find the actual match ourselves (if present)
return_result - will return the tuple which will contain a result atom and a pwns count
Let’s take a step by step approach and implement these functions.
Manipulating the password #
Let’s start easy. In sanitize, we want to trim leading and trailing whitespace, while in hash we want to turn the password into its hex-encoded SHA-1 hash. Taking the first five characters is left to split_password.
def sanitize(password), do: String.trim(password)
There isn’t much to explain here really. Instead of using sanitize we can use String.trim/1, but I prefer to have a separate function that we could extend and test for any edge cases.
defp hash(password) do
  :crypto.hash(:sha, password)
  |> Base.encode16
end
:crypto is an Erlang module that provides a set of cryptographic functions. Interestingly, it’s not part of the standard library, but it comes included in the distribution. One of the functions, as you can see in the code above, is hash/2, which takes the hashing algorithm as the first argument and the actual string to be hashed as the second argument. It returns the binary hash, which we can convert to hex by using Base.encode16.
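The pipeline also calls split_password, which is not shown in this article. A minimal sketch of what it could look like, assuming it simply cuts the hex string after the fifth character (the module name PwnexSplitSketch and the @prefix_length attribute are made up for this illustration):

```elixir
defmodule PwnexSplitSketch do
  # Number of leading hex characters the range API expects.
  @prefix_length 5

  # Hex-encoded SHA-1 of the password (Base.encode16/1 emits uppercase by default).
  def hash(password) do
    :crypto.hash(:sha, password) |> Base.encode16()
  end

  # Splits the 40-character hex hash into a {head, tail} tuple.
  def split_password(hash) do
    String.split_at(hash, @prefix_length)
  end
end

{head, tail} =
  "password"
  |> PwnexSplitSketch.hash()
  |> PwnexSplitSketch.split_password()

# head is "5BAA6", tail is "1E4C9B93F3F0682250B6CF8331B7EE68FD8"
IO.inspect({head, tail})
```

String.split_at/2 already returns a two-element tuple, which is exactly the shape the pattern match at the top of pwned?/1 expects.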
Sending request to an API #
I bet you’re thinking HTTPoison. Aren’t you? While I was writing this article I was also wondering whether we have to include a whole package just to do a simple GET request. You guessed it - we do not.
Although Elixir does not ship an HTTP client, Erlang does. And just like with :crypto, you can use Erlang’s HTTP client from Elixir using :httpc. This module provides the API to an HTTP/1.1 compatible client. I suggest giving its documentation a quick scan before we move on.
Let’s open up IEx and give :httpc a spin:
iex(1)> :httpc.request('https://api.pwnedpasswords.com/range/21FCB')
{:ok,
{{'HTTP/1.1', 200, 'OK'},
[
{'cache-control', 'public, max-age=2678400'},
{'connection', 'keep-alive'},
{'date', 'Sat, 22 Dec 2018 11:09:46 GMT'},
{'server', 'cloudflare'},
{'vary', 'Accept-Encoding'},
{'content-length', '19951'},
{'content-type', 'text/plain'},
{'expires', 'Tue, 22 Jan 2019 11:09:46 GMT'},
{'last-modified', 'Thu, 12 Jul 2018 01:32:06 GMT'},
{'set-cookie',
'__cfduid=d51115381191fd7bd0a003d466916efc41545476986; expires=Sun, 22-Dec-19 11:09:46 GMT; path=/; domain=.pwnedpasswords.com; HttpOnly; Secure'},
{'cf-cache-status', 'HIT'},
{'access-control-allow-origin', '*'},
{'arr-disable-session-affinity', 'True'},
{'cf-ray', '48d2235cbe93bf5c-AMS'},
{'expect-ct',
'max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"'},
{'strict-transport-security',
'max-age=31536000; includeSubDomains; preload'},
{'x-content-type-options', 'nosniff'},
{'x-powered-by', 'ASP.NET'}
],
'THEBODYISHERE\r\nOMGSOMUCHSTUFFHEREWHATISTHISEVEN' ++ ...}}
You see, although the output is pretty verbose, if you have ever sent an HTTP request via cURL or opened your browser’s debugging tools, there shouldn’t be any surprises here. The request/1 function will send a request to the pwnedpasswords API and return a ton of nested tuples, most of them being the response headers, plus the raw body of the response.
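One thing to keep in mind: :httpc lives in Erlang’s inets application and HTTPS relies on ssl, so both need to be running before a request can succeed. Here is a hedged sketch of a small wrapper, assuming we start the applications by hand (in a Mix project you would more likely list them under extra_applications in mix.exs). The module name PwnexHttpSketch is made up, and depending on your OTP version you may also need to pass TLS options, which this sketch leaves out:

```elixir
# Make sure the HTTP client and TLS support are running.
{:ok, _} = Application.ensure_all_started(:inets)
{:ok, _} = Application.ensure_all_started(:ssl)

defmodule PwnexHttpSketch do
  # Sends a GET request with :httpc.request/4.
  # `body_format: :binary` asks for the body as a binary instead of a charlist.
  def get(url) when is_binary(url) do
    :httpc.request(:get, {String.to_charlist(url), []}, [], body_format: :binary)
  end
end

# Usage (needs network access):
# PwnexHttpSketch.get("https://api.pwnedpasswords.com/range/21FCB")
```

With the binary body format the to_string step later on becomes unnecessary, although it does no harm.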
For our purpose, we can keep it simple. We are only interested in whether the function returns the :ok atom as the first item in the tuple, or :error. We can use pattern matching to do this:
iex(1)> {:ok, {_status, _headers, body }} =
:httpc.request('https://api.pwnedpasswords.com/range/21FCB')
So, let’s get back to our Pwnex.fetch_pwns/1 function. The function will receive the first 5 characters of the hashed password, send them to the API and return the response tuple, from which we will extract the body in the next step:
def fetch_pwns(head) do
  :httpc.request('https://api.pwnedpasswords.com/range/#{head}')
end
Handling the response #
The handle_response will actually be one function with three bodies:
def handle_response({:ok, {_status, _headers, body}}), do: body
def handle_response({:error, {reason, _meta}}), do: reason
def handle_response(_), do: nil
By using pattern matching we can have three types of function bodies. The first one will be invoked when the request succeeds (because it ignores _status, it matches any HTTP status, not only 200 OK), the second one when there’s an error and the third one for any other case.
As you can imagine, we could have used conditional logic here, but having the power of pattern matching allows us to have three tiny functions that are very easy to read, understand and test.
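One caveat with the three bodies above: as far as I can tell, a failed request ends up as a reason atom or nil, which then flows through find_pwns and return_result and comes out as {:ok, 0}, so a network error looks like a safe password. A hedged alternative, assuming we are willing to return tagged tuples and adapt the rest of the pipeline accordingly (for example with a with expression), could look like this. The module name and the :unexpected_status tag are made up for this sketch:

```elixir
defmodule PwnexResponseSketch do
  # Success: only accept HTTP 200 and keep the body.
  def handle_response({:ok, {{_http_version, 200, _reason_phrase}, _headers, body}}),
    do: {:ok, body}

  # The request went through, but the server answered with another status.
  def handle_response({:ok, {{_http_version, status, _reason_phrase}, _headers, _body}}),
    do: {:error, {:unexpected_status, status}}

  # The request itself failed (DNS, connection, TLS, ...).
  def handle_response({:error, reason}), do: {:error, reason}
end

ok = {:ok, {{~c"HTTP/1.1", 200, ~c"OK"}, [], ~c"003D68EB55068C33ACE09247EE4C639306B:3"}}
IO.inspect(PwnexResponseSketch.handle_response(ok))
IO.inspect(PwnexResponseSketch.handle_response({:error, :timeout}))
```

The trade-off is that callers now have to handle {:error, reason}, but they can no longer mistake an outage for a clean result.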
Parsing the response #
Here’s the body of the response:
003D68EB55068C33ACE09247EE4C639306B:3\r\n012C192B2F16F82EA0EB9EF18D9D539B0DD:1\r\n01330C689E5D64F660D6947A93AD634EF8F:1\r\n0198748F3315F40B1A102BF18EEA0194CD9:1\r\n01F9033B3C00C65DBFD6D1DC4D22918F5E9:2\r\n0424DB98C7A0846D2C6C75E697092A0CC3E:5\r\n047F229A81EE2747253F9897DA38946E241:1\r\n04A37A676E312CC7C4D236C93FBD992AA3C:5\r\n04AE045B134BDC43043B216AEF66100EE00:2\r\n0502EA98ED7A1000D932B10F7707D37FFB4:5\r\n0539F86F519AACC7030B728CD47803E5B22:5\r\n054A0BD53E2BC83A87EFDC236E2D0498C08:3\r\n05AA835DC9423327DAEC1CBD38FA99B8834:1\r\n05E0182DEAE22D02F6ED35280BCAC370179:4
If you look carefully you’ll notice that it’s actually a list of partial SHA-1 hashes separated by \r\n. With a closer inspection of the first one:
003D68EB55068C33ACE09247EE4C639306B:3
you notice that it’s actually a part of the hash, a colon : and a number (3 in the example above). This is actually the hash without its first 5 characters and the number of times that particular password has been pwned. This means that the password whose SHA-1 hash is 5BAA6003D68EB55068C33ACE09247EE4C639306B has been pwned 3 times, according to haveibeenpwned.com.
What we need to do with this response body is to split it into a list, which we will iterate over to find our matching hash. Let’s do that:
def find_pwns(response, hash_tail) do
  response
  |> to_string
  |> String.split()
  |> Enum.find(&(String.starts_with?(&1, hash_tail)))
end
Although find_pwns/2 might look a bit loaded, let me assure you it’s not. Let’s see what each of the lines does here:
- to_string converts the character list we receive from fetch_pwns to a string so we can parse it in the next steps
- String.split splits the string on whitespace, which includes the \r\n line endings, and creates a list of strings looking like: ["003D68EB55068C33ACE09247EE4C639306B:3", ...]
- Enum.find takes the list and a function as arguments. The list is the parsed list of hash tails and their pwns count, while the function calls String.starts_with?/2, which returns true when a line starts with the value of hash_tail.
That’s all. At the end, the find_pwns/2 function will return either the line that contains the matched hash tail or it will return nil.
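To see the lookup work without touching the network, here is a small offline sketch that runs the same steps against two lines of the sample body shown earlier. The module name PwnexFindSketch and the sample_body/0 helper are made up for this illustration, and as a slight variation it matches on the tail plus the colon so that only a full tail can match:

```elixir
defmodule PwnexFindSketch do
  # Two lines from the sample response, as a charlist like :httpc returns by default.
  def sample_body do
    ~c"003D68EB55068C33ACE09247EE4C639306B:3\r\n012C192B2F16F82EA0EB9EF18D9D539B0DD:1"
  end

  # Returns the matching "TAIL:COUNT" line, or nil when the tail is not present.
  def find_pwns(response, hash_tail) do
    response
    |> to_string()
    |> String.split()
    |> Enum.find(&String.starts_with?(&1, hash_tail <> ":"))
  end
end

body = PwnexFindSketch.sample_body()

# A tail that is in the sample body
IO.inspect(PwnexFindSketch.find_pwns(body, "012C192B2F16F82EA0EB9EF18D9D539B0DD"))

# A tail that is not
IO.inspect(PwnexFindSketch.find_pwns(body, "1E4C9B93F3F0682250B6CF8331B7EE68FD8"))
```

Note that the comparison is case-sensitive. That works here because Base.encode16 produces uppercase hex by default, which is also what the sample response uses.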
Returning a meaningful result #
Now that we have found the count of pwns for the hash (or just a nil), we need to handle that and return a meaningful tuple to the user of the module. When the find_pwns function does find a count, we want to return a tuple like {:pwned, count}. Otherwise, when find_pwns does not find a count it will return nil, which we handle in the second definition of the return_result function:
def return_result(line) when is_binary(line) do
  [_, count] = String.split(line, ":")
  # The count arrives as text, so convert it to an integer
  {:pwned, String.to_integer(count)}
end

def return_result(_), do: {:ok, 0}
In the first function body we will take the line, which should be a binary (string), split it at the : character, convert the count to an integer and then return the tuple with the count. In the second function body we take any argument (which in our case is nil) and return a tuple with 0 as the count.
Using Pwnex #
Now, let’s load Pwnex in IEx and give it a spin. To load it in IEx, you need to open the root of the project and run iex -S mix. This will open an IEx session and run mix in it, which will compile and load the module and make it available for invocation directly from IEx:
› iex -S mix
Erlang/OTP 21 [erts-10.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe] [dtrace]
Interactive Elixir (1.7.4) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> h Pwnex
Pwnex
Consults haveibeenpwned.com's API for pwned passwords.
iex(2)> Pwnex.pwned?("password")
{:pwned, 3533661}
iex(3)> Pwnex.pwned?("123!@#asd*&(*123SAkjhda")
{:ok, 0}
As you can see, password has been pwned about 3.5 million times, while 123!@#asd*&(*123SAkjhda never.
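Coming back to the website scenario from the beginning, here is a hedged sketch of how the result tuples could feed a password check. Everything in it is hypothetical: the PasswordPolicySketch module, the validate/2 function and the error message are made up, and a stub stands in for Pwnex.pwned?/1 so the example runs offline:

```elixir
defmodule PasswordPolicySketch do
  # `checker` is any one-argument function with the same return shape
  # as Pwnex.pwned?/1, i.e. {:ok, 0} or {:pwned, count}.
  def validate(password, checker) when is_function(checker, 1) do
    case checker.(password) do
      {:ok, 0} ->
        :ok

      {:pwned, count} ->
        {:error, "this password has been seen #{count} times in breach data"}
    end
  end
end

# A stub standing in for Pwnex.pwned?/1.
stub = fn
  "password" -> {:pwned, 3_533_661}
  _other -> {:ok, 0}
end

IO.inspect(PasswordPolicySketch.validate("password", stub))
IO.inspect(PasswordPolicySketch.validate("123!@#asd*&(*123SAkjhda", stub))
```

In a real project you would pass &Pwnex.pwned?/1 instead of the stub. Taking the checker as an argument keeps the policy easy to test without hitting the API.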
That’s basically it. We have Pwnex working - it takes our input as a function argument, talks to an API through an Erlang HTTP client, parses its response body, splits it into a list of hash tails and finds any pwns for the given password.
As you can see, this whole package does quite a bit in 65 lines of code.
In this article, we saw how we can create a new package and use it to communicate with an API over HTTP. We learned why you don’t always need HTTPoison, how you can parse a response body and how you can mingle with some data.
If you would like to see the actual code that we wrote in this article, head over to its repo on GitHub.