OTP in Elixir: Learn GenServer by Building Your Own URL Shortener
Looking at any programming language you will (hopefully!) find a rich and useful standard library. I started my professional career as a software developer with Ruby, which has quite an easy-to-use and well-documented standard library with a plethora of modules and classes to use. Personally, I find the Enumerable module in Ruby with all its nice methods simply brilliant.
You might be coming from a different language, but I am sure that any serious programming language out there has a set of classes, modules, methods and functions to make your life (and work) easy.
So, what about Elixir’s stdlib?
No surprise there either – Elixir has a well-documented and easy-to-use standard library. But, because it works on top of the BEAM virtual machine and inherits plenty from Erlang’s rich history, it also has a bit more – something called OTP.
Meet OTP 👋 #
From Wikipedia’s article on OTP:
OTP is a collection of useful middleware, libraries, and tools written in the Erlang programming language. It is an integral part of the open-source distribution of Erlang. The name OTP was originally an acronym for Open Telecom Platform, which was a branding attempt before Ericsson released Erlang/OTP as open source. However, neither Erlang nor OTP is specific to telecom applications.
In continuation, it states:
It (OTP) contains:
- an Erlang interpreter (called BEAM);
- an Erlang compiler;
- a protocol for communication between servers (nodes);
- a CORBA Object Request Broker;
- a static analysis tool called Dialyzer;
- a distributed database server (Mnesia); and
- many other libraries.
While I do not consider myself an expert in Elixir, Erlang, the BEAM or OTP by any stretch of the imagination, I would like to take you on a journey to one of the most useful and well-known behaviours of OTP – GenServer.
In continuation, we will use BEAM processes, so if you’re not familiar with spawning new processes, sending and receiving messages to/from them, then it’s best to head over to “Understanding the basics of Elixir’s concurrency model” and give it a quick read. It will help you understand processes and concurrency in Elixir so you can apply the knowledge in the project that we work on in this article. I promise.
Shortening a link ✂️ #
Let’s write a URL shortener module that will run in a BEAM process and can receive multiple commands:
- shorten – takes a link, shortens it and returns the short link as a response
- get – takes a short link and returns the original one
- flush – erases the URL shortener memory
- stop – stops the process
defmodule URLShortener do
  def start do
    spawn(__MODULE__, :loop, [%{}])
  end

  def loop(state) do
    receive do
      {:stop, caller} ->
        # No recursive call here, so the process exits after replying
        send caller, "Shutting down."

      {:shorten, url, caller} ->
        url_md5 = md5(url)
        new_state = Map.put(state, url_md5, url)
        send caller, url_md5
        loop(new_state)

      {:get, md5, caller} ->
        # Map.get/2 returns the bare URL (or nil for an unknown digest)
        send caller, Map.get(state, md5)
        loop(state)

      :flush ->
        loop(%{})

      _ ->
        loop(state)
    end
  end

  defp md5(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end
end
When the process starts, it calls the URLShortener.loop/1 function, which keeps calling itself recursively after every handled message until it receives the {:stop, caller} message.
If we zoom into the {:shorten, url, caller} case, we notice that we first generate an MD5 digest from the URL. We then store it in the state map, with the MD5 as the key and the actual URL as the value. Since maps are immutable, this creates a new map (called new_state). The state map will look like:
%{
"99999ebcfdb78df077ad2727fd00969f" => "https://google.com",
"76100d6f27db53fddb6c8fce320f5d21" => "https://elixir-lang.org",
"3097fca9b1ec8942c4305e550ef1b50a" => "https://github.com",
...
}
Then, we send the MD5 value back to the caller. Obviously, this is not how bit.ly or the likes work, as their links are much shorter. (For those interested, here’s an interesting discussion on the topic). However, for the purpose of this article, we’ll stick to a simple MD5 digest of the URL.
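To see the digest-and-store step in isolation, here is a minimal sketch that runs outside of any process. The DigestDemo module and its put/2 function are hypothetical names used only for this illustration; the hashing calls are the same ones used in URLShortener.

```elixir
defmodule DigestDemo do
  # Hypothetical helper module, used only for this illustration.
  def md5(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end

  # Returns the digest together with a NEW map; `state` itself is not mutated.
  def put(state, url) do
    digest = md5(url)
    {digest, Map.put(state, digest, url)}
  end
end

state = %{}
{digest, new_state} = DigestDemo.put(state, "https://google.com")

IO.inspect(byte_size(digest))             # an MD5 hex digest is 32 characters long
IO.inspect(Map.fetch!(new_state, digest)) # the URL stored under the digest
IO.inspect(map_size(state))               # the original map is still empty
```

This is why loop/1 has to be called with new_state: the old state map never changes.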
The other two commands, get and flush, are pretty simple. get returns only a single value from the state map, while flush invokes loop/1 with an empty map, effectively removing all the shortened links from the process' state (memory).
Let’s run our shortener in an IEx session:
iex(22)> shortener = URLShortener.start
#PID<0.141.0>
iex(23)> send shortener, {:shorten, "https://ieftimov.com", self()}
{:shorten, "https://ieftimov.com", #PID<0.102.0>}
iex(24)> send shortener, {:shorten, "https://google.com", self()}
{:shorten, "https://google.com", #PID<0.102.0>}
iex(25)> send shortener, {:shorten, "https://github.com", self()}
{:shorten, "https://github.com", #PID<0.102.0>}
iex(26)> flush
"8c4c7fbc57b08d379da5b1312690be04"
"99999ebcfdb78df077ad2727fd00969f"
"3097fca9b1ec8942c4305e550ef1b50a"
:ok
iex(27)> send shortener, {:get, "99999ebcfdb78df077ad2727fd00969f", self()}
{:get, "99999ebcfdb78df077ad2727fd00969f", #PID<0.102.0>}
iex(28)> flush
"https://google.com"
:ok
iex(29)> send shortener, {:get, "8c4c7fbc57b08d379da5b1312690be04", self()}
{:get, "8c4c7fbc57b08d379da5b1312690be04", #PID<0.102.0>}
iex(30)> flush
"https://ieftimov.com"
:ok
iex(31)> send shortener, {:get, "3097fca9b1ec8942c4305e550ef1b50a", self()}
{:get, "3097fca9b1ec8942c4305e550ef1b50a", #PID<0.102.0>}
iex(32)> flush
"https://github.com"
:ok
Working as expected – we send three different URLs for shortening, we receive their MD5 digests back in the process mailbox and when we query for them we get each of them back.
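The send-then-flush routine from the session can also be wrapped in plain functions, which is roughly the kind of plumbing a client API hides. The sketch below is only an illustration: ShortenerClient, await_reply/1 and the one-second default timeout are assumptions, not part of the original module, and the stand-in server exists only so the example runs on its own.

```elixir
defmodule ShortenerClient do
  # Hypothetical wrapper around the message protocol of the hand-rolled server.
  def shorten(server, url, timeout \\ 1_000) do
    send(server, {:shorten, url, self()})
    await_reply(timeout)
  end

  def get(server, short, timeout \\ 1_000) do
    send(server, {:get, short, self()})
    await_reply(timeout)
  end

  # Naive on purpose: it accepts ANY message that lands in the caller's mailbox.
  defp await_reply(timeout) do
    receive do
      reply -> {:ok, reply}
    after
      timeout -> {:error, :timeout}
    end
  end
end

# A stand-in server that answers a single :shorten request and then exits.
server =
  spawn(fn ->
    receive do
      {:shorten, url, caller} -> send(caller, "short-for-" <> url)
    end
  end)

IO.inspect(ShortenerClient.shorten(server, "https://elixir-lang.org"))
# The stand-in is gone by now, so this request is never answered.
IO.inspect(ShortenerClient.get(server, "anything", 100))
```

Even this small helper has to deal with timeouts and cannot tell one reply from another, which hints at how much work a complete solution would take.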
Although our URLShortener module works pretty neatly now, it actually lacks quite a bit of functionality. Sure, it does handle the happy path really well, but when it comes to error handling, tracing or error reporting it falls really short. Additionally, it does not have a standard interface to add more functions to the process – we sort of came up with it as we went on.
After reading all of that you’re probably thinking there is a better way to do this. And you’d be right to think so – let’s learn more about GenServers.
Enter GenServer 🚪 #
GenServer is an OTP behaviour. Behaviour in this context refers to three things:
- an interface, which is a set of functions;
- an implementation, which is the application-specific code, and
- the container, which is a BEAM process
This means that a module can expose a group of functions (the interface), which under the hood trigger callback functions (specific to the behaviour you work with) that run within a BEAM process.
For example, GenServer is a generic server behaviour – for each of the functions defined in its interface, it expects a set of callbacks which will handle the requests to the server. This means that the interface functions will be used by the clients of the generic server, a.k.a. the client API, while the callbacks defined will essentially be the server internals (“the backend”).
So, how does a GenServer work? Well, as you can imagine we cannot go too deep on the hows of GenServer, but we need to get a good grasp on some basics:
- Server start & state
- Asynchronous messages
- Synchronous messages
Server start & state #
Just like the URLShortener we implemented, every GenServer is capable of holding state. In fact, GenServers must implement an init/1 function, which will set the initial state of the server (see the init/1 documentation here for more details).
To start the server we can run:
GenServer.start_link(__MODULE__, :ok, [])
GenServer.start_link/3 will invoke the init/1 function of the __MODULE__, passing in :ok as an argument to init/1. This function call will block until init/1 returns, so usually in this function, we do any required setup of the server process (that might be needed). For example, in our case, to rebuild URLShortener using a GenServer behaviour, we will need an init/1 function to set the initial state (empty map) of the server:
def init(:ok) do
  {:ok, %{}}
end
That’s all. start_link/3 will call init/1 with the :ok argument, which will return the {:ok, %{}} tuple and set the state of the process to an empty map.
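The argument passed to init/1 does not have to be :ok. As a sketch, a hypothetical SeededShortener (not part of this article's module) could accept a map of pre-existing links and use it as the initial state. :sys.get_state/1 is used here purely to peek at the state for demonstration.

```elixir
defmodule SeededShortener do
  use GenServer

  # Hypothetical variant: the second argument of GenServer.start_link/3
  # is handed to init/1 unchanged.
  def start_link(initial_links) when is_map(initial_links) do
    GenServer.start_link(__MODULE__, initial_links, [])
  end

  @impl true
  def init(initial_links) do
    # Whatever follows :ok in the returned tuple becomes the server state.
    {:ok, initial_links}
  end
end

# start_link/1 returns only after init/1 has finished.
{:ok, pid} = SeededShortener.start_link(%{"abc" => "https://elixir-lang.org"})

# Debugging helper that returns the current state of the process.
IO.inspect(:sys.get_state(pid))
```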
Sync & async messages 📨 #
Like most servers out there, GenServers can also receive and reply to requests (if needed). As the heading suggests, there are two types of requests that GenServers handle – the ones that expect a response (call) and the ones that don’t (cast). Therefore, GenServers define two callback functions - handle_call/3 and handle_cast/2.
We will look at these functions in more depth a bit later.
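As a first taste of the difference, here is a minimal sketch with a hypothetical Notes server (unrelated to the shortener). A cast changes the state and returns :ok right away, while a call blocks until the server replies.

```elixir
defmodule Notes do
  use GenServer

  @impl true
  def init(:ok), do: {:ok, []}

  # Synchronous: the caller blocks until this reply arrives.
  @impl true
  def handle_call(:all, _from, notes), do: {:reply, Enum.reverse(notes), notes}

  # Asynchronous: no reply is sent, only the state changes.
  @impl true
  def handle_cast({:add, note}, notes), do: {:noreply, [note | notes]}
end

{:ok, pid} = GenServer.start_link(Notes, :ok)

# cast/2 returns :ok immediately, without waiting for the server.
:ok = GenServer.cast(pid, {:add, "first"})
:ok = GenServer.cast(pid, {:add, "second"})

# Messages from one process are handled in order, so both notes are stored
# by the time the call is processed.
IO.inspect(GenServer.call(pid, :all))
```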
Reimplementing URLShortener, using GenServer ♻️ #
Let’s look at how we can flip the implementation to use GenServer.
First, let’s add the shell of the module, the start_link/1 function and the init/1 function that start_link/1 will invoke:
defmodule URLShortener do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  def init(:ok) do
    {:ok, %{}}
  end
end
The notable changes here are the use of the GenServer behaviour in the module, and the start_link/1 function, which invokes GenServer.start_link/3, which in turn calls the init/1 function with the :ok atom as an argument. Also, it’s worth noting that the empty map that the init/1 function returns in the tuple is the actual initial state of the URLShortener process.
Let’s give it a spin in IEx:
iex(1)> {:ok, pid} = URLShortener.start_link
{:ok, #PID<0.108.0>}
That’s all we can do at this moment. The difference here is that the GenServer.start_link/3 function will return a tuple with an atom (:ok) and the PID of the server.
Stopping the server ✋ #
Let’s add the stop command:
defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  def stop(pid) do
    GenServer.cast(pid, :stop)
  end

  # GenServer callbacks
  def init(:ok), do: {:ok, %{}}

  def handle_cast(:stop, state) do
    {:stop, :normal, state}
  end
end
Yes, I know I said we’ll add one command but ended up adding two functions: stop/1 and handle_cast/2. Bear with me now:
Because we do not want to get a response back on the stop command, we will use GenServer.cast/2 in the stop/1 function. This means that when that command is called by the client (user) of the server, the handle_cast/2 callback will be triggered on the server. In our case, the handle_cast/2 function will return a tuple of three items – {:stop, :normal, state}.
Returning this tuple stops the loop, and another callback called terminate/2 is called (which is defined in the behaviour but not implemented by URLShortener) with the reason :normal and the state state. The process will then exit with reason :normal.
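To observe terminate/2 being invoked, here is a small sketch with a hypothetical StoppableServer. Keeping the PID of an observer process as the state is an assumption made only so that the example can report the shutdown back to us.

```elixir
defmodule StoppableServer do
  use GenServer

  # The state is the PID of a process to notify on shutdown
  # (an assumption made only for this demonstration).
  @impl true
  def init(observer), do: {:ok, observer}

  @impl true
  def handle_cast(:stop, observer), do: {:stop, :normal, observer}

  # Invoked with the reason and state taken from the {:stop, reason, state} tuple.
  @impl true
  def terminate(reason, observer) do
    send(observer, {:terminated, reason})
    :ok
  end
end

{:ok, pid} = GenServer.start_link(StoppableServer, self())
GenServer.cast(pid, :stop)

result =
  receive do
    {:terminated, reason} -> reason
  after
    1_000 -> :no_message
  end

IO.inspect(result)
```

Because the exit reason is :normal, the linked process that started the server is not taken down with it.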
This way of working with GenServer allows us to only define callbacks and the GenServer behaviour will know how to handle the rest. The only complexity resides in the fact that we need to understand and know most types of returns that the callback functions can have.
Another thing worth pointing out is that each function that will be used by the client will take a PID as a first argument. This will allow us to send messages to the correct GenServer process. Going forward we will not acknowledge the PID’s presence – we accept that it’s mandatory for our URLShortener to work. Later we will look at ways we can skip passing the PIDs as arguments.
Let’s jump back in IEx and start and stop a URLShortener server:
iex(1)> {:ok, pid} = URLShortener.start_link
{:ok, #PID<0.109.0>}
iex(2)> Process.alive?(pid)
true
iex(3)> URLShortener.stop(pid)
:ok
iex(4)> Process.alive?(pid)
false
That’s starting and stopping in all of its glory.
Shortening a URL #
Another thing we wanted our server to have is the ability to shorten URLs, by using their MD5 digest as the short variant of the URL. Let’s do that using GenServer:
defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  def stop(pid), do: GenServer.cast(pid, :stop)

  def shorten(pid, url) do
    GenServer.call(pid, {:shorten, url})
  end

  # GenServer callbacks
  def init(:ok), do: {:ok, %{}}

  def handle_cast(:stop, state), do: {:stop, :normal, state}

  def handle_call({:shorten, url}, _from, state) do
    short = md5(url)
    {:reply, short, Map.put(state, short, url)}
  end

  defp md5(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end
end
Three functions this time, but at least the md5/1 is a replica of the one we had previously. So, let’s look at the other two.
You might be seeing a pattern - we have a function that will be used by the client (shorten/2) and a callback that will be invoked on the server (handle_call/3). This time, there’s a slight difference in the functions used and naming: in shorten/2 we call GenServer.call/2 instead of cast/2, and the callback name is handle_call/3 instead of handle_cast/2.
Why? Well, the difference lies in the response - handle_call/3 will send a reply back to the client (hence the :reply atom in the response tuple), while handle_cast/2 does not do that. Basically casting is an async call where the client does not expect a response, while calling is a sync call where the response is expected.
So, let’s look at the structure of the handle_call/3 callback.
It takes three arguments: the request from the client (in our case a tuple), a tuple describing the client of the request (which we ignore), and the state of the server (in our case a map).
As a response, it returns a tuple with :reply, stating that there will be a reply to the request, the reply itself (in our case the shortened link) and the state which is the state carried over to the next loop of the server.
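The second argument, which URLShortener ignores, can be inspected with a small sketch. WhoCalled is a hypothetical module; it relies on the fact that the from argument is a two-element tuple whose first element is the PID of the caller.

```elixir
defmodule WhoCalled do
  use GenServer

  @impl true
  def init(:ok), do: {:ok, %{}}

  # `from` is a {pid, tag} tuple: the caller's PID plus a tag that
  # identifies this particular request.
  @impl true
  def handle_call(:who_am_i, {caller_pid, _tag}, state) do
    {:reply, caller_pid, state}
  end
end

{:ok, server} = GenServer.start_link(WhoCalled, :ok)
caller = GenServer.call(server, :who_am_i)

# The server saw the PID of the process that made the call.
IO.inspect(caller == self())
```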
Of course, handle_call/3 has a few more intricacies that we will look into later, but you can always check its documentation to learn more.
Fetching a shortened URL 🔗 #
Let’s implement the get command, which, when provided with a short version of the link, will return the full URL:
defmodule URLShortener do
  use GenServer

  # Client API
  # ...

  def get(pid, short_link) do
    GenServer.call(pid, {:get, short_link})
  end

  # GenServer callbacks
  # ...

  def handle_call({:get, short_link}, _from, state) do
    {:reply, Map.get(state, short_link), state}
  end
end
The double-function entry pattern again - we add URLShortener.get/2 and another head of the URLShortener.handle_call/3 function.
The URLShortener.get/2 will call GenServer.call/2 under the hood, which when executed will cause the handle_call/3 callback to fire.
This time, URLShortener.handle_call/3 will take the command (:get) and the short_link as the first argument. Looking inside we see that, again, it’s a short function - it only returns a tuple with :reply (which states that the call will have a reply), a call to Map.get/2, whose return will be the actual response of the call, and the state, so the GenServer process maintains the state in the next loop.
At this moment, we can safely say that we have a good idea of the basics of writing functionality for a module that implements the GenServer behaviour. As you might be thinking, there’s much to explore, but these basics will allow you to create GenServers and experiment.
Before you go on, try to implement two more commands:
- flush – an async call which will erase the state of the server
- count – a sync call returning the number of links in the server’s state
More configurations 🎛 #
If we circle back to URLShortener.start_link/1 and its internals (namely the invocation of GenServer.start_link/3), we will also notice that we can pass options (opts) to the GenServer.start_link/3 function, which default to an empty list ([]).
What are the options that we can add here? By looking at the documentation of GenServer.start_link/3 you’ll notice multiple interesting options:
- :name - used for name registration. This means that instead of identifying a GenServer by PID, we can put a name on it.
- :timeout - sets the server startup timeout (in milliseconds)
- :debug - enables debugging by invoking the corresponding function in the :sys module
- :hibernate_after - sets the time (in milliseconds) after which the server process will go into hibernation automatically until a new request comes in. This is done by utilising :proc_lib.hibernate/3
- :spawn_opt - enables passing more options to the underlying process
Most of these are advanced and are beyond our use-case here. However, there’s one configuration we could use right now: :name.
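For the curious, here is a quick sketch of how two of the other options are passed. TracedServer is a hypothetical module, and the values chosen for :debug and :timeout are arbitrary examples. With debug: [:trace] the :sys module prints the events the server handles to standard output.

```elixir
defmodule TracedServer do
  use GenServer

  @impl true
  def init(:ok), do: {:ok, %{}}

  @impl true
  def handle_call(:ping, _from, state), do: {:reply, :pong, state}
end

# Options are passed as a keyword list in the third argument.
{:ok, pid} =
  GenServer.start_link(TracedServer, :ok, debug: [:trace], timeout: 5_000)

# Handling this call produces trace output in addition to the reply.
reply = GenServer.call(pid, :ping)
IO.inspect(reply)
```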
Naming the server 📢 #
Let’s modify our URLShortener to take a name in its start_link/1 function and test it in IEx. Additionally, since every URLShortener process will have a name, we can refer to the process by name instead of PID - let’s see how that would work in code:
defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(name, opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts ++ [name: name])
  end

  def stop(name), do: GenServer.cast(name, :stop)

  def shorten(name, url), do: GenServer.call(name, {:shorten, url})

  def get(name, short_link) do
    GenServer.call(name, {:get, short_link})
  end

  # GenServer callbacks
  def init(:ok), do: {:ok, %{}}

  def handle_cast(:stop, state), do: {:stop, :normal, state}

  def handle_call({:shorten, url}, _from, state), do: {:reply, md5(url), Map.put(state, md5(url), url)}

  def handle_call({:get, short_link}, _from, state) do
    {:reply, Map.get(state, short_link), state}
  end

  defp md5(url), do: :crypto.hash(:md5, url) |> Base.encode16(case: :lower)
end
That’s all. We added a new argument to URLShortener.start_link/2 and we dropped all usage of PID and replaced it with name.
Let’s take it for a spin in IEx:
iex(1)> {:ok, pid} = URLShortener.start_link(:foo)
{:ok, #PID<0.109.0>}
iex(2)> URLShortener.shorten(:foo, "https://google.com")
"99999ebcfdb78df077ad2727fd00969f"
iex(3)> URLShortener.get(:foo, "99999ebcfdb78df077ad2727fd00969f")
"https://google.com"
iex(4)> URLShortener.stop(:foo)
:ok
iex(5)> Process.alive?(pid)
false
You can see that this is pretty cool - instead of using the PID we added a name, :foo, to the process, which allowed us to refer to it using the name instead of the PID. Obviously, to inspect the BEAM process in any fashion we will still need the PID, but for the client the name does the trick.
This combination of name and PID allows us to have reference to the BEAM process while improving the ease of use for the client.
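If only the name is at hand, the PID behind it can be looked up with Process.whereis/1. The sketch below uses a hypothetical NamedServer module and the made-up name :lookup_demo.

```elixir
defmodule NamedServer do
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, :ok, name: name)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

{:ok, pid} = NamedServer.start_link(:lookup_demo)

# Resolves a locally registered name to its PID.
found = Process.whereis(:lookup_demo)
IO.inspect(found == pid)

# Names that are not registered resolve to nil.
IO.inspect(Process.whereis(:no_such_name))
```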
If we would like to simplify things even more, we can turn the URLShortener into a “singleton” server. Before you freak out - it has none of the drawbacks that the singleton pattern that’s infamous in OO programming has. We’re merely stating that we could change the URLShortener to have one and only one process running at a certain time, by setting a static name to it:
defmodule URLShortener do
  use GenServer

  @name :url_shortener_server

  # Client API
  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts ++ [name: @name])
  end

  def stop, do: GenServer.cast(@name, :stop)

  def shorten(url), do: GenServer.call(@name, {:shorten, url})

  def get(short_link) do
    GenServer.call(@name, {:get, short_link})
  end

  # GenServer callbacks
  # ...
end
You can notice that we added a module attribute @name that holds the name of the process. In all the functions from the client API, we dropped the name from the arguments lists and we simply use @name as a reference to the process. This means that there’s going to be only one process for URLShortener with the name :url_shortener_server.
Let’s take it for a spin in IEx:
iex(1)> {:ok, pid} = URLShortener.start_link
{:ok, #PID<0.108.0>}
iex(2)> URLShortener.shorten("https://google.com")
"99999ebcfdb78df077ad2727fd00969f"
iex(3)> URLShortener.shorten("https://yahoo.com")
"c88f320dec138ba5ab0a5f990ff082ba"
iex(4)> URLShortener.get("99999ebcfdb78df077ad2727fd00969f")
"https://google.com"
iex(5)> URLShortener.stop
:ok
iex(6)> Process.alive?(pid)
false
You can notice that although we captured the PID on the first line, we do not need it at all - all of the work is done for us by URLShortener.
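The “one and only one process” part is enforced by name registration itself. As a sketch with a hypothetical SingletonServer and the made-up name :singleton_demo, a second start attempt returns an error tuple that carries the PID of the process already running under that name:

```elixir
defmodule SingletonServer do
  use GenServer

  @name :singleton_demo

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts ++ [name: @name])
  end

  @impl true
  def init(:ok), do: {:ok, %{}}
end

{:ok, first} = SingletonServer.start_link()

# The name is already taken, so no second server is started.
second_attempt = SingletonServer.start_link()
IO.inspect(second_attempt)
```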
In this section, you saw how you can utilise names to more easily work with processes. Let’s review our full implementation of the URLShortener module.
Outro 🖖 #
Before we wrap up this long tutorial, let’s have a final look at our new URLShortener module, including the count/1 and flush/1 functions:
defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(name, opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts ++ [name: name])
  end

  def shorten(name, url) do
    GenServer.call(name, {:shorten, url})
  end

  def get(name, short) do
    GenServer.call(name, {:get, short})
  end

  def flush(name) do
    GenServer.cast(name, :flush)
  end

  def stop(name) do
    GenServer.cast(name, :stop)
  end

  def count(name) do
    GenServer.call(name, :count)
  end

  # Callbacks
  def init(:ok) do
    {:ok, %{}}
  end

  def handle_cast(:flush, _state) do
    {:noreply, %{}}
  end

  def handle_cast(:stop, state) do
    {:stop, :normal, state}
  end

  def handle_call({:shorten, url}, _from, state) do
    shortened = md5(url)
    new_state = Map.put(state, shortened, url)
    {:reply, shortened, new_state}
  end

  def handle_call({:get, short}, _from, state) do
    {:reply, Map.get(state, short), state}
  end

  def handle_call(:count, _from, state) do
    count = Map.keys(state) |> Enum.count
    {:reply, count, state}
  end

  defp md5(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end
end
The two callbacks are fairly simple - flush will just return a :noreply tuple and set the state to an empty map. count, on the other hand, will return a :reply tuple with the count of the items in the map, which is simply the number of keys in the state map. That’s all.
While you got to the end of the article, your journey with GenServer does not end here. In fact, it has just started. GenServers and OTP are very powerful tools that you can use to build generic servers that can live in small BEAM processes and have a very generic approach to building functionality (calls and callbacks).
While we did cover a lot of ground here, we didn’t touch on why we named the starting function start_link instead of just start (hint: the supervisors convention), or how we would approach testing a GenServer like URLShortener.
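As a small teaser for the supervisor hint, here is a hedged sketch rather than a full treatment. SupervisedShortener and the name :supervised_shortener are hypothetical. The sketch relies on use GenServer generating a default child_spec/1, which tells a supervisor to start the process by calling start_link/1 with the given argument.

```elixir
defmodule SupervisedShortener do
  use GenServer

  # A supervisor starts its children by calling start_link/1 and links to them.
  def start_link(opts) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  @impl true
  def init(:ok), do: {:ok, %{}}
end

# {module, argument} relies on the child_spec/1 generated by `use GenServer`.
children = [{SupervisedShortener, name: :supervised_shortener}]
{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_one)

# Each entry has the shape {id, pid, type, modules}.
IO.inspect(Supervisor.which_children(sup))
```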
In what kind of scenarios have you used GenServers? Or, if you do not have experience with them, where do you see yourself using them in the future?