OTP in Elixir: Learn GenServer by Building Your Own URL Shortener

Looking at any programming language you will (hopefully!) find a rich and useful standard library. I started my professional career as a software developer with Ruby, which has quite an easy-to-use and well-documented standard library with a plethora of modules and classes to use. Personally, I find the Enumerable module in Ruby with all its nice methods simply brilliant.

You might be coming from a different language, but I am sure that any serious programming language out there has a set of classes, modules, methods and functions to make your life (and work) easy.

So, what about Elixir’s stdlib?

No surprise there either – Elixir has a well-documented and easy-to-use standard library. But, because it runs on top of the BEAM virtual machine and inherits plenty from Erlang’s rich history, it also has a bit more – something called OTP.

Meet OTP 👋

From Wikipedia’s article on OTP:

OTP is a collection of useful middleware, libraries, and tools written in the Erlang programming language. It is an integral part of the open-source distribution of Erlang. The name OTP was originally an acronym for Open Telecom Platform, which was a branding attempt before Ericsson released Erlang/OTP as open source. However, neither Erlang nor OTP is specific to telecom applications.

In continuation, it states:

It (OTP) contains:

  • an Erlang interpreter (called BEAM);
  • an Erlang compiler;
  • a protocol for communication between servers (nodes);
  • a CORBA Object Request Broker;
  • a static analysis tool called Dialyzer;
  • a distributed database server (Mnesia); and
  • many other libraries.

While I do not consider myself an expert in Elixir, Erlang, the BEAM or OTP by any stretch of the imagination, I would like to take you on a journey to one of the most useful and well-known behaviours of OTP – GenServer.

In continuation, we will use BEAM processes, so if you’re not familiar with spawning new processes, sending and receiving messages to/from them, then it’s best to head over to “Understanding the basics of Elixir’s concurrency model” and give it a quick read. It will help you understand processes and concurrency in Elixir so you can apply the knowledge in the project that we work on in this article. I promise.

Let’s write a URL shortener module that will run in a BEAM process and can receive multiple commands:

  • shorten – takes a link, shortens it and returns the short link as a response
  • get – takes a short link and returns the original one
  • flush – erases the URL shortener memory
  • stop – stops the process

defmodule URLShortener do
  def start do
    spawn(__MODULE__, :loop, [%{}])
  end

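  def loop(state) do
    receive do
      {:stop, caller} ->
        # No recursive call here, so the process exits after replying
        send caller, "Shutting down."
      {:shorten, url, caller} ->
        url_md5 = md5(url)
        new_state = Map.put(state, url_md5, url)
        send caller, url_md5
        loop(new_state)
      {:get, short, caller} ->
        # Map.get/2 returns the bare URL (or nil if the digest is unknown)
        send caller, Map.get(state, short)
        loop(state)
      :flush ->
        loop(%{})
      _ ->
        # Ignore unknown messages so they don't pile up in the mailbox
        loop(state)
    end
  end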

  defp md5(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end
end

Once the process starts, it keeps calling the URLShortener.loop/1 function recursively, until it receives the {:stop, caller} message.

If we zoom into the {:shorten, url, caller} case, we notice that we first generate an MD5 digest from the URL. We then store it in the state map, with the digest as the key and the actual URL as the value. Since maps are immutable, Map.put/3 returns a new map (called new_state). The state map will look like:

%{
  "99999ebcfdb78df077ad2727fd00969f" => "https://google.com",
  "76100d6f27db53fddb6c8fce320f5d21" => "https://elixir-lang.org",
  "3097fca9b1ec8942c4305e550ef1b50a" => "https://github.com",
  ...
}

Then, we send the MD5 value back to the caller. Obviously, this is not how bit.ly or the likes work, as their links are much shorter. (For those interested, here’s an interesting discussion on the topic.) However, for the purpose of this article, we’ll stick to a simple MD5 digest of the URL.
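To get a feel for what the digest looks like, here is a minimal sketch that pulls the hashing step out on its own. The DigestDemo module name is hypothetical (it is not part of the shortener); the body is the same :crypto.hash/2 and Base.encode16/2 pipeline as md5/1 above.

```elixir
defmodule DigestDemo do
  # Hypothetical wrapper module: same technique as md5/1 in URLShortener,
  # made public so it is easy to experiment with.
  def md5(url) when is_binary(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end
end

digest = DigestDemo.md5("https://elixir-lang.org")

# An MD5 digest is 16 bytes, so its hex encoding is always 32 characters long.
IO.inspect(String.length(digest))

# Hashing is deterministic: the same URL always maps to the same key.
IO.inspect(digest == DigestDemo.md5("https://elixir-lang.org"))
```

The fixed length is what makes the digest usable as a key, and also why it is a poor "short" link: 32 characters is often longer than the URL itself.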

The other two commands, get and flush, are pretty simple. get returns only a single value from the state map, while flush invokes loop/1 with an empty map, effectively removing all the shortened links from the process’ state (memory).

Let’s run our shortener in an IEx session:

iex(22)> shortener = URLShortener.start
#PID<0.141.0>

iex(23)> send shortener, {:shorten, "https://ieftimov.com", self()}
{:shorten, "https://ieftimov.com", #PID<0.102.0>}

iex(24)> send shortener, {:shorten, "https://google.com", self()}
{:shorten, "https://google.com", #PID<0.102.0>}

iex(25)> send shortener, {:shorten, "https://github.com", self()}
{:shorten, "https://github.com", #PID<0.102.0>}

iex(26)> flush
"8c4c7fbc57b08d379da5b1312690be04"
"99999ebcfdb78df077ad2727fd00969f"
"3097fca9b1ec8942c4305e550ef1b50a"
:ok

iex(27)> send shortener, {:get, "99999ebcfdb78df077ad2727fd00969f", self()}
{:get, "99999ebcfdb78df077ad2727fd00969f", #PID<0.102.0>}

iex(28)> flush
"https://google.com"
:ok

iex(29)> send shortener, {:get, "8c4c7fbc57b08d379da5b1312690be04", self()}
{:get, "8c4c7fbc57b08d379da5b1312690be04", #PID<0.102.0>}

iex(30)> flush
"https://ieftimov.com"
:ok

iex(31)> send shortener, {:get, "3097fca9b1ec8942c4305e550ef1b50a", self()}
{:get, "3097fca9b1ec8942c4305e550ef1b50a", #PID<0.102.0>}

iex(32)> flush
"https://github.com"
:ok

Working as expected – we send three different URLs for shortening, we receive their MD5 digests back in the mailbox of the IEx process, and when we query by digest we get the original URLs back.

Although our URLShortener module works pretty neatly now, it actually lacks quite a bit of functionality. Sure, it does handle the happy path really well, but when it comes to error handling, tracing or error reporting it falls really short. Additionally, it does not have a standard interface to add more functions to the process – we sort of came up with it as we went on.
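To make the "no standard interface" point concrete, here is a rough sketch of the kind of client helper we would have to write by hand to get a blocking request with a timeout. Everything in it is an assumption for illustration: the MiniShortener module, the {pid, ref} reply tagging and the one-second default timeout are not part of the module above.

```elixir
defmodule MiniShortener do
  # Hypothetical, trimmed-down variant of the hand-rolled server.
  def start, do: spawn(__MODULE__, :loop, [%{}])

  # Client helper: send the request, then block until the tagged reply
  # arrives or the timeout (in milliseconds) expires.
  def shorten(server, url, timeout \\ 1_000) do
    ref = make_ref()
    send(server, {:shorten, url, {self(), ref}})

    receive do
      {^ref, short} -> {:ok, short}
    after
      timeout -> {:error, :timeout}
    end
  end

  def loop(state) do
    receive do
      {:shorten, url, {caller, ref}} ->
        short = md5(url)
        # Tag the reply with the caller's reference so it cannot be
        # confused with other messages in the caller's mailbox.
        send(caller, {ref, short})
        loop(Map.put(state, short, url))

      _other ->
        loop(state)
    end
  end

  defp md5(url) do
    :crypto.hash(:md5, url) |> Base.encode16(case: :lower)
  end
end

server = MiniShortener.start()
{:ok, short} = MiniShortener.shorten(server, "https://elixir-lang.org")
IO.inspect(short)
```

We would need a helper like this for every command, and we still would not have error reporting or tracing. That is a lot of plumbing to maintain by hand.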

After reading all of that you’re probably thinking there is a better way to do this. And you’d be right to think so – let’s learn more about GenServers.

Enter GenServer 🚪

GenServer is an OTP behaviour. Behaviour in this context refers to three things:

  • an interface, which is a set of functions;
  • an implementation, which is the application-specific code; and
  • the container, which is a BEAM process.

This means that a module exposes a group of functions to its clients (the interface), backs them with callback functions that are specific to the behaviour it adopts (the implementation), and has all of that code run inside a BEAM process (the container).

For example, GenServer is a generic server behaviour: for the functions in its interface, it expects a set of callbacks that handle the requests sent to the server. The interface functions are used by the clients of the generic server (a.k.a. the client API), while the callbacks are essentially the server internals (“the backend”).
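The interface/implementation split is easier to see on a toy behaviour. The sketch below is not OTP code: Greeter and EnglishGreeter are hypothetical names, and there is no process involved. It only shows how a behaviour declares callbacks with @callback and how a module opts in with @behaviour.

```elixir
defmodule Greeter do
  # The behaviour: declares which callback an implementation must provide.
  @callback greeting(name :: String.t()) :: String.t()

  # The interface: generic code that delegates to whichever
  # implementation module it is given.
  def greet(implementation, name), do: implementation.greeting(name)
end

defmodule EnglishGreeter do
  # The implementation: the application-specific part.
  @behaviour Greeter

  @impl true
  def greeting(name), do: "Hello, #{name}!"
end

IO.puts(Greeter.greet(EnglishGreeter, "Ada"))
```

GenServer follows the same idea, except that the generic part also owns the process and its message loop, and calls our callbacks from inside it.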

So, how does a GenServer work? Well, as you can imagine we cannot go too deep on the hows of GenServer, but we need to get a good grasp on some basics:

  1. Server start & state
  2. Asynchronous messages
  3. Synchronous messages

Server start & state

Just like the URLShortener we implemented, every GenServer is capable of holding state. In fact, GenServers must implement an init/1 function, which sets the initial state of the server (see the init/1 documentation for more details).

To start the server we can run:

GenServer.start_link(__MODULE__, :ok, [])

GenServer.start_link/3 will invoke the init/1 function of __MODULE__, passing in :ok as its argument. The call blocks until init/1 returns, so init/1 is usually where we do any setup that the server process requires. For example, in our case, to rebuild URLShortener using the GenServer behaviour, we will need an init/1 function to set the initial state (an empty map) of the server:

def init(:ok) do
  {:ok, %{}}
end

That’s all. start_link/3 will call init/1 with the :ok argument; init/1 returns {:ok, %{}}, which sets the state of the process to an empty map.
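The second argument of GenServer.start_link/3 does not have to be :ok; whatever we pass ends up as the argument of init/1. Here is a small sketch, assuming a hypothetical SeededStore module that starts with a pre-filled map. It uses :sys.get_state/1, a debugging helper from Erlang's :sys module, to peek at the state.

```elixir
defmodule SeededStore do
  use GenServer

  # Hypothetical example: the initial map is handed over to init/1.
  def start_link(initial) when is_map(initial) do
    GenServer.start_link(__MODULE__, initial, [])
  end

  @impl true
  def init(initial) do
    # Whatever we return as the second element becomes the server state.
    {:ok, initial}
  end
end

{:ok, pid} = SeededStore.start_link(%{"abc" => "https://elixir-lang.org"})

# :sys.get_state/1 is meant for debugging, not for regular client code.
IO.inspect(:sys.get_state(pid))
```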

Sync & async messages 📨

Like most servers out there, GenServers can receive requests and reply to them (if needed). As the heading suggests, there are two types of requests that GenServers handle – the ones that expect a response (call) and the ones that don’t (cast). Therefore, GenServers define two callback functions - handle_call/3 and handle_cast/2.

We will look at these functions in more depth a bit later.
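As a quick preview, here is a minimal sketch of both request types side by side. The Counter module is a hypothetical example and not part of the shortener: increment/1 is a cast (fire and forget), while value/1 is a call (wait for the reply).

```elixir
defmodule Counter do
  use GenServer

  # Client API
  def start_link(initial \\ 0), do: GenServer.start_link(__MODULE__, initial)

  # Asynchronous: returns :ok right away, without waiting for the server.
  def increment(pid), do: GenServer.cast(pid, :increment)

  # Synchronous: blocks until the server replies.
  def value(pid), do: GenServer.call(pid, :value)

  # GenServer callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_cast(:increment, count), do: {:noreply, count + 1}

  @impl true
  def handle_call(:value, _from, count), do: {:reply, count, count}
end

{:ok, pid} = Counter.start_link()
Counter.increment(pid)
Counter.increment(pid)

# Messages from one process to another are handled in the order they were
# sent, so both casts are processed before this call.
IO.inspect(Counter.value(pid))
```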

Reimplementing URLShortener, using GenServer ♻️

Let’s look at how we can flip the implementation to use GenServer.

First, let’s add the shell of the module, the start_link/1 function and the init/1 function that start_link/1 will invoke:

defmodule URLShortener do
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts)
  end

  def init(:ok) do
    {:ok, %{}}
  end
end

The notable changes here are the use GenServer line, which brings the GenServer behaviour into the module, and the start_link/1 function, which invokes GenServer.start_link/3. That call, in turn, calls the init/1 function with the :ok atom as its argument. It’s also worth noting that the empty map that init/1 returns in the tuple is the initial state of the URLShortener process.

Let’s give it a spin in IEx:

iex(1)> {:ok, pid} = URLShortener.start_link
{:ok, #PID<0.108.0>}

That’s all we can do at this moment. The difference from our hand-rolled version is that GenServer.start_link/3 returns a tuple with an atom (:ok) and the PID of the server, rather than a bare PID.

Stopping the server ✋

Let’s add the stop command:

defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  def stop(pid) do
    GenServer.cast(pid, :stop)
  end

  # GenServer callbacks
  def init(:ok), do: {:ok, %{}}

  def handle_cast(:stop, state) do
    {:stop, :normal, state}
  end
end

Yes, I know I said we’ll add one command but ended up adding two functions: stop/1 and handle_cast/2. Bear with me now:

Because we do not want to get a response back on the stop command, we will use GenServer.cast/2 in the stop/1 function. This means that when that command is called by the client (user) of the server, the handle_cast/2 callback will be triggered on the server. In our case, the handle_cast/2 function will return a tuple of three items – {:stop, :normal, state}.

Returning this tuple stops the server loop. Before the process exits, another callback called terminate/2 is invoked with the reason :normal and the current state (URLShortener does not define it, so the default implementation that use GenServer provides is used). The process then exits with reason :normal.
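If you want to see terminate/2 fire, here is a minimal sketch that overrides it. The StoppableServer module and the {:terminated, reason} message are assumptions made for this demo; the state of the server is simply the PID that should be notified.

```elixir
defmodule StoppableServer do
  use GenServer

  # Client API
  def start_link(notify_pid), do: GenServer.start_link(__MODULE__, notify_pid)
  def stop(pid), do: GenServer.cast(pid, :stop)

  # GenServer callbacks
  @impl true
  def init(notify_pid), do: {:ok, notify_pid}

  @impl true
  def handle_cast(:stop, notify_pid), do: {:stop, :normal, notify_pid}

  @impl true
  def terminate(reason, notify_pid) do
    # Invoked because handle_cast/2 returned a :stop tuple.
    send(notify_pid, {:terminated, reason})
    :ok
  end
end

{:ok, pid} = StoppableServer.start_link(self())
StoppableServer.stop(pid)

receive do
  {:terminated, reason} -> IO.inspect(reason)
after
  1_000 -> IO.puts("terminate/2 was not called in time")
end
```

Keep in mind that terminate/2 is a best-effort hook: it runs when a callback asks the server to stop, but not in every way a process can die (a brutal kill skips it, for example).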

This way of working with GenServer allows us to only define callbacks and the GenServer behaviour will know how to handle the rest. The only complexity resides in the fact that we need to understand and know most types of returns that the callback functions can have.

Another thing worth pointing out is that each function that will be used by the client takes a PID as its first argument. This allows us to send messages to the correct GenServer process. Going forward we will not call out the PID’s presence – we accept that it’s mandatory for our URLShortener to work. Later we will look at ways we can skip passing the PID as an argument.

Let’s jump back in IEx and start and stop a URLShortener server:

iex(1)> {:ok, pid} = URLShortener.start_link
{:ok, #PID<0.109.0>}

iex(2)> Process.alive?(pid)
true

iex(3)> URLShortener.stop(pid)
:ok

iex(4)> Process.alive?(pid)
false

That’s starting and stopping in all of its glory.

Shortening a URL

Another thing we wanted our server to have is the ability to shorten URLs, by using their MD5 digest as the short variant of the URL. Let’s do that using GenServer:

defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)
  def stop(pid), do: GenServer.cast(pid, :stop)

  def shorten(pid, url) do
    GenServer.call(pid, {:shorten, url})
  end

  # GenServer callbacks
  def init(:ok), do: {:ok, %{}}
  def handle_cast(:stop, state), do: {:stop, :normal, state}

  def handle_call({:shorten, url}, _from, state) do
    short = md5(url)
    {:reply, short, Map.put(state, short, url)}
  end

  defp md5(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end
end

Three functions this time, but at least the md5/1 is a replica of the one we had previously. So, let’s look at the other two.

You might be seeing a pattern - we have a function that will be used by the client (shorten/2) and a callback that will be invoked on the server (handle_call/3). This time, there’s a slight difference in the functions used and naming: in shorten/2 we call GenServer.call/2 instead of cast/2, and the callback name is handle_call/3 instead of handle_cast/2.

Why? Well, the difference lies in the response - handle_call/3 will send a reply back to the client (hence the :reply atom in the response tuple), while handle_cast/2 does not do that. In short, a cast is asynchronous: the client does not wait for a response. A call is synchronous: the client blocks until the reply arrives.

So, let’s look at the structure of the handle_call/3 callback.

It takes three arguments: the request from the client (in our case a tuple), a tuple describing the client of the request (which we ignore), and the state of the server (in our case a map).

As a response, it returns a tuple with :reply, stating that there will be a reply to the request, the reply itself (in our case the shortened link) and the state which is the state carried over to the next loop of the server.
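The from argument that we ignore is worth a quick look. Here is a small sketch, assuming a hypothetical EchoServer module, that takes the caller's PID out of from and replies explicitly with GenServer.reply/2 instead of returning a :reply tuple. When we reply this way, the callback returns a :noreply tuple.

```elixir
defmodule EchoServer do
  use GenServer

  # Client API
  def start_link, do: GenServer.start_link(__MODULE__, :ok)
  def echo(pid, message), do: GenServer.call(pid, {:echo, message})

  # GenServer callbacks
  @impl true
  def init(:ok), do: {:ok, nil}

  @impl true
  def handle_call({:echo, message}, from, state) do
    # `from` is a two-element tuple: the caller's PID and a tag that
    # identifies this particular request.
    {caller_pid, _tag} = from

    # Replying by hand is equivalent to returning {:reply, value, state}.
    GenServer.reply(from, {caller_pid, message})
    {:noreply, state}
  end
end

{:ok, pid} = EchoServer.start_link()
{caller, "hello"} = EchoServer.echo(pid, "hello")

# The PID inside `from` is the process that made the call, i.e. us.
IO.inspect(caller == self())
```

For our shortener the plain :reply tuple is all we need; replying by hand is mostly useful when the answer is only available later.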

Of course, handle_call/3 has a few more intricacies that we will look into later, but you can always check its documentation to learn more.

Fetching a shortened URL 🔗

Let’s implement the get command, which, when provided with the short version of a link, returns the full URL:

defmodule URLShortener do
  use GenServer

  # Client API
  # ...

  def get(pid, short_link) do
    GenServer.call(pid, {:get, short_link})
  end

  # GenServer callbacks
  # ...

  def handle_call({:get, short_link}, _from, state) do
    {:reply, Map.get(state, short_link), state}
  end
end

The double-function entry pattern again - we add URLShortener.get/2 and another head of the URLShortener.handle_call/3 function.

The URLShortener.get/2 will call GenServer.call/2 under the hood, which when executed will cause the handle_call/3 callback to fire.

This time, URLShortener.handle_call/3 takes a tuple with the command (:get) and the short_link as its first argument. Looking inside we see that, again, it’s a short function - it only returns a tuple with :reply (which states that the call will have a reply), a call to Map.get/2, whose return value will be the actual response of the call, and the state, so the GenServer process keeps the same state in the next loop.

At this moment, we can safely say that we have a good idea of the basics of writing functionality for a module that implements the GenServer behaviour. As you might be thinking, there’s much more to explore, but these basics will allow you to create GenServers and experiment.

Before you go on, try to implement two more commands:

  • flush – an async call which will erase the state of the server
  • count – a sync call returning the number of links in the server’s state

More configurations 🎛

If we circle back to URLShortener.start_link/1 and its internals (namely the invocation of GenServer.start_link/3), we will also notice that we can pass options (opts) to the GenServer.start_link/3 function, which default to an empty list ([]).

What are the options that we can add here? By looking at the documentation of GenServer.start_link/3 you’ll notice multiple interesting options:

  • :name - used for name registration. This means that instead of identifying a GenServer by PID, we can put a name on it.
  • :timeout - sets the server startup timeout (in milliseconds)
  • :debug - enables debugging by invoking the corresponding function in the :sys module
  • :hibernate_after - sets the time (in milliseconds) after which the server process will go into hibernation automatically until a new request comes in. This is done by utilising :proc_lib.hibernate/3
  • :spawn_opt - enables passing more options to the underlying process

Most of these are advanced and are beyond our use-case here. However, there’s one configuration we could use right now: :name.
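Even though we won't need them here, passing these options is straightforward, because start_link/1 forwards opts untouched. A minimal sketch, assuming a hypothetical OptionDemo module and arbitrary values for :timeout and :hibernate_after:

```elixir
defmodule OptionDemo do
  use GenServer

  # Same shape as URLShortener.start_link/1: opts go straight through.
  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

# Allow 2 seconds for startup; hibernate after 5 idle seconds.
# Both values are arbitrary and only here for illustration.
{:ok, pid} = OptionDemo.start_link(timeout: 2_000, hibernate_after: 5_000)
IO.inspect(Process.alive?(pid))
```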

Naming the server 📢

Let’s modify our URLShortener to take a name in its start_link function and test it in IEx. Additionally, since every URLShortener process will have a name, we can refer to the process by name instead of PID - let’s see how that would work in code:

defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(name, opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts ++ [name: name])
  end

  def stop(name), do: GenServer.cast(name, :stop)
  def shorten(name, url), do: GenServer.call(name, {:shorten, url})

  def get(name, short_link) do
    GenServer.call(name, {:get, short_link})
  end

  # GenServer callbacks
  def init(:ok), do: {:ok, %{}}
  def handle_cast(:stop, state), do: {:stop, :normal, state}
  def handle_call({:shorten, url}, _from, state) do
    short = md5(url)
    {:reply, short, Map.put(state, short, url)}
  end

  def handle_call({:get, short_link}, _from, state) do
    {:reply, Map.get(state, short_link), state}
  end

  defp md5(url), do: :crypto.hash(:md5, url) |> Base.encode16(case: :lower)
end

That’s all. We added a new argument to URLShortener.start_link/2 and we dropped all usage of PID and replaced it with name.

Let’s take it for a spin in IEx:

iex(1)> {:ok, pid} = URLShortener.start_link(:foo)
{:ok, #PID<0.109.0>}

iex(2)> URLShortener.shorten(:foo, "https://google.com")
"99999ebcfdb78df077ad2727fd00969f"

iex(3)> URLShortener.get(:foo, "99999ebcfdb78df077ad2727fd00969f")
"https://google.com"

iex(4)> URLShortener.stop(:foo)
:ok

iex(5)> Process.alive?(pid)
false

This is pretty cool: instead of passing the PID around, we registered the process under the name :foo and referred to it by that name. Some functions, such as Process.alive?/1, still need the PID (which is why we kept it above), but for the client the name does the trick.

This combination of name and PID allows us to have reference to the BEAM process while improving the ease of use for the client.
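If we only have the name, we can still get to the PID: Process.whereis/1 looks up a locally registered name and returns the PID, or nil if nothing is registered under it. A minimal sketch, assuming a hypothetical NamedDemo module:

```elixir
defmodule NamedDemo do
  use GenServer

  def start_link(name), do: GenServer.start_link(__MODULE__, :ok, name: name)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

{:ok, pid} = NamedDemo.start_link(:named_demo)

# Look the PID up by its registered name.
found = Process.whereis(:named_demo)
IO.inspect(found == pid)

# With the PID in hand, PID-only functions work again.
IO.inspect(Process.alive?(found))

# Unknown names simply return nil.
IO.inspect(Process.whereis(:no_such_name))
```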

If we would like to simplify things even more, we can turn the URLShortener into a “singleton” server. Before you freak out: it has none of the drawbacks of the singleton pattern that is infamous in OO programming. We’re merely saying that we could limit URLShortener to one and only one running process at a time, by giving it a static name:

defmodule URLShortener do
  use GenServer

  @name :url_shortener_server

  # Client API
  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts ++ [name: @name])
  end

  def stop, do: GenServer.cast(@name, :stop)
  def shorten(url), do: GenServer.call(@name, {:shorten, url})

  def get(short_link) do
    GenServer.call(@name, {:get, short_link})
  end

  # GenServer callbacks
  # ...
end

Notice that we added a module attribute @name that holds the name of the process. In all the functions of the client API, we dropped the name from the argument lists and simply use @name as a reference to the process. This means that there’s going to be only one process for URLShortener, with the name :url_shortener_server.

Let’s take it for a spin in IEx:

iex(1)> {:ok, pid} = URLShortener.start_link
{:ok, #PID<0.108.0>}

iex(2)> URLShortener.shorten("https://google.com")
"99999ebcfdb78df077ad2727fd00969f"

iex(3)> URLShortener.shorten("https://yahoo.com")
"c88f320dec138ba5ab0a5f990ff082ba"

iex(4)> URLShortener.get("99999ebcfdb78df077ad2727fd00969f")
"https://google.com"

iex(5)> URLShortener.stop
:ok

iex(6)> Process.alive?(pid)
false

You can notice that although we captured the PID on the first line, we do not need it at all - all of the work is done for us by URLShortener.
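What enforces the "only one process" part is name registration: a name can be held by one process at a time, so a second start_link fails with an :already_started error. A minimal sketch, assuming a hypothetical SingletonDemo module built like the one above:

```elixir
defmodule SingletonDemo do
  use GenServer

  @name :singleton_demo

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, Keyword.put(opts, :name, @name))
  end

  @impl true
  def init(:ok), do: {:ok, %{}}
end

{:ok, pid} = SingletonDemo.start_link()

# The name is taken, so the second attempt returns an error tuple that
# carries the PID of the process already running under that name.
{:error, {:already_started, running_pid}} = SingletonDemo.start_link()
IO.inspect(running_pid == pid)
```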

In this section, you saw how you can utilise names to more easily work with processes. Let’s review our full implementation of the URLShortener module.

Outro 🖖

Before we wrap up this long tutorial, let’s have a final look at our new URLShortener module, including the count/1 and flush/1 functions:

defmodule URLShortener do
  use GenServer

  # Client API
  def start_link(name, opts \\ []) do
    GenServer.start_link(__MODULE__, :ok, opts ++ [name: name])
  end

  def shorten(name, url) do
    GenServer.call(name, {:shorten, url})
  end

  def get(name, short) do
    GenServer.call(name, {:get, short})
  end

  def flush(name) do
    GenServer.cast(name, :flush)
  end

  def stop(name) do
    GenServer.cast(name, :stop)
  end

  def count(name) do
    GenServer.call(name, :count)
  end

  # Callbacks
  def init(:ok) do
    {:ok, %{}}
  end

  def handle_cast(:flush, _state) do
    {:noreply, %{}}
  end

  def handle_cast(:stop, state) do
    {:stop, :normal, state}
  end

  def handle_call({:shorten, url}, _from, state) do
    shortened = md5(url)
    new_state = Map.put(state, shortened, url)
    {:reply, shortened, new_state}
  end

  def handle_call({:get, short}, _from, state) do
    {:reply, Map.get(state, short), state}
  end

  def handle_call(:count, _from, state) do
    count = map_size(state)
    {:reply, count, state}
  end

  defp md5(url) do
    :crypto.hash(:md5, url)
    |> Base.encode16(case: :lower)
  end
end

The two new callbacks are fairly simple - flush just returns a :noreply tuple and sets the state to an empty map. count, on the other hand, replies with the number of items in the map, which is simply the number of keys in the state map. That’s all.

Although you’ve reached the end of the article, your journey with GenServer does not end here. In fact, it has just started. GenServers and OTP are very powerful tools that you can use to build generic servers that live in small BEAM processes and share a very generic approach to building functionality (calls and callbacks).

While we did cover a lot of ground here, we didn’t touch on why we named the starting function start_link instead of just start (hint: it’s a supervisor convention), or how we would approach testing a GenServer like URLShortener.
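As a small teaser for the supervisor hint, here is a rough sketch of a GenServer running under a supervisor. It is an illustration under assumptions, not a recommended setup: the SupervisedShortener module, the :supervised_shortener name and the 100 ms pause are all made up for this demo.

```elixir
defmodule SupervisedShortener do
  use GenServer

  # `use GenServer` also generates child_spec/1, which tells the supervisor
  # to call start_link/1 with the argument from the child tuple below.
  def start_link(opts), do: GenServer.start_link(__MODULE__, :ok, opts)

  @impl true
  def init(:ok), do: {:ok, %{}}
end

children = [{SupervisedShortener, [name: :supervised_shortener]}]
{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)

old_pid = Process.whereis(:supervised_shortener)

# Kill the server and wait until it is really gone.
ref = Process.monitor(old_pid)
Process.exit(old_pid, :kill)

receive do
  {:DOWN, ^ref, :process, ^old_pid, _reason} -> :ok
end

# Give the supervisor a moment to start the replacement, then look the
# name up again: it should now point to a different process.
Process.sleep(100)
IO.inspect(Process.whereis(:supervised_shortener))
```

The supervisor starts its children through start_link so that it is linked to them and gets notified when they exit, which is how it knows to restart them.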

In what kind of scenarios have you used GenServers? Or, if you do not have experience with it, where do you see yourself using it in the future?

