Docker Compose, stray PID files, Rails and beyond
Docker Compose + local development = ❤️ #
If you’re like me, you like a simple Docker Compose setup for your local projects. It makes bringing up and tearing down Docker containers easy with a single command. Especially if your application requires additional backing services, like a data store, a Docker Compose setup makes it very easy to spin them all up in one go.
Another reason why I like Docker Compose is that I am a bit of a purist when it comes to Dockerfiles – I believe there should be only one Dockerfile for your service, and it should work for all environments. Nope, none of the Dockerfile.dev and Dockerfile.prod shenanigans. Why? Because I want my services to require only one set of dependencies, regardless of the deployment environment.
Following that idea, Docker Compose uses the project’s Dockerfile and orchestrates all backing services via a simple YAML file. And the cherry on top: I can check docker-compose.yml into version control, and everyone in my organization can reap the benefits.
OK, OK, you know I like Docker Compose for local development. But sometimes it can be less than pleasant. For example, it may refuse to start a process or a server in a container because a PID file was never cleaned up.
When running docker-compose up, you might have seen one of these error messages (or a variation):
httpd (pid 1) already running
ERROR: Pidfile (celerybeat.pid) already exists.
A server is already running. Check /usr/src/app/tmp/pids/server.pid.
Sound familiar? Let’s work with a Rails application to reproduce the issue and fix it in four different ways.
Dockerizing the app #
Let’s look at the Dockerfile of a simple Rails application that uses no backing services. The Dockerfile will use the ruby:3.0.2-alpine3.14 image as its base:
FROM ruby:3.0.2-alpine3.14
WORKDIR /app
# Install runtime dependencies
RUN apk add --no-cache \
shared-mime-info \
tzdata \
sqlite-libs
# Install the project dependencies
COPY Gemfile* /app
# Bundle build dependencies
RUN apk add --no-cache --virtual build-dependencies \
build-base \
sqlite-dev \
&& bundle install \
&& rm -rf /usr/local/bundle/cache/*.gem \
&& find /usr/local/bundle/gems/ -name "*.[co]" -delete \
&& apk del --no-network build-dependencies
COPY . .
ENTRYPOINT ["/bin/sh", "-c"]
CMD ["bundle exec rails s -b 0.0.0.0 -p $SERVER_PORT"]
In the Dockerfile, we first set the working directory (WORKDIR) and then install the runtime dependencies. Next, we copy the Gemfile and Gemfile.lock files to download and compile the bundle in the following step. But before we run bundle install, we add some build tooling so the gems with native extensions can compile, and we remove it again once the bundle is installed to keep the image small.
Lastly, we COPY all project files to the image and run the Rails server using the rails s command. We COPY all files because the project files have to be inside the container for the application to run. If we fail to copy a required file, the application will fail to boot, rendering our Docker image useless.
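Because we COPY the entire project, it also helps to keep the build context clean. The build output below shows Docker loading a .dockerignore; as a suggestion (these entries are mine, not from the original project), a minimal one could look like this – excluding tmp/pids/* also keeps a stale server.pid on the host from being baked into the image:
.git
.env
log/*
tmp/pids/*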
To run this container, we first need to build its image:
$ docker build -t jarjar:latest .
[+] Building 14.9s (11/11) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 622B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/ruby:3.0.2-alpine3.14 0.6s
=> [1/6] FROM docker.io/library/ruby:3.0.2-alpine3.14@sha256:5bb06d7e3853903b9e9480b647b2d99ca289f9511 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 18.47kB 0.0s
=> CACHED [2/6] WORKDIR /app 0.0s
=> CACHED [3/6] RUN apk add --no-cache shared-mime-info tzdata sqlite-libs 0.0s
=> CACHED [4/6] COPY Gemfile* /app 0.0s
=> [5/6] RUN apk add --no-cache --virtual build-dependencies build-base sqlite-dev && bundle in 13.6s
=> [6/6] COPY . . 0.0s
=> exporting to image 0.5s
=> => exporting layers 0.5s
=> => writing image sha256:c24befc2a0ce11f5550672e4f7aa64a5f6d8169609d173c35d039bf459bce757 0.0s
=> => naming to docker.io/library/jarjar:latest 0.0s
After the image is built, we can create a container:
$ docker run --env SERVER_PORT=3000 -p 3000:3000 jarjar:latest
=> Booting Puma
=> Rails 6.1.4.1 application starting in development
=> Run `bin/rails server --help` for more startup options
Puma starting in single mode...
* Puma version: 5.5.2 (ruby 3.0.2-p107) ("Zawgyi")
* Min threads: 5
* Max threads: 5
* Environment: development
* PID: 1
* Listening on http://0.0.0.0:3000
Use Ctrl-C to stop
Voilà! Our container is off to the races!
Because we map the container’s port 3000 to the host machine’s port 3000, we can curl the Rails application from our host machine. Sure enough, we get an HTTP 200:
$ curl -X GET -I localhost:3000
HTTP/1.1 200 OK
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: strict-origin-when-cross-origin
Content-Type: text/html; charset=utf-8
Vary: Accept
Cache-Control: no-store, must-revalidate, private, max-age=0
Content-Security-Policy: script-src 'unsafe-inline'; style-src 'unsafe-inline'
X-Request-Id: a4befe81-30be-4859-b757-feac45f90b08
X-Runtime: 0.312531
X-MiniProfiler-Original-Cache-Control: max-age=0, private, must-revalidate
X-MiniProfiler-Ids: 8hq1e1quzqh4uacc9xb8
Set-Cookie: __profilin=p%3Dt; path=/; HttpOnly; SameSite=Lax
Content-Length: 400499
Now, instead of running our application with the complicated docker run command, let’s throw in a Docker Compose file and make our lives easier:
version: '3.8'
services:
http:
build: .
image: jarjar
environment:
SERVER_PORT: ${SERVER_PORT}
ports:
- ${SERVER_PORT}:${SERVER_PORT} # Set to 4000 in .env file
volumes:
- .:/app
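The ${SERVER_PORT} references are resolved by Compose from the shell environment or from an .env file sitting next to docker-compose.yml. For this example, the .env file only needs a single line:
SERVER_PORT=4000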
The file sets up the required environment variables, ports, and volumes for the Rails application. To build our image now, we can use docker-compose build:
$ docker-compose build
Building http
[+] Building 0.7s (11/11) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 622B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/ruby:3.0.2-alpine3.14 0.6s
=> [1/6] FROM docker.io/library/ruby:3.0.2-alpine3.14@sha256:5bb06d7e3853903b9e9480b647b2d99ca289f9511 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 32.32kB 0.0s
=> CACHED [2/6] WORKDIR /app 0.0s
=> CACHED [3/6] RUN apk add --no-cache shared-mime-info tzdata sqlite-libs 0.0s
=> CACHED [4/6] COPY Gemfile* /app 0.0s
=> CACHED [5/6] RUN apk add --no-cache --virtual build-dependencies build-base sqlite-dev && bun 0.0s
=> CACHED [6/6] COPY . . 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:c24befc2a0ce11f5550672e4f7aa64a5f6d8169609d173c35d039bf459bce757 0.0s
=> => naming to docker.io/library/jarjar 0.0s
Now that we have the image built, we can run the service:
$ docker-compose up
Starting jarjar_http_1 ... done
Attaching to jarjar_http_1
http_1 | => Booting Puma
http_1 | => Rails 6.1.4.1 application starting in development
http_1 | => Run `bin/rails server --help` for more startup options
http_1 | Puma starting in single mode...
http_1 | * Puma version: 5.5.2 (ruby 3.0.2-p107) ("Zawgyi")
http_1 | * Min threads: 5
http_1 | * Max threads: 5
http_1 | * Environment: development
http_1 | * PID: 1
http_1 | * Listening on http://0.0.0.0:4000
http_1 | Use Ctrl-C to stop
We have our service running using Docker Compose! Let’s curl it again:
$ curl -X GET -I localhost:4000
HTTP/1.1 200 OK
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
X-Content-Type-Options: nosniff
X-Download-Options: noopen
X-Permitted-Cross-Domain-Policies: none
Referrer-Policy: strict-origin-when-cross-origin
Content-Type: text/html; charset=utf-8
Vary: Accept
Cache-Control: no-store, must-revalidate, private, max-age=0
Content-Security-Policy: script-src 'unsafe-inline'; style-src 'unsafe-inline'
X-Request-Id: d7a72b16-9630-4e4c-9e9e-8781c8207b52
X-Runtime: 0.020082
X-MiniProfiler-Original-Cache-Control: max-age=0, private, must-revalidate
X-MiniProfiler-Ids: dvwfody0av0h45hub5bw,pp70s0op22fdolh4u8zt,hdpxqj8kohhs6r3jf7xp,c080m80j1zfarr0b4qxw
Set-Cookie: __profilin=p%3Dt; path=/; HttpOnly; SameSite=Lax
Content-Length: 400562
We get an HTTP 200. But if we bring the container down and then up again, we’ll run into a familiar problem:
$ docker-compose up
Starting jarjar_http_1 ... done
[...]
http_1 | => Booting Puma
http_1 | => Rails 6.1.4.1 application starting in development
http_1 | => Run `bin/rails server --help` for more startup options
http_1 | Exiting
http_1 | A server is already running. Check /app/tmp/pids/server.pid.
jarjar_http_1 exited with code 1
Apparently we have a stray PID file, and our server can’t boot.
Stray PID files #
The default web server for Rails, Puma, creates a PID file. A PID, short for process ID, is the number the operating system assigns to a running process, and a PID file is a file that contains nothing but that number. So, if we boot our server container, attach to it, and open the PID file, we will see only the process ID in it:
/app # cat tmp/pids/server.pid
1
Before writing this section, I had no idea why PID files were useful. However, after consulting an excellent StackOverflow answer (that I recommend reading thoroughly), it appears that a PID file is useful:
- as a signal to other processes and users of the system that the particular program is running, or at least started successfully
- as a hook for scripts that check whether a particular process is running and issue a plain kill command if one wants to end it (see the sketch after this list)
- as a cheap way for a program to see whether a previous instance of itself did not exit successfully
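For the second point, here is a minimal shell sketch of what such a script could do with our server.pid (paths as inside our container; the exact commands are only an illustration):
$ PID=$(cat tmp/pids/server.pid)
$ kill -0 "$PID" && echo "server is running"  # signal 0 only checks that the process exists
$ kill "$PID"                                 # politely ask the process to shut down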
The last point in that list hits close to home: Puma’s PID file is not cleaned up when the container shuts down, so on the next boot, Rails thinks there’s already another server running.
One thing that threw me off about the stray PID file is that containers and their file systems are supposed to be ephemeral – once you stop them, they’re gone for good. What I found out is that Docker Compose reuses containers whose configuration has not changed:
Compose caches the configuration used to create a container. When you restart a service that has not changed, Compose re-uses the existing containers. Re-using containers means that you can make changes to your environment very quickly.
This is a neat feature of Compose, but it means that Compose does not destroy the container holding the stray PID file; it reuses it instead. And that is why the leftover PID file is still around and Puma cannot start when using Docker Compose.
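You can see that reuse for yourself: after interrupting docker-compose up, the container is still around, just stopped. For instance (container name as in the logs above):
$ docker ps -a --filter name=jarjar_http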
There are several ways to force Docker Compose to recreate the containers, but from a workflow perspective, we want docker-compose up to just do the trick. There’s always docker-compose up --force-recreate, but from a developer experience perspective, that seems… hacky.
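For completeness, these are the kinds of commands that do force fresh containers; they work, they are just not something you want to have to remember every time:
$ docker-compose up --force-recreate
$ docker-compose down && docker-compose up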
If container reuse is a feature of Docker Compose, then the solution to our stray PID files must lie with Puma. In other words, how can we make Puma clean up its mess before exiting?
Cleaning up the mess #
For Puma to clean up its PID file, it has to receive the proper interrupt signal (SIGINT) when we press Ctrl-C so it can act upon it. But when we press Ctrl-C, even though we think we are interrupting Puma, we are not.
Inspect this output carefully:
$ docker-compose up
Recreating jarjar_http_1 ... done
Attaching to jarjar_http_1
http_1 | => Booting Puma
http_1 | => Rails 6.1.4.1 application starting in development
http_1 | => Run `bin/rails server --help` for more startup options
http_1 | Puma starting in single mode...
http_1 | * Puma version: 5.5.2 (ruby 3.0.2-p107) ("Zawgyi")
http_1 | * Min threads: 5
http_1 | * Max threads: 5
http_1 | * Environment: development
http_1 | * PID: 1
http_1 | * Listening on http://0.0.0.0:4000
http_1 | Use Ctrl-C to stop
From STDOUT we see Puma saying Use Ctrl-C to stop, but when we press Ctrl-C we interrupt Docker Compose (its up command), not Puma. This is because, upon pressing Ctrl-C, we rely on Docker Compose to send that same SIGINT to the Puma process, but that is not how Compose works.
The fact that Docker Compose does not “bubble up” the SIGINT to Puma means that the Puma process is abruptly killed by the container shutting down, and Puma can’t clean up its PID file.
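You can verify that Puma itself handles SIGINT just fine by bypassing Compose and sending the signal straight to the container (container name taken from the logs above); per the reasoning above, Puma should then shut down gracefully and remove its PID file:
$ docker kill --signal=SIGINT jarjar_http_1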
So, how can we make sure we remove the PID file?
Attempt no. 1: Remove PID before Puma boot #
The most straightforward way is to remove the PID file before we kick off the Puma server. It can be done by modifying the CMD of the Dockerfile:
CMD ["rm -f tmp/pids/server.pid && bundle exec rails s -b 0.0.0.0 -p $SERVER_PORT"]
The Internet is riddled with variations of this approach. From Reddit threads to many different Stack Overflow answers, “just remove the PID file” seems to be the default workaround for this problem.
Before I comment on this approach, let’s look at a similar but better approach.
Attempt no. 2: Use an entrypoint script #
This solution is suggested by Docker Compose’s documentation on Rails & Postgres:
…provide an entrypoint script to fix a Rails-specific issue that prevents the server from restarting when a certain server.pid file pre-exists. This script will be executed every time the container gets started.
The entrypoint.sh script, as taken from the documentation:
#!/bin/bash
# Stops the execution of a script in case of error
set -e
# Remove a potentially pre-existing server.pid for Rails.
rm -f /myapp/tmp/pids/server.pid
# Then exec the container's main process (what's set as CMD in the Dockerfile).
exec "$@"
Either way, the script works in unison with the Dockerfile, which has to make the script executable and run it as an ENTRYPOINT:
[...]
# Add a script to be executed every time the container starts.
COPY entrypoint.sh /usr/bin/
RUN chmod +x /usr/bin/entrypoint.sh
ENTRYPOINT ["entrypoint.sh"]
# Use the exec form and pass the command through a shell explicitly,
# so that exec "$@" works and $SERVER_PORT is still expanded at runtime
CMD ["/bin/sh", "-c", "bundle exec rails s -b 0.0.0.0 -p $SERVER_PORT"]
While the above solutions work, and the second is even officially recommended in the Compose documentation, I do not like them for three reasons:
- Any code we write, such as the script, will need to be checked into source control and maintained
- The solution pollutes the Dockerfile due to a shortcoming of docker-compose up
- The solution adds accidental complexity to the Dockerfile for any non-local environment (e.g. cloud). In other words, if we deploy our application to a cloud provider, we wouldn’t use Docker Compose, but we would still end up carrying the entrypoint script to the cloud.
For the above reasons, let’s dig a bit deeper and see if there’s a better, more self-contained solution.
Overriding Puma’s PID file #
When deploying your application to the cloud, it is critical to keep the Dockerfile minimal, containing only the absolute essentials. If you have seen bloated Dockerfiles, or even worse, a proliferation of Dockerfiles in a project, you know what I am talking about.
To keep the Dockerfile minimal, we must make tradeoffs: any potential local setup complexity will end up in the docker-compose.yml file.
Puma allows us to set the PID file path from the command line, via the -P flag of rails s. Unfortunately, this would add complexity to our Dockerfile:
CMD ["bundle exec rails s -b 0.0.0.0 -p $SERVER_PORT -P /some/path/server.pid"]
This is a no-go, as we want to stick to the defaults in any other (cloud) environment. The good news is that, since 2019, Rails has supported setting the PID file path using an environment variable: we can set the PIDFILE environment variable, and Rails will pick it up transparently.
OK, now that we have a way to transparently set the PID file path, what should its value be? Unix operating systems have one place where all bits go to die: /dev/null.
Attempt no. 3: Send the PID file to /dev/null #
This is easily achieved by adding a single line in the docker-compose.yml:
version: '3.8'
services:
http:
build: .
image: jarjar
environment:
SERVER_PORT: ${SERVER_PORT}
+ PIDFILE: /dev/null
ports:
- ${SERVER_PORT}:${SERVER_PORT}
volumes:
- .:/app
If we test this out, it works as expected: docker-compose up can be run and interrupted (using Ctrl-C) without problems.
But, by setting the PID file path to /dev/null, we lose all valuable aspects of PID files: we can’t check whether a process is running or whether a previous instance has failed.
Can we do better? Could we place the PID file in an ephemeral location that is present for the duration of the container’s life but evaporates after Compose takes the container down?
Solution: Use tmpfs #
Docker ships with tmpfs mounts as a storage option that works only on containers running Linux. When we create a container with a tmpfs mount, the container can create files outside the container’s writable layer.
As opposed to volumes and bind mounts, a tmpfs mount is temporary and persists in the host memory. So when the container stops, the tmpfs mount is removed, and all files in it will be gone. tmpfs mounts seem like the perfect ephemeral storage for our problem.
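Outside of Compose, the same idea can be tried with plain docker run, which accepts a --tmpfs flag (using the image we built earlier):
$ docker run --tmpfs /tmp/pids \
    --env PIDFILE=/tmp/pids/server.pid \
    --env SERVER_PORT=3000 \
    -p 3000:3000 jarjar:latest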
To add a tmpfs path in our Compose file and to store the PID file in that path, we need to make the following changes:
version: '3.8'
services:
http:
build: .
image: jarjar
environment:
SERVER_PORT: ${SERVER_PORT}
- PIDFILE: /dev/null
+ PIDFILE: /tmp/pids/server.pid
ports:
- ${SERVER_PORT}:${SERVER_PORT}
volumes:
- .:/app
+ tmpfs:
+ - /tmp/pids/
Docker Compose will create the path and mount it in the host’s memory when we run docker-compose up. This allows us to put the PID file in that ephemeral path, /tmp/pids in our example, and use it for the lifetime of the container. When the container is torn down, the temporary mount vanishes into the void, taking all PID files in it along.
Let’s give this a shot and see if we can run docker-compose up and interrupt it ad infinitum.
First run:
$ docker-compose up
Creating jarjar_http_1 ... done
[...]
http_1 | => Booting Puma
http_1 | => Rails 6.1.4.1 application starting in development
http_1 | => Run `bin/rails server --help` for more startup options
http_1 | Puma starting in single mode...
http_1 | * Puma version: 5.5.2 (ruby 3.0.2-p107) ("Zawgyi")
http_1 | * Min threads: 5
http_1 | * Max threads: 5
http_1 | * Environment: development
http_1 | * PID: 1
http_1 | * Listening on http://0.0.0.0:4000
http_1 | Use Ctrl-C to stop
^CGracefully stopping... (press Ctrl+C again to force)
Killing jarjar_http_1 ... done
Looking good! Next run:
$ docker-compose up
Starting jarjar_http_1 ... done
[...]
http_1 | => Booting Puma
http_1 | => Rails 6.1.4.1 application starting in development
http_1 | => Run `bin/rails server --help` for more startup options
http_1 | Puma starting in single mode...
http_1 | * Puma version: 5.5.2 (ruby 3.0.2-p107) ("Zawgyi")
http_1 | * Min threads: 5
http_1 | * Max threads: 5
http_1 | * Environment: development
http_1 | * PID: 1
http_1 | * Listening on http://0.0.0.0:4000
http_1 | Use Ctrl-C to stop
Works as advertised!
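If you want to double-check that the PID file really lives on the tmpfs mount, you can exec into the running service and look, for example:
$ docker-compose exec http sh -c 'mount | grep /tmp/pids && cat /tmp/pids/server.pid'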
Using a tmpfs mount, we leave our Dockerfile unscathed, without additional entrypoint scripts or dependencies. Instead, we use a native Docker solution that has been around for a while, with just two extra lines in our Docker Compose file.
And maybe the best part is that our solution is stack agnostic. It will work with any process that creates a PID file, whether a web server (such as Puma) or another program. All the tmpfs approach requires is that the program lets us set a custom path for the PID file – that’s all!
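As an illustration of that last point, here is what the same trick could look like for the celerybeat.pid error from the beginning of the post. The service below, added under services: in a Compose file, is hypothetical (the image and app name are made up), but the tmpfs part is identical:
  beat:
    image: my-python-app
    # celery beat lets us point its PID file at the tmpfs mount via --pidfile
    command: celery -A myapp beat --pidfile=/tmp/pids/celerybeat.pid
    tmpfs:
      - /tmp/pids/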