Four Steps to Daemonize Your Go Programs
If you have ever worked with Ruby, or have maybe maintained a Rails application, I am sure the name Sidekiq will sound familiar. For those unfamiliar with the project, Sidekiq is a job system for Ruby. It is a wildly popular project, and the author has turned it into a successful business.
None of the above would be relevant if Sidekiq’s author, Mike Perham, had not written a concise and informative post in 2014 titled “Don’t Daemonize your Daemons!”. In it, he covers four guidelines for daemonizing programs correctly:
- Log to STDOUT
- Shut down on SIGTERM/SIGINT
- Reload config on SIGHUP
- Provide the necessary config file for your favorite init system to control your daemon
(You can also read the whole article on his website.)
So I was thinking, why don’t we explore how to apply these guidelines while daemonizing a Go program?
Website Observer #
The program in question is a simple command-line program that can monitor any website by sending periodic HTTP requests to it. If you have ever heard of Datadog’s synthetic tests or Pingdom, think of our program as their little sibling.
The observer program will read its configuration from flags, environment
variables, or a configuration file. If the configuration is not present as a
flag, it will look into the ENV vars for it and then in a configuration file
(if present). If nothing is found, it will use the default value or exit with
an error depending on how crucial the configuration is.
To do this, we will use the namsral/flag
package, which is a drop-in replacement for
Go’s flag package, with the addition of parsing files and environment
variables. Being a drop-in replacement means that using the namsral/flag
package is as simple as using the flag package from the standard library.
First, observer will have a config type, which will encapsulate the
configuration for the website that it will observe:
const defaultTick = 60 * time.Second
type config struct {
	contentType string
	server      string
	statusCode  int
	tick        time.Duration
	url         string
	userAgent   string
}
func (c *config) init(args []string) error {
	flags := flag.NewFlagSet(args[0], flag.ExitOnError)
	flags.String(flag.DefaultConfigFlagname, "", "Path to config file")
	var (
		statusCode  = flags.Int("status", 200, "Response HTTP status code")
		tick        = flags.Duration("tick", defaultTick, "Ticking interval")
		server      = flags.String("server", "", "Server HTTP header value")
		contentType = flags.String("content_type", "", "Content-Type HTTP header value")
		userAgent   = flags.String("user_agent", "", "User-Agent HTTP header value")
		url         = flags.String("url", "", "Request URL")
	)
	if err := flags.Parse(args[1:]); err != nil {
		return err
	}
	c.statusCode = *statusCode
	c.tick = *tick
	c.server = *server
	c.contentType = *contentType
	c.userAgent = *userAgent
	c.url = *url
	return nil
}
The init method takes the command-line arguments as input and builds a
FlagSet, which represents a set of defined flags. Each of the flags is listed
and parsed; then, their values are assigned to the config. Additionally,
registering flag.DefaultConfigFlagname as a flag enables our
observer to load the configuration from a config.conf file. The .conf
file has a key=value format, with one key-value pair per line.
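As an illustration of that file format, the following sketch parses key=value lines using only the standard library. `parseConf` is a hypothetical helper and not the parser namsral/flag uses; skipping blank lines and lines starting with `#` is an assumption of this sketch.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseConf reads "key=value" lines into a map. Only the first "=" splits
// the line, so values such as "text/html; charset=utf-8" stay intact.
func parseConf(src string) map[string]string {
	out := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // assumption: blank lines and comments are ignored
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue // not a key=value pair
		}
		out[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return out
}

func main() {
	conf := parseConf("status=500\ntick=30s\ncontent_type=text/html; charset=utf-8\n")
	fmt.Println(conf["tick"], conf["content_type"])
}
```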
Here’s the main function:
func main() {
	ctx := context.Background()
	ctx, cancel := context.WithCancel(ctx)
	c := &config{}
	defer func() {
		cancel()
	}()
	if err := run(ctx, c); err != nil {
		fmt.Fprintf(os.Stderr, "%s\n", err)
		os.Exit(1)
	}
}
Following Mat Ryer’s advice,
we are going to keep main very thin while keeping the main logic of the
observer in the run function. main here just sets up the context that
will propagate down to run, and it creates the empty observer
config. Then it passes both to run.
Here’s the run function:
func run(ctx context.Context, c *config) error {
	// Parse flags, environment variables, and the config file.
	if err := c.init(os.Args); err != nil {
		return err
	}
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-time.Tick(c.tick):
			resp, err := http.Get(c.url)
			if err != nil {
				return err
			}
			// Only the status code and headers are inspected,
			// so the body can be closed right away.
			resp.Body.Close()
			if resp.StatusCode != c.statusCode {
				log.Printf("Status code mismatch, got: %d\n", resp.StatusCode)
			}
			if s := resp.Header.Get("server"); s != c.server {
				log.Printf("Server header mismatch, got: %s\n", s)
			}
			if ct := resp.Header.Get("content-type"); ct != c.contentType {
				log.Printf("Content-Type header mismatch, got: %s\n", ct)
			}
			if ua := resp.Header.Get("user-agent"); ua != c.userAgent {
				log.Printf("User-Agent header mismatch, got: %s\n", ua)
			}
		}
	}
}
First, the run method initializes the config instance c, using the init
method. Then, it loops infinitely until the context ctx is done. When ctx
is done, it means the observer process is terminated, so it merely returns a
nil and finishes with its execution.
Alternatively, it will execute the other case every tick. By using the
time.Tick channel here we run this code by receiving a signal through the
channel every c.tick period. For example, if c.tick is 30 seconds, we will
receive a signal every 30 seconds, meaning the code will run every 30 seconds.
The code itself is simple: it sends an HTTP GET request to the URL assigned
to c.url. Once the response returns, the run function compares the relevant
response headers and the status code with the ones provided through the
configuration. If any mismatch is detected, it logs the error.
Running the observer is relatively simple. One way is to supply a config file
through the command line:
$ ./observer -config ./config.conf
2020/04/23 19:41:54 Status code mismatch, got: 200
Alternatively, using flags:
$ ./observer -status=500 -tick=10s -url=https://ieftimov.com -server=Cloudflare
2020/04/23 19:43:34 Status code mismatch, got: 200
2020/04/23 19:43:34 Server header mismatch, got: cloudflare
2020/04/23 19:43:34 Content-Type header mismatch, got: text/html; charset=utf-8
We can do the same using environment variables, or a combination of all three: config file, environment variables, and flags.
Now, how can we apply the four simple rules of daemonization?
Logging to STDOUT #
While daemons don’t have much to do with web services, one of the 12 factors of
modern web services is treating logs as event streams. While the 12 factors are
not strictly applicable in this particular case, the guiding principle stays: the
daemon itself should not manage log streams, nor should it concern itself
with writing to or managing log files. Instead, daemons should send their log
stream, unbuffered, to STDOUT.
The service management system will capture each daemon’s stream. The init
config file is what we will use to configure logging, such as where the logs
should be stored or streamed.
So, how can we adapt the observer to log to STDOUT?
First, we will add another argument to the run function, called out, of type
io.Writer. Then, we will invoke the log.SetOutput function, passing
out as its argument.
func run(ctx context.Context, c *config, out io.Writer) error {
	if err := c.init(os.Args); err != nil {
		return err
	}
	// Send everything written through the log package to out.
	log.SetOutput(out)
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-time.Tick(c.tick):
			// Identical to above, removed for brevity
		}
	}
}
By doing this, we will have to pass STDOUT from the main function, but we
keep our run function more testable. Taking the writer as an argument means we
can invoke run with any value that implements the io.Writer interface. We
basically couple the run function to a behavior instead of a type.
Then, we need to update the main function to pass the additional argument to
the run function when invoking it. The io.Writer will simply be
os.Stdout:
func main() {
	ctx := context.Background()
	ctx, cancel := context.WithCancel(ctx)
	c := &config{}
	defer func() {
		cancel()
	}()
	if err := run(ctx, c, os.Stdout); err != nil {
		fmt.Fprintf(os.Stderr, "%s\n", err)
		os.Exit(1)
	}
}
If we run the program again we won’t see a difference:
$ ./observer -status=500 -tick=10s -url=https://ieftimov.com -server=Cloudflare
2020/04/23 19:43:34 Status code mismatch, got: 200
2020/04/23 19:43:34 Server header mismatch, got: cloudflare
2020/04/23 19:43:34 Content-Type header mismatch, got: text/html; charset=utf-8
Why is that? Well, the log package logs to STDERR by default, and a terminal
shows both STDERR and STDOUT, so
there is no visible change in behavior. Still, we make the dependency on
an output stream explicit in the signature of the run function, which clearly states that
run needs to know where to send its logs when running.
Shut down on SIGTERM/SIGINT #
In Go, treating errors as values helps us think about what will happen to our program when an error is returned. While this means our Go programs always carry some repetitive error handling, it also gives us confidence that our program will handle errors gracefully.
Termination signals #
*nix operating systems (OS) employ a system of signals, which is a mechanism of the OS to ask a process to perform a particular action. There are two general types of signals: those that cause termination of a process and those that do not.
(Refer to the full list of the POSIX-defined signals to learn more.)
Using these system signals, a process that has received one can choose one of the following behaviors to take place: perform the default POSIX-defined action, ignore the signal, or catch the signal with a signal handler and perform some sort of a custom action.
Some signals just can’t be caught or ignored, which means that the default
action has to happen. For example, SIGSTOP and SIGKILL are such signals.
Once a process receives either of these two signals, we know that it will be
stopped/killed by the OS.
But other signals are more polite. While we should not ignore them, they give
our process a chance to clean up and go away with grace. Most of the ones on
the list are of the polite kind. In this section, we will look into the SIGTERM and
SIGINT signals and how we can treat them in our Go programs.
Handling SIGTERM & SIGINT #
The os/signal package implements access
to incoming signals with the purpose of signal handling. Through the
Notify function, a Go program can
accept signals through a channel of type os.Signal.
In our observer’s case, we don’t have to do any cleanup once it receives a
SIGTERM/SIGINT. All we have to do is to stop further execution and shut
down gracefully. So, how can we achieve that?
First, we need to create a channel through which we will accept these two signals:
signalChan := make(chan os.Signal, 1)
signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM)
Once the observer process receives a SIGINT or a SIGTERM, it will proxy
it through the signalChan channel. To process the signals, we would need to
create a goroutine that will receive signals through the signalChan. Once it
gets a signal, it will have to cancel() the context, which would stop the
further execution of the run method:
go func() {
	select {
	case <-signalChan: // SIGINT or SIGTERM received
		log.Printf("Got SIGINT/SIGTERM, exiting.")
		cancel()
		os.Exit(1)
	case <-ctx.Done():
		log.Printf("Done.")
		os.Exit(1)
	}
}()
So, once the cancel function is executed, the ctx.Done() case in the for loop
of the run function fires and the loop stops:
func run(ctx context.Context, c *config, out io.Writer) error {
	if err := c.init(os.Args); err != nil {
		return err
	}
	log.SetOutput(out)
	for {
		select {
		case <-ctx.Done():
			return nil
		case <-time.Tick(c.tick):
			// Same as above...
		}
	}
}
The last thing we need to do in the main function is to stop relaying signals
to the signalChan channel when the program exits:
func main() {
	// Same as above...
	defer func() {
		signal.Stop(signalChan)
		cancel()
	}()
	// Same as above...
}
The Stop function will stop relaying incoming signals to signalChan. When
Stop returns, it is guaranteed that signalChan will receive no more
signals.
Let’s run the observer program and see the signal handling in action:
$ ./observer -config=config.conf
2020/04/26 00:14:46 Status code mismatch, got: 200
...
2020/04/26 00:15:46 Status code mismatch, got: 200
Now, having the PID of observer, we can send any signal using the kill
command line tool:
$ kill -SIGINT 37212
By executing the kill command, we will send a SIGINT to the observer
process. This will force observer to wrap up the execution, log a line to
STDOUT and exit:
$ ./observer -config=config.conf
2020/04/26 00:29:22 Status code mismatch, got: 200
2020/04/26 00:29:22 Got SIGINT/SIGTERM, exiting.
exit status 1
We can try the same exercise with SIGTERM as well:
$ kill -SIGTERM 37827
This causes observer to exit with the same behavior:
$ ./observer -config=config.conf
2020/04/26 00:33:42 Status code mismatch, got: 200
2020/04/26 00:33:44 Got SIGINT/SIGTERM, exiting.
exit status 1
Reload config on SIGHUP #
Now that we know how to handle signals, we need to add another signal to the
mix - SIGHUP. To do that, we can just add syscall.SIGHUP to the
signal.Notify call:
signalChan := make(chan os.Signal, 1)
signal.Notify(signalChan, syscall.SIGINT, syscall.SIGTERM, syscall.SIGHUP)
Now that we have SIGHUP covered, in the goroutine that handles the signals,
once a SIGHUP is received, it should re-run the config.init method. By
doing that, we will reload the configuration of the observer, loading any
changes in the configuration:
go func() {
        select {
        case s := <-signalChan:
                switch s {
                case syscall.SIGINT, syscall.SIGTERM:
                        log.Printf("Got SIGINT/SIGTERM, exiting.")
                        cancel()
                        os.Exit(1)
                case syscall.SIGHUP:
                        log.Printf("Got SIGHUP, reloading.")
                        c.init(os.Args)
                }
        case <-ctx.Done():
                log.Printf("Done.")
                os.Exit(1)
        }
}()
The change is relatively small. By using a switch construct, we detect the
received signal. If it’s a SIGHUP, we invoke c.init(os.Args). Otherwise,
we cancel() the context and os.Exit the program.
We can test this using the same trick from before:
$ kill -SIGHUP 38761
This will cause the observer to reload:
$ ./observer -config=config.conf
2020/04/26 01:15:40 Status code mismatch, got: 200
2020/04/26 01:15:44 Got SIGHUP, reloading.
This looks nice. Let’s shut down the observer now by sending a SIGTERM:
$ kill -SIGTERM 38761
In case you’re following along, you will find out that the observer is still
running; this is a bug: the goroutine that was receiving signals exited
because the select construct completed once it received the SIGHUP.
To make the goroutine accept signals without exiting, we need to make the
goroutine run infinitely, using a for loop:
go func() {
        for {
                select {
                case s := <-signalChan:
                        switch s {
                        case syscall.SIGINT, syscall.SIGTERM:
                                log.Printf("Got SIGINT/SIGTERM, exiting.")
                                cancel()
                                os.Exit(1)
                        case syscall.SIGHUP:
                                log.Printf("Got SIGHUP, reloading.")
                                c.init(os.Args)
                        }
                case <-ctx.Done():
                        log.Printf("Done.")
                        os.Exit(1)
                }
        }
}()
By wrapping the whole goroutine in a for loop, we will make sure that it will
not exit, except when a SIGINT/SIGTERM is received, or if the context is
done. By having this endless goroutine, we also can send multiple SIGHUPs to
the observer, and it will process them correctly.
Let’s send two SIGHUPs, to perform two reloads, and SIGTERM to shut down
the observer:
$ kill -SIGHUP 38960
$ kill -SIGHUP 38960
$ kill -SIGTERM 38960
And the observer output:
$ ./observer -config=config.conf
2020/04/26 01:25:02 Status code mismatch, got: 200
2020/04/26 01:25:03 Got SIGHUP, reloading.
2020/04/26 01:25:05 Got SIGHUP, reloading.
2020/04/26 01:25:08 Got SIGINT/SIGTERM, exiting.
And that’s it. The observer now knows how to log to STDOUT, gracefully exit
when it receives SIGINT or SIGTERM and reload the configuration when it
receives a SIGHUP.
Provide the necessary config file for your favorite init system to control your daemon #
Now, given that my computer of choice is a MacBook, I will explain here how you
can create a config file for launchd, macOS’s
service management framework for starting, stopping and managing daemons,
applications, processes, and scripts. In macOS, the system runs daemons, while
the users run programs as agents. So, we will turn our observer into an
agent.
In the past, I have written about creating and managing macOS
agents, so if you would like to
learn more about this topic, you can read that as well. Still, let’s see a
minimal launchd configuration for observer:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>Label</key>
        <string>com.ieftimov.observer</string>
        <key>RunAtLoad</key>
        <true/>
        <key>KeepAlive</key>
        <true/>
        <key>ProgramArguments</key>
        <array>
          <string>/usr/local/bin/observer</string>
          <string>-config</string>
          <string>/etc/observer.conf</string>
        </array>
        <key>StandardOutPath</key>
        <string>/tmp/observer.log</string>
        <key>StandardErrorPath</key>
        <string>/tmp/observer.error.log</string>
    </dict>
</plist>
The configuration is relatively straightforward. Here are all of the pieces in order:
- The Label identifies the job and has to be unique for the launchd instance. Think of it as a unique name for the given agent.
- RunAtLoad means launchd will start the job as soon as it loads it.
- KeepAlive tells launchd to keep the agent running no matter what.
- ProgramArguments provides command-line options to the agent command. In our case, this will create the following command: /usr/local/bin/observer -config /etc/observer.conf.
- StandardOutPath and StandardErrorPath are the paths to where launchd will write the respective output. In our case, we write these to the /tmp directory. An alternative would be to add the log files to /var/log, but that requires granting the agent write access to /var/log.
To make sure we can run the agent, we have to also supply the configuration
file observer.conf in the /etc directory. On my machine, its contents are
as follows:
status=500
tick=30s
url=https://ieftimov.com
server=cloudflare
content_type=text/html; charset=utf-8
user_agent=
After placing the observer.conf file in /etc, for the agent to work, we
have to place its .plist file in ~/Library/LaunchAgents, and load it with:
$ launchctl load ~/Library/LaunchAgents/com.ieftimov.observer.plist
Now, if we tail -f the log files in /tmp, we will see its output
there:
$ tail -f /tmp/observer.*
==> /tmp/observer.error.log <==
==> /tmp/observer.log <==
2020/05/02 11:49:03 Status code mismatch, got: 200
Voila! The agent is running and it is logging its output to STDOUT, while launchd
is redirecting that output to a log file.
If we would like to run the observer on GNU/Linux, we cannot use this
launchd configuration.
In Linux-land, systemd is widespread and popular. If you are interested in a
deeper explanation of systemd units and unit files, Digital Ocean’s blog has
an article on “Understanding Systemd Units and Unit Files” by Justin Ellingwood. I
recommend reading it! And keep in mind, the community’s opinion on systemd
is pretty divided.
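For comparison, a roughly equivalent systemd unit might look like the sketch below. It has not been run against the observer; the unit name (observer.service) and the paths are assumptions that mirror the launchd example above.

```ini
# /etc/systemd/system/observer.service (assumed location and name)
[Unit]
Description=Website observer
Wants=network-online.target
After=network-online.target

[Service]
# Same command line as in the launchd ProgramArguments.
ExecStart=/usr/local/bin/observer -config /etc/observer.conf
# `systemctl reload observer` sends SIGHUP, which triggers the config reload.
ExecReload=/bin/kill -HUP $MAINPID
# Rough counterpart of launchd's KeepAlive.
Restart=always

[Install]
WantedBy=multi-user.target
```

By default systemd stops a service with SIGTERM and collects STDOUT and STDERR in the journal, so the observer's signal handling and logging need no extra settings here.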
There are a bunch of other alternatives, but my knowledge of GNU/Linux init systems is minimal. Therefore, I will stop right here and ask for your help: if you would like to contribute a Linux init system configuration to this article, drop the link to a gist/repo in the comments, and I will include it in this article.
Of course, with proper attribution.
You can see the final implementation of the observer here.