No test suite is perfect. Some test suites are missing good helper functions; others are under-configured or over-customized. Some have obsolete packages included and are left unmaintained. Folks who have experience with more mature projects will likely agree that all of the above can be found in the wild.
Often, when we test our Go programs, we need to create files. Such files can be just fixture files, or whole file trees, to set up the correct environment for the tests to run. Once the tests finish running, we have to clean them up, so they don’t linger around.
We should rely on the test suite’s set of helpers to provide us with a way to manage test files if they exist in the first place. Unfortunately, not all test suites have such clean-up helpers set up. Sometimes, you might find a few different implementations, instead of one obvious way to do it.
Coming up in Go v1.15, the testing package will improve the support for creating temporary test files and directories.
Let’s see how we can put them to use.
Converting PDFs to TXT files
While thinking about a small program that will aid our understanding of leaking test files, I was asking myself, “What is a program that generates files?”. Because I work for a company that does a lot with documents, I thought, “let’s do something straightforward with PDFs”.
Imagine we have a program that extracts text out of PDFs. Why? Well, for one, if we want to know how long it will take us to read the PDF, we can take the number of words in a PDF and divide it by the average reading speed (which, according to Google, is 250 words per minute).
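To make the arithmetic concrete, here is a minimal sketch of the reading-time estimate. The names countWords and readingMinutes are made up for this illustration, and the word count is a simple whitespace split, which is only an approximation for real PDF text.

```go
package main

import (
	"fmt"
	"strings"
)

// wordsPerMinute is the average reading speed quoted in the article.
const wordsPerMinute = 250

// countWords counts whitespace-separated words in text.
// (Hypothetical helper, not part of the article's program.)
func countWords(text string) int {
	return len(strings.Fields(text))
}

// readingMinutes estimates how many minutes it takes to read text.
func readingMinutes(text string) float64 {
	return float64(countWords(text)) / wordsPerMinute
}

func main() {
	text := strings.Repeat("word ", 500) // 500 words of stand-in text
	fmt.Printf("%.1f minutes\n", readingMinutes(text)) // prints "2.0 minutes"
}
```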
But to do that, we first have to take a PDF and create a TXT file with all of the sentences inside. To do that, we can use one of the many PDF parsing libraries for Go.
Here’s the code that takes a PDF, extracts all rows of text from it, and saves them in a TXT file.
The persist function takes some bytes as content and persists them to an io.Writer: it can be a file, a strings.Builder, or a different type that implements the io.Writer interface. The argument types are generic (interfaces) by design, so the function arguments are more liberal. You will see why when we test it.
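A function of that shape could look roughly like the sketch below. This is my own reconstruction from the description, not necessarily the exact code from the program: the error wrapping and the failingWriter type are assumptions added to show why accepting an io.Writer makes the error path easy to exercise.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

// persist writes content to w and reports any write error.
func persist(content []byte, w io.Writer) error {
	if _, err := w.Write(content); err != nil {
		return fmt.Errorf("persisting content: %w", err)
	}
	return nil
}

// failingWriter is a hypothetical io.Writer that always fails.
// It exists only to trigger the error branch of persist.
type failingWriter struct{}

func (failingWriter) Write([]byte) (int, error) {
	return 0, errors.New("disk full")
}

func main() {
	// An in-memory writer: no file on disk is needed.
	var sb strings.Builder
	if err := persist([]byte("hello"), &sb); err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(sb.String())

	// A writer that fails lets us see the error handling.
	fmt.Println(persist([]byte("hello"), failingWriter{}))
}
```

Because the second argument is an interface, the same function works with an *os.File in the real program and with an in-memory writer in tests.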
The slurp function is next. It will take a pdf.Reader type, which is a type from the ledongthuc/pdf library, and return the extracted text as a slice of bytes.
By returning a slice of bytes, instead of a string, we can use a generic interface such as io.Writer (like in persist). The interface is applicable because the Write function it declares takes a slice of bytes as an argument, which makes the value returned by slurp compatible with the arguments of persist.
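The byte-slice hand-off can be sketched without a PDF at all. In the sketch below, slurpText is a hypothetical stand-in that drains a plain io.Reader instead of a pdf.Reader, and roundTrip is a made-up helper that pipes its result into a writer. It is only meant to show why returning []byte fits io.Writer so well.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// slurpText drains r and returns everything it read as a slice of bytes.
// Hypothetical stand-in for slurp: it accepts an io.Reader, not a pdf.Reader.
func slurpText(r io.Reader) ([]byte, error) {
	var buf bytes.Buffer
	if _, err := buf.ReadFrom(r); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// roundTrip slurps s and writes the resulting bytes to an in-memory writer.
func roundTrip(s string) (string, error) {
	contents, err := slurpText(strings.NewReader(s))
	if err != nil {
		return "", err
	}
	var out strings.Builder
	// contents is a []byte, so it goes straight into Write, no conversion needed.
	if _, err := out.Write(contents); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	got, err := roundTrip("text extracted from a PDF")
	fmt.Println(got, err)
}
```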
Next, the function that ties them all together: run.
run’s role is to check the arguments received from main and the place where it should send all of its output. Then it opens the PDF file for reading and passes the file reference to the slurp function.
slurp returns the contents of the file, which is just a slice of bytes. Then run will create a new file for writing, called txtFile. Once it opens the file, it will send the contents and the txtFile to persist as arguments.
As we already saw above, persist will save the contents to the file and return any potential errors. If no errors are returned, run completes without an error.
Lastly, the straightforward main function:
Locally, I have a simple PDF with some text inside that I found online. Its contents, according to the author, are popular interview questions. We can run the above program with any PDF, and as long as it finds some text inside, it will save it to an output file.
Here it is in action:
That’s really it. Our program took the contents from input.pdf and stored them in out.txt. Let’s see how we can test this program.
The persist function does not do much. In fact, it just invokes the Write function of the io.Writer instance. Since we are using a file, which is part of the standard library, we do not need to test it. But, given that there’s some error handling, which is a custom implementation, we can add some tests to strive for full test coverage.
TestPersist, in all of its glory:
Each of the test cases, part of the table-driven tests, contains the name of the test case, the content that it will persist, and the output file that it will create.
In the test itself, we create a subtest for each of the test cases, which will try to write the content to a file, and then it will read all of the content back from the file. If the persisted content and the test case content are the same, the test passes.
If we run the tests, this is the output we will see:
We ran the two test cases where we try to persist a file with no content and some content. Both of them passed, and we can move on!
The slurp function is more involved. It requires two different test files: two dummy PDFs, one with some content and one with no content (empty). Then, by passing the two different files to slurp, we can test if extracting the text from the PDF works as intended.
This is the test:
Each of the test cases from the table-driven tests points to an actual PDF file on disk. For each of the test cases, a subtest will be run, which will open the PDF using the ledongthuc/pdf library. We will then pass the reference to the PDF file to the slurp function, expecting the contents to be returned by slurp.
Once it returns the content, we simply compare sizes: the number of bytes that are expected (tc.size) against the size of the bytes returned. If the sizes match, we assume that the content is correct, and the test will pass.
Here’s the test in action:
The run function is what glues everything together. It validates the arguments, then opens the PDF for reading, slurps all of the contents using slurp, and lastly saves all of the text content to the TXT file using persist.
Here’s the test:
In TestRun, we check if, for each of the test PDFs we provide, the run function creates the corresponding TXT file. In TestRun, we do not care about the actual contents; we can assume that the rest of the unit tests cover that part of the functionality.
Then, for each test case, we use os.Stat, which will return an error if the file does not exist. If the file does exist, we consider the run function as properly functioning and mark the test as passed.
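The os.Stat existence check can be sketched as a small helper. fileExists is a hypothetical name of mine. Unlike a bare "did os.Stat fail?" check, the sketch distinguishes "does not exist" from other failures (such as permission problems) by using errors.Is with fs.ErrNotExist.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// fileExists reports whether path exists.
// Errors other than "does not exist" are returned to the caller.
func fileExists(path string) (bool, error) {
	_, err := os.Stat(path)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil
	}
	return false, err
}

func main() {
	dir, err := os.MkdirTemp("", "stat-example")
	if err != nil {
		fmt.Println("cannot create temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	out := filepath.Join(dir, "out.txt")
	before, _ := fileExists(out)
	if err := os.WriteFile(out, []byte("some text"), 0o644); err != nil {
		fmt.Println("cannot write file:", err)
		return
	}
	after, _ := fileExists(out)
	fmt.Println(before, after) // prints "false true"
}
```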
Here’s the test in action:
Another test we can also run is to test the returned errors. We will create another function called TestRunErrors, which will cover the potential errors returned by run. Here’s the test function:
TestRunErrors is similar to TestRun, but with the focus on the returned errors. It checks that, for each of the bad inputs the run function receives, it returns an error. We could take this a step further by returning specific errors and asserting on them, but this will do just fine for this article.
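For the curious, one way to assert on specific errors is to use sentinel errors together with errors.Is. The sketch below is not taken from the program in this article: checkArgs, errMissingArgs and errBadInput are hypothetical names, and I assume the arguments are the ones that follow the program name.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Hypothetical sentinel errors; the article's program does not define them.
var (
	errMissingArgs = errors.New("expected an input PDF path and an output TXT path")
	errBadInput    = errors.New("cannot read input file")
)

// checkArgs validates the arguments that follow the program name.
func checkArgs(args []string) error {
	if len(args) < 2 {
		return errMissingArgs
	}
	if _, err := os.Stat(args[0]); err != nil {
		// Wrap the sentinel so callers can still match it with errors.Is.
		return fmt.Errorf("%w: %v", errBadInput, err)
	}
	return nil
}

func main() {
	cases := [][]string{
		nil,
		{"input.pdf"},
		{"does-not-exist.pdf", "out.txt"},
	}
	for _, args := range cases {
		err := checkArgs(args)
		switch {
		case errors.Is(err, errMissingArgs):
			fmt.Println(args, "-> missing arguments")
		case errors.Is(err, errBadInput):
			fmt.Println(args, "-> bad input")
		default:
			fmt.Println(args, "-> ok")
		}
	}
}
```

A test could then check errors.Is(err, errMissingArgs) instead of only checking that err is not nil.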
Here’s the TestRunErrors function in action:
$ go test -v -run TestRunErrors
=== RUN   TestRunErrors
=== RUN   TestRunErrors/WithoutArguments
=== RUN   TestRunErrors/WithoutOneArgument
=== RUN   TestRunErrors/WithNonexistentInput
--- PASS: TestRunErrors (0.02s)
    --- PASS: TestRunErrors/WithoutArguments (0.00s)
    --- PASS: TestRunErrors/WithoutOneArgument (0.02s)
    --- PASS: TestRunErrors/WithNonexistentInput (0.00s)
PASS
ok      github.com/fteem/go-playground/testing-in-go-leak-test-files    0.140s
Cleaning up after our test
If you were following along, you should notice that test files are created in your project’s directory. The files stay there because we never clean up the output files that our program creates when the tests run.
Starting from Go v1.15, there will be a nice way to do this: TB.TempDir.
To clean up the test files, we can use TB.TempDir as a parent directory wherever we are passing the output file path. Once the test and all of its subtests complete, whether they pass or fail, Go will automatically get rid of this directory, without us having to do any clean-up.
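As a rough illustration of what this saves us from writing, the sketch below puts the two approaches side by side. All names in it (writeAndReadBack, TestWithTempDir, manualRoundTrip) are hypothetical. In a real project the test function would live in a _test.go file and run under go test; here it shares a file with main only so that the manual variant can run as a standalone program.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"testing"
)

// writeAndReadBack writes content to dir/name and reads it back.
func writeAndReadBack(dir, name, content string) (string, error) {
	path := filepath.Join(dir, name)
	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
		return "", err
	}
	got, err := os.ReadFile(path)
	return string(got), err
}

// TestWithTempDir lets the testing package own the directory: t.TempDir
// creates it and removes it when the test and all its subtests complete.
func TestWithTempDir(t *testing.T) {
	got, err := writeAndReadBack(t.TempDir(), "non-empty.txt", "hello")
	if err != nil {
		t.Fatal(err)
	}
	if got != "hello" {
		t.Errorf("got %q, want %q", got, "hello")
	}
}

// manualRoundTrip does the same by hand: create a temporary directory,
// use it, remove it. leftover reports whether the directory survived.
func manualRoundTrip(content string) (got string, leftover bool, err error) {
	dir, err := os.MkdirTemp("", "manual-example")
	if err != nil {
		return "", false, err
	}
	got, err = writeAndReadBack(dir, "non-empty.txt", content)
	if rmErr := os.RemoveAll(dir); rmErr != nil && err == nil {
		err = rmErr
	}
	_, statErr := os.Stat(dir)
	return got, statErr == nil, err
}

func main() {
	got, leftover, err := manualRoundTrip("hello")
	fmt.Println(got, leftover, err)
}
```

With t.TempDir, the creation, the error handling, and the removal all collapse into a single call.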
First, let’s see how we can change the TestPersist function to clean up the output files it creates, such as non-empty.txt:
The only notable change is to compose the output path from t.TempDir() and the file name. This combo produces a valid temporary path that Go will remove once the tests finish. Given that, at the time of writing this, Go 1.15 is still not out, we can use the gotip tool to run the tests against Go’s latest development version:
If we inspect the project root, we will see that no new files are being created. The output files are cleaned up after the tests have finished running.
The gotip tool compiles and runs the go command from the development tree. Using the gotip command, instead of the normal go command, will run the latest version of the language, as seen in the main Git trunk. You can see its documentation for more details.
Next, we can make the same change to the TestRun function. We use the same trick: we build the path of the output file from the temporary directory. Running the test the same way, we can see T.TempDir once more in action:
If you want to check out the motivation and the discussion around this addition to the new Go version 1.15, you can head over to the original proposal.