I remember it was summer and the A/C was blasting in the office. My mentor and I were looking at one of the screens on my desk. It was the biggest screen I had ever worked on - a 24-inch 720p Dell. It meant a lot to me - I felt like a real programmer, whatever that means, looking at those vast areas of black screen and many split terminals, with tons of white characters on the black background.

“Let’s try stubbing this method call here - the problem might be in its invocation” I said. He agreed it was worth a shot. We were deep in this debugging session. We felt that we were close to the fix, but we still had a couple more steps to go. Upon changing the stub, a different failure greeted us. “Oh, a different error?” I said, with a dose of frustration in my voice. “Yes, but that’s progress!” he said, excited at the prospect of us approaching our debugging destination.

Later that day, after we fixed the issue, we were in the kitchen, making another cup of coffee. We were sharing our frustration with the architecture, reflecting on how we could simplify it, improve the code and make it more testable. We went on, discussing how we would design this feature in a perfect world, if there were such a thing or place. It was always an interesting thought exercise.

Then, I asked him about his remark that a different error means progress. For me, it was a strange way of looking at errors. The idea seemed somehow provocative at the time. How could errors mean progress? How are errors even good? What was he talking about? These were some of the questions reverberating in my head. After a short discussion I realised that while we all appreciate code that works, we spend much of our working time debugging existing code. When fixing existing code, what our test failures communicate is paramount to the debugging experience we have.

That’s why in this article we will look at what it means to write a meaningful test failure message. We will look at its structure and how we can use some simple techniques to improve our test failure messages. Whether it’s a colleague of ours, or our future selves, a great error message can make people’s lives easier and they’ll be grateful for it.

Anatomy of a test case

Before we continue exploring failure messages and their traits, we should create a function that we can test. Talking about test failures without some actual code to test would be a waste of our time. Let’s start simple - a very small function Max that receives a slice of ints as an argument and returns the biggest int:

// max.go
package main

func Max(numbers []int) int {
	var max int

	for _, number := range numbers {
		if number > max {
			max = number
		}
	}

	return max
}

The function traverses the slice of ints, compares each element against the max variable, and assigns the element to max when it is larger. At the end, it returns max as the result. We will use table-driven tests to test this function. After that, we can discuss its anatomy.

func TestMax(t *testing.T) {
	cases := []struct {
		input    []int
		expected int
	}{
		{
			input:    []int{1, 2, 3, 4, 5},
			expected: 100,
		},
		{
			input:    []int{-1, -2, -3, -4, -5},
			expected: 100,
		},
		{
			input:    []int{0},
			expected: 1,
		},
	}

	for _, c := range cases {
		actual := Max(c.input)
		expected := c.expected

		if actual != expected {
			t.Fatalf("Expected %d, got %d", expected, actual)
		}
	}
}

Running the tests will produce the following output:

› go test
--- FAIL: TestMax (0.00s)
    max_test.go Expected 100, got 5
FAIL
exit status 1
FAIL	_/Users/Ilija/Documents/testing	0.004s

Great, now that we have something working, we can dissect our test function. The function has three main parts:

  1. The setup: in our example we use table-driven tests, where each of the structs has an input and an expected output. When the input is passed to the function, the expected result is the expected attribute of the case struct. When testing more complicated functions, this is where we would do any other setup that would be mandatory for the test to run. Some examples are loading more complicated fixtures (from fixture files) or opening a connection to a database and querying some data.

  2. Building the assertion: this is where the testing happens. In table-driven tests, this usually means that we have some sort of a for loop that traverses the structs and passes the input to the function under test (in this case Max). Then, we compare the output and the expected value, which will inform us if the function has passed the test.

  3. Reporting: when the assertion is false, meaning the result and the expected value are not the same, the test function has to report that something went wrong. This is where we will focus for the rest of this article.

Now that we understand the anatomy of a common test case, let’s discuss the traits of good and bad test failure messages.


Traits of a failure message

To define the traits of a test failure message, let’s examine the output of the failed test from the previous section:

› go test -v
=== RUN   TestMax
--- FAIL: TestMax (0.00s)
    max_test.go Expected 100, got 5
FAIL
exit status 1
FAIL	_/Users/Ilija/Documents/testing	0.004s

It has three segments:

  1. Test status (pass or fail)
  2. Function name that was run and failed
  3. Failure message(s)

Although the failure message is where we get most of the information about the failure of the test, we rely on the output as a whole for useful context about the failure. The failure message itself contains two parts: what was expected and what was received.

Is this good enough? I would say: yes and no.

Yes, because it succinctly communicates what our test case expected, and what it received. No, because it does not provide any other context, although there is plenty of it in the test function. This hidden knowledge can hinder us when fixing the test.

So, back to the original question: what are the traits of a bad and a good test failure message?

Bad failure messages in failed test(s) hide data and behaviour from the programmer, obstructing them in fixing the failing test(s). Good failure messages, on the other hand, contain just enough information so the programmer can make the test pass.

Here’s a list of data points that we could expose in our test failure messages:

  • the line number of the failed test in the source file
  • the source file of the test itself
  • the expression that failed
  • the left and right values in the equality comparison
  • the surrounding state - values of the variables participating in the expression that failed

Keep in mind that these are just guidelines and there is no hard and fast rule. There have been times where my tests have included enough information, and still debugging and fixing them was hard. There are many times when a small refactoring exercise can do wonders for your functions - make sure you run the tests often and add some more as you go.

In theory such rich failure messages should be useful. But, how do we actually create them in practice? At the center of all this is a very simple and common technique - inspecting values of types.

Inspecting Primitive Types

Go has quite a list of primitive types, and one trait they all share is always having a value. Even variables that are only declared, without a value being explicitly assigned, have a value - the zero value of their type. Inspecting primitive types in a test failure means we only have to look at their value.

Unsurprisingly, this is simple in Go. To inspect a value of a primitive type we only need to print it using a combination of fmt’s Sprintf and Println functions. Sprintf requires a format verb, which will be %v (which stands for value).

Here’s a simple program that inspects a few primitive types, printing their values:

package main

import "fmt"

func main() {
	fmt.Println(fmt.Sprintf("Boolean: %v", true))
	fmt.Println(fmt.Sprintf("Integer: %v", 42))
	fmt.Println(fmt.Sprintf("String: %v", "Hello world"))
	fmt.Println(fmt.Sprintf("Float: %v", 3.14))
}

The main function prints four different values, each of them of a primitive type. We’ve added the type names in the printing statements to aid visual inspection. If we run it, the program produces the following output:

› go run whatever.go
Boolean: true
Integer: 42
String: Hello world
Float: 3.14

This is a very simple way to inspect any value of a primitive type - pass it to Sprintf (with the %v formatting verb) and send the output to Println. If we look at our tests from earlier, you will notice that we already use this approach:

// Snipped...
for _, c := range cases {
	actual := Max(c.input)

	if actual != c.expected {
		out := fmt.Sprintf("Running: Max(%v)\n", c.input) +
			fmt.Sprintf("Argument: %v \n", c.input) +
			fmt.Sprintf("Expected result: %d\n", c.expected) +
			fmt.Sprintf("Actual result: %d\n", actual)
		t.Fatal(out)
	}
}

We print the input and the argument using the %v verb and we print the expected and actual results using the %d – denoting a base 10 integer representation of the value. Using this approach we are able to send any output to the person battling the failed specs. Here are some other formatting verbs that we can use:

Boolean:
- %t	the word true or false

Integer:
- %b	base 2
- %c	the character represented by the corresponding Unicode code point
- %d	base 10
- %o	base 8

String:
- %s	the uninterpreted bytes of the string or slice
- %q	a double-quoted string safely escaped with Go syntax

There are more verbs available in the fmt package, head over to the documentation to check them out.

Inspecting Custom Types

When it comes to inspecting the state of custom types, things can easily get hairy. All custom types start simple, with a few attributes. But as codebases grow, the size and complexity of custom types can (read: will) grow. A test that once used a very simple type to check the behaviour of a certain function might now produce an output containing a huge struct.

So, in such cases how can we show only the relevant values when the tests fail?

Just like with the primitive types, printing the internals of a custom type is a simple exercise in Go - using the fmt package’s Sprintf with the %v and %+v verbs.

To look at Sprintf’s workings in combination with structs, we will create a type Person. It has two attributes: age (of type int64) and name (of type string):

type Person struct {
	age  int64
	name string
}

Person will implement a method older, which returns a boolean indicating whether one Person is older than another. It does that by comparing the ages of the two Person structs:

func (p *Person) older(other *Person) bool {
	return p.age > other.age
}

Since testing is our focus here, let’s also add a test function for the older method:

func TestOlder(t *testing.T) {
	cases := []struct {
		person1  *Person
		person2  *Person
		expected bool
	}{
		{
			person1: &Person{
				name: "Jane",
				age:  22,
			},
			person2: &Person{
				name: "John",
				age:  23,
			},
			expected: false,
		},
		{
			person1: &Person{
				name: "Michelle",
				age:  55,
			},
			person2: &Person{
				name: "Michael",
				age:  40,
			},
			expected: true,
		},
		{
			person1: &Person{
				name: "Ellen",
				age:  80,
			},
			person2: &Person{
				name: "Elliot",
				age:  80,
			},
			expected: true,
		},
	}

	for _, c := range cases {
		actual := c.person1.older(c.person2)

		if actual != c.expected {
			out := fmt.Sprintf("Running: older(%v)\n", c.person2) +
				fmt.Sprintf("Argument: %v \n", c.person2) +
				fmt.Sprintf("Expected result: %t\n", c.expected) +
				fmt.Sprintf("Actual result: %t\n", actual)
			t.Fatal(out)
		}
	}
}

As you can see, we use the same table-driven tests approach and similar formatting for the failure output. If we run the test, we should expect the last test case to fail, because the two persons we compare have the same age, so neither is older than the other.

› go test
--- FAIL: TestOlder (0.00s)
    person_test.go Running: older(&{80 Elliot})
        Argument: &{80 Elliot}
        Expected result: true
        Actual result: false
FAIL
exit status 1
FAIL	_/Users/Ilija/Documents/testing_person	0.004s

Here we can observe a similar output to the one we had in the previous examples. The only difference is in the way the struct is formatted: while we see the values of its attributes, we cannot see the names of the attributes. To fix this, we can apply the %+v formatting verb, which will produce the following output:

› go test
--- FAIL: TestOlder (0.00s)
    person_test.go Running: older(&{age:80 name:Elliot})
        Argument: &{age:80 name:Elliot}
        Expected result: true
        Actual result: false
FAIL
exit status 1
FAIL	_/Users/Ilija/Documents/testing_person	0.004s

When printing structs, the %+v formatting verb adds field names to the values, so it is easier for us to understand the state of the struct. Obviously, printing all attributes this way can be problematic when we are facing big structs.

In such cases, there’s a neat trick that we can use: defining a String function for the struct. By having a String method, our type will implicitly implement the Stringer interface. A Stringer is a type that can describe itself as a string. The fmt package (and many others) looks for this interface when printing values.

If we implement a String function for our type, our custom type will be able to describe itself. The cool part is that we could have the String function in the test file itself (person_test.go), or in the file with the type definition (person.go). Wherever Go finds the function, it will use it when printing the struct.

// person_test.go
func (p *Person) String() string {
	out := fmt.Sprintf("\nAge: %d\n", p.age) +
		fmt.Sprintf("Name: %s\n", p.name)
	return out
}

By having the String function, we can change the approach we use in the failure reporting in the test itself:

if actual != c.expected {
	out := fmt.Sprint("Argument: ", c.person2) +
		fmt.Sprintf("Expected result: %t\n", c.expected) +
		fmt.Sprintf("Actual result: %t\n", actual)
	t.Fatal(out)
}

Note that in the line where we build the output we use just fmt.Sprint with c.person2 as the argument, which is the struct itself. We do not specify which attributes will be printed; the String function takes care of everything.

The output will be:

› go test
--- FAIL: TestOlder (0.00s)
    person_test.go Argument:
        Age: 80
        Name: Elliot
        Expected result: true
        Actual result: false
FAIL
exit status 1
FAIL	_/Users/Ilija/Documents/testing_person	0.004s

By implementing the Stringer interface, we let Go’s fmt package take care of the printing.

But could we take this idea further? What if the structs that we use to build our table-driven tests could actually describe themselves?


Self-describing test cases

From the test we wrote earlier, let’s extract a type (TestCase) and use it in our TestOlder test function:

type TestCase struct {
	person1  *Person
	person2  *Person
	expected bool
}

func TestOlder(t *testing.T) {
	cases := []TestCase{
		{
			person1: &Person{
				name: "Ellen",
				age:  80,
			},
			person2: &Person{
				name: "Elliot",
				age:  80,
			},
			expected: true,
		},
	}

	for _, c := range cases {
		actual := c.person1.older(c.person2)

		if actual != c.expected {
			out := fmt.Sprint("Argument: ", c.person2) +
				fmt.Sprintf("Expected result: %t\n", c.expected) +
				fmt.Sprintf("Actual result: %t\n", actual)
			t.Fatal(out)
		}
	}
}

(I also removed two of the test cases for brevity.)

This will not change the output of the test - we simply extracted a type that was defined inline in the previous versions of the test. But if we think about the failure output that our test function produces, couldn’t our new type TestCase implement the Stringer interface too? What if we define a String function on TestCase that prints every test failure in the same structured format, standardized across all the test functions in this test file?

Here’s an example of a String function that TestCase can implement:

func (tc TestCase) String() string {
	out := fmt.Sprint("Person 1: ", tc.person1, "\n") +
		fmt.Sprint("Person 2: ", tc.person2, "\n") +
		fmt.Sprintf("Expected result: %t\n", tc.expected)

	return out
}

To put this function in action, we need to make a small modification to the assertion in the test function:

for _, c := range cases {
	actual := c.person1.older(c.person2)

	if actual != c.expected {
		out := fmt.Sprint(c) +
			fmt.Sprintf("Actual result: %t", actual)
		t.Fatal(out)
	}
}

And the output, on failure, will look like:

› go test
--- FAIL: TestOlder (0.00s)
    person_test.go Person 1:
        Age: 80
        Name: Ellen

        Person 2:
        Age: 80
        Name: Elliot

        Expected result: true
        Actual result: false
FAIL
exit status 1
FAIL	_/Users/Ilija/Documents/testing_person	0.004s

Now, we can see very clearly what the expected result was, what the actual result was, and the state of the variables in play for the particular test case. We could improve the formatting of the output even more, but for our purposes this will do fine.

I hope that this makes it clear that tests are just code. If we use Go’s power to construct test cases as structs and use them in tests, we can also use other Go goodies, like the Stringer interface, to make our test cases report their failures better.

Simplicity and verbosity

Before we wrap up this walkthrough, there’s one last thing that I would like us to discuss. I would go out on a limb and say that you are probably already asking yourself one of these two questions:

Is there a simpler way to do this? Why can’t we let a testing library or framework take care of the failures messages for us?

The answer would be: sure, you can totally do that.

But before you opt in to a testing library, I want you to understand a few things so you can make better choices for your projects in the future.

  1. Go is simple by design. If you look at The Go Programming Language Specification you can see how simple and short it is. This is a feature of the language, not a shortcoming.
  2. Go ships with a lightweight testing framework. It has out of the box support for writing and running test files, which many other languages do not have.
  3. Tests in Go are just code, so they should be as simple as any Go code. Naming a test file, writing the tests and running them should all be simple actions. Analogous to that, a test assertion should be a very simple comparison between two values.

Having these three principles in mind, I hope you can appreciate why writing failure messages in Go tests can be a verbose but simple thing to do. As Rob Pike, one of Go’s authors, says in his “Go Proverbs” talk:

Clear is better than clever… There are languages where cleverness is considered a virtue. Go doesn’t work that way. Especially when you’re learning Go out of the box, you should be thinking of writing simple and clear code. That has a lot to do with maintainability, stability and the ability of other people to read your code.