We’ve all been at this point where we have bloated our classes with primitive values all over the place. Usually, we drop in primitive constants that, for whatever reason, we think that are a good fit to the class. Or sometimes, we just dump primitive values instead of small objects, thinking “it’s okay, it’s just an attribute in the class”. But, does it always make sense?
Say we have a project for a finance journalist who wants us to automate his text editor to do some random fixes to his texts, on the fly. The automation is basically adding his name and date to the footer of the text that he’s writing, and to change currencies abbreviations to symbols. The good thing for us is that his editor is written in Ruby, and has the ability to add custom I/O streams to it’s I/O stack.
So, our specification for this I/O stream, is:
- limit text to no more than 10000 words
- append journalist name to end of article
- swap popular currency names abbreviations with symbols
After a bit of work, we come up with this quick hack for a I/O stream:
FinanceIO class is a wrapper around a
StringIO, which overrides a
couple of methods, but relays any other method calls to the wrapped
object. This allows us to do our work easily, while not breaking when any
object of the I/O stack tries to call a built-in method on our object.
Let’s break down the example. The
MIN_WORDS is a limitation
that our client (the journalist) has been given by the newspaper he works for,
or in other words, his articles cannot be loger than 10000 words. The
AUTHOR_NAME constant is the name of our author. We don’t need to enable
customization for this piece of data, because our client wanted to make our
life easier by agreeing that he will not change his name any time soon. The
AUTHOR_EMAIL constant is the author public email address. Next, the
CURRENCIES constant is a two-dimensional array which contains the
abbreviations and the symbols of the most popular currencies that the
journalist uses while writing his finance articles.
Most of the functionality described in the spec is done within the
puts method takes a string as an argument, and saves it to the stream.
But, before it does that, it checks if the text is longer than 10000 words. If
it is, it will raise an exception that the text editor will handle gracefully.
If not, it will carry on to substitute any of the currency abbreviations in the
text with the corresponding symbol.
FinanceIO#on_exit is a method which will append the author name
in the footer of the text.
This will get automatically invoked by the editor, as an callback before the journalist closes his editor.
This will surely work, and it complies with our imaginary text editor. Sure, it might be a bit dirty, but it does the trick. Now, a question to ourselves - what should we do about the data in the constants? There seems to be much behaviour in connection with these constants. Aside of that, it’s just plain data and it doesn’t really add too much noise to our class and the functionality is easy to follow… right?
The Code Smell
Well, sure, the fields don’t do much noise, but this is exactly the way of
thinking that creates these code smells. You could argue that the
class is pretty simple to follow, and you would be right. The problem lies in
the practice, which will eventually bloat the class with so much “static” data,
to a point of total confusion. Sure, our journalist could be happy with our
hack, which is great, but he will surely come back one day with more specs for
customizations. That’s what usually happy customers do - they return for more
Although this code smell is not that dangerous at first, it will cause havoc in the near future. The Primitive Obsession code smell is one of the simplest ones. It can be spotted in classes where primitives are used to “simulate” types. This means that instead of separating a data type for the primitives, we keep some sort of a primitive value, whether it’s a string, or a number or an array of any values. As a next step, we usually add simple names to these primitive values (see our class above), and we keep on thinking that it is very clear what these values do.
Another occurance of this code smell is creating a “field simulation”. This happens when we add an array of primitive values to the class, which contains some data that is used in the class logic, somewhere.
Sure, three to five values are easy to track. What happens when you get to twenty? Thirty maybe? Yeah, I have seen some classes where these primitive values are all over the place, and let me tell you, it’s no fun!
So, how can we avoid and/or detect this code smell? Let’s see!
When it comes to refactoring this code smell, you can take multiple paths. For example, if you have a large variety of these primitives, you can always look for logical links between them. Another way is, if these primitive values are used as method parameters, we could introduce a parameter objects, or we could change arrays into objects as well. There are more forms of this code smell, but let’s tackle them one by one and see the approaches to refactoring.
Like we saw earlier, often these primitive values have some logial link between
them. Although our list of primitives is not that long, you can see the logical
Since the logical link between these two primitives and the
FinanceIO#append_author method are easily noticable, we know that these two
constants and the method should be grouped in some way. Here, we can refactor
the values with an object, which will be called
Pretty straight forward. Now, if you think about it, you can notice that the
FinanceIO#append_author is basically the signature of the journalist. So, we
can extract that method into our new
Now, having the behaviour and the values extracted to a separate class, we can
FinanceIO a bit:
Much cleaner, right? We have the values and the behaviour extracted to a
separate class, and the
Author#signature method clearly shows that the
behaviour revolving the author signature should be a part of the
Now, since the first name, last name and email are set to default parameters on
Author#initialize method, we could go a step further and extract them to
Having this class, we can easily customize our
FinanceIO stream and use a
non-default author just by tweaking the constructor by using dependency
Replacing the array with… something meaningful
When we got to this step I assume you were wondering what we will do with the
CURRENCIES two-dimensional array. Well, let’s disect the array a bit and see
what we are actually dealing with.
Well, just like the array name states, we are dealing with a bunch of
currencies here, ergo, we need a
Currency class to encapsulate the
abbreviation and the symbol:
So, how can we eliminate the array from the
FinanceIO class and move it
somewhere else? Let’s start moving the array step by step - first, by
extracting it into the
Currency class. We will move the collection and we
will also add a utility method
Currency.all which will return a collection of
all the currencies, as
Now, refactoring the
FinanceIO#puts method should be straight forward:
Sure, this could work, but we still have a bit of polution in the
class. What we could do, instead of keeping the currencies as a constant, we
could move them to a method, or better yet, move them to a new class, with a
method that would return all of them:
Then, adding this class as a depenency to the
Currency class will be pretty
easy. But also, uncoupling it will be also easy, if we decide to move these
hardcoded primitive values to a database:
With this step we have encapsulated our collection of data, and made it easy to
replace in next iteration of our program. Our
Currency class only depends on
a class method, which would return a two-dimensional array with the currencies
data. With this in place, we can come back to the
FinanceIO class and polish
Introducing a parameter object
Let’s look at the state of our
FinanceIO class now. The only leftovers from
our refactoring are the
MIN_WORDS primitives. Their purpose
is to limit the length of the text that the author can write. Let’s see how we
can refactor it out of the class.
If you look at the
FinanceIO#puts method, you can notice that these constants
are used only as a parameter, to check if the length words are within the
Having this in mind, we can extract a Parameter Object. Parameter objects
are usually like containers for data that has same or similar role. Our
limitation primitives are a nice example - they do not belong to the
FinanceIO class, but their only purpose is to validate the input of the user.
Let’s extract them out to a separate class:
As you can see, by extracting this parameter object, we are even able to extract the logic which will validate the text of the journalist. Now, let’s use our newly created class in the program:
As you can see, although there’s still a lot going on in the
method, it’s much easier to understand what’s going on. Also, if we would write
tests around this code, testing would be much easier.
We could do more refactoring here, especially in the
but for the scope of this article we will stop there. I encourage you to carry
on with refactoring the code, and writing some tests - practice makes it
perfect, doesn’t it?
Until next time!