Refactoring in Ruby: Primitive Obsession
Table of Contents
We’ve all been at this point where we have bloated our classes with primitive values all over the place. Usually, we drop in primitive constants that, for whatever reason, we think that are a good fit to the class. Or sometimes, we just dump primitive values instead of small objects, thinking “it’s okay, it’s just an attribute in the class”. But, does it always make sense?
The problem #
Say we have a project for a finance journalist who wants us to automate his text editor to do some random fixes to his texts, on the fly. The automation is basically adding his name and date to the footer of the text that he’s writing, and to change currencies abbreviations to symbols. The good thing for us is that his editor is written in Ruby, and has the ability to add custom I/O streams to it’s I/O stack.
So, our specification for this I/O stream, is:
- limit text to no more than 10000 words
- append journalist name to end of article
- swap popular currency names abbreviations with symbols
After a bit of work, we come up with this quick hack for a I/O stream:
require 'stringio'
class FinanceIO
class MaximumLengthExceeded < StandardError; end
MAX_WORDS = 10000
MIN_WORDS = 2000
AUTHOR_NAME = "John Smith"
AUTHOR_EMAIL = "[email protected]"
CURRENCIES = [
["USD", "$"],
["GBP", "£"],
["JPY", "¥"],
["CHF", "₣"],
["EUR", "€"],
["INR", "₹"],
["GEN", "¤"] # General currency sign
]
def initialize
@io = StringIO.new
end
def puts(arg)
if @io.string.split("").length >= MAX_WORDS || @io.string.split("").length =< MIN_WORDS
raise TextLengthExceeded.new("The text must be within #{MIN_WORDS} and #{MAX_WORDS} words.")
else
CURRENCIES.each do |abbrv, symbol|
arg.gsub!(abbrv,symbol)
end
@io.puts(arg)
end
end
def on_exit
append_author
end
def method_missing(method_name, *args, &block)
if @io.respond_to?(method_name)
@io.send(method_name, *args)
else
super(method_name, *args, &block)
end
end
private
def append_author
@io.puts("\n")
@io.puts(AUTHOR_NAME)
@io.puts("\n")
@io.puts(AUTHOR_EMAIL)
@io.puts("\n")
end
end
The FinanceIO
class is a wrapper around a StringIO
, which overrides a
couple of methods, but relays any other method calls to the wrapped StringIO
object. This allows us to do our work easily, while not breaking when any
object of the I/O stack tries to call a built-in method on our object.
Let’s break down the example. The MAX_WORDS
and MIN_WORDS
is a limitation
that our client (the journalist) has been given by the newspaper he works for,
or in other words, his articles cannot be loger than 10000 words. The
AUTHOR_NAME
constant is the name of our author. We don’t need to enable
customization for this piece of data, because our client wanted to make our
life easier by agreeing that he will not change his name any time soon. The
AUTHOR_EMAIL
constant is the author public email address. Next, the
CURRENCIES
constant is a two-dimensional array which contains the
abbreviations and the symbols of the most popular currencies that the
journalist uses while writing his finance articles.
Most of the functionality described in the spec is done within the
FinanceIO#puts
method:
def puts(arg)
if @io.string.split("").length >= MAX_WORDS || @io.string.split("").length =< MIN_WORDS
raise TextLengthExceeded.new("The text must be within #{@limitations.min_words} and #{@limitations.max_words} words.")
else
CURRENCIES.each do |abbrv, symbol|
arg.gsub!(abbrv,symbol)
end
@io.puts(arg)
end
end
The puts
method takes a string as an argument, and saves it to the stream.
But, before it does that, it checks if the text is longer than 10000 words. If
it is, it will raise an exception that the text editor will handle gracefully.
If not, it will carry on to substitute any of the currency abbreviations in the
text with the corresponding symbol.
Also, the FinanceIO#on_exit
is a method which will append the author name
in the footer of the text.
def on_exit
append_author
end
This will get automatically invoked by the editor, as an callback before the journalist closes his editor.
This will surely work, and it complies with our imaginary text editor. Sure, it might be a bit dirty, but it does the trick. Now, a question to ourselves - what should we do about the data in the constants? There seems to be much behaviour in connection with these constants. Aside of that, it’s just plain data and it doesn’t really add too much noise to our class and the functionality is easy to follow… right?
The Code Smell #
Well, sure, the fields don’t do much noise, but this is exactly the way of
thinking that creates these code smells. You could argue that the FinanceIO
class is pretty simple to follow, and you would be right. The problem lies in
the practice, which will eventually bloat the class with so much “static” data,
to a point of total confusion. Sure, our journalist could be happy with our
hack, which is great, but he will surely come back one day with more specs for
customizations. That’s what usually happy customers do - they return for more
stuff.
Although this code smell is not that dangerous at first, it will cause havoc in the near future. The Primitive Obsession code smell is one of the simplest ones. It can be spotted in classes where primitives are used to “simulate” types. This means that instead of separating a data type for the primitives, we keep some sort of a primitive value, whether it’s a string, or a number or an array of any values. As a next step, we usually add simple names to these primitive values (see our class above), and we keep on thinking that it is very clear what these values do.
Another occurance of this code smell is creating a “field simulation”. This happens when we add an array of primitive values to the class, which contains some data that is used in the class logic, somewhere.
Sure, three to five values are easy to track. What happens when you get to twenty? Thirty maybe? Yeah, I have seen some classes where these primitive values are all over the place, and let me tell you, it’s no fun!
So, how can we avoid and/or detect this code smell? Let’s see!
Refactoring #
When it comes to refactoring this code smell, you can take multiple paths. For example, if you have a large variety of these primitives, you can always look for logical links between them. Another way is, if these primitive values are used as method parameters, we could introduce a parameter objects, or we could change arrays into objects as well. There are more forms of this code smell, but let’s tackle them one by one and see the approaches to refactoring.
Logical linking #
Like we saw earlier, often these primitive values have some logial link between
them. Although our list of primitives is not that long, you can see the logical
link between AUTHOR_NAME
and AUTHOR_EMAIL
, right?
Since the logical link between these two primitives and the
FinanceIO#append_author
method are easily noticable, we know that these two
constants and the method should be grouped in some way. Here, we can refactor
the values with an object, which will be called Author
:
class Author
attr_reader :first_name, :last_name, :email
def initialize(first_name = "John", last_name = "Smith", email = "[email protected]")
@first_name = first_name
@last_name = last_name
@email = email
end
def full_name
last_name + " " + first_name
end
end
Pretty straight forward. Now, if you think about it, you can notice that the
FinanceIO#append_author
is basically the signature of the journalist. So, we
can extract that method into our new Author
class:
def signature
%Q{
#{full_name}
#{email}
}
end
Now, having the behaviour and the values extracted to a separate class, we can
refactor FinanceIO
a bit:
require 'stringio'
class FinanceIO
class MaximumLengthExceeded < StandardError; end
MAX_WORDS = 10000
MIN_WORDS = 2000
CURRENCIES = [
["USD", "$"],
["GBP", "£"],
["JPY", "¥"],
["CHF", "₣"],
["EUR", "€"],
["INR", "₹"],
["GEN", "¤"] # General currency sign
]
def initialize
@io = StringIO.new
@author = Author.new
end
def puts(arg)
if @io.string.split("").length >= MAX_WORDS || @io.string.split("").length =< MIN_WORDS
raise TextLengthExceeded.new("The text must be within #{MIN_WORDS} and #{MAX_WORDS} words.")
else
CURRENCIES.each do |abbrv, symbol|
arg.gsub!(abbrv,symbol)
end
@io.puts(arg)
end
end
def on_exit
@io.puts(@author.signature)
end
def method_missing(method_name, *args, &block)
if @io.respond_to?(method_name)
@io.send(method_name, *args)
else
super(method_name, *args, &block)
end
end
end
Much cleaner, right? We have the values and the behaviour extracted to a
separate class, and the Author#signature
method clearly shows that the
behaviour revolving the author signature should be a part of the Author
class.
Now, since the first name, last name and email are set to default parameters on
the Author#initialize
method, we could go a step further and extract them to
a DefaultAuthor
subclass:
class DefaultAuthor < Author
def initialize
super("John", "Smith", "[email protected]")
end
end
Having this class, we can easily customize our FinanceIO
stream and use a
non-default author just by tweaking the constructor by using dependency
injection:
class FinanceIO
# snipped..
def initialize(author = DefaultAuthor.new)
@io = StringIO.new
@author = author
end
# snipped..
end
Replacing the array with… something meaningful #
When we got to this step I assume you were wondering what we will do with the
CURRENCIES
two-dimensional array. Well, let’s disect the array a bit and see
what we are actually dealing with.
Well, just like the array name states, we are dealing with a bunch of
currencies here, ergo, we need a Currency
class to encapsulate the
abbreviation and the symbol:
class Currency
attr_reader :abbreviation, :symbol
def initialize(abbreviation, symbol)
@abbreviation = abbreviation
@symbol = symbol
end
end
So, how can we eliminate the array from the FinanceIO
class and move it
somewhere else? Let’s start moving the array step by step - first, by
extracting it into the Currency
class. We will move the collection and we
will also add a utility method Currency.all
which will return a collection of
all the currencies, as Currency
objects:
class Currency
DEFAULT_CURRENCIES = [
["USD", "$"],
["GBP", "£"],
["JPY", "¥"],
["CHF", "₣"],
["EUR", "€"],
["INR", "₹"],
["GEN", "¤"]
]
def self.all(currencies_list = DEFAULT_CURRENCIES)
currencies_list.collect do |abbr, sym|
new(abbr, sym)
end
end
attr_reader :abbreviation, :symbol
def initialize(abbreviation, symbol)
@abbreviation = abbreviation
@symbol = symbol
end
end
Now, refactoring the FinanceIO#puts
method should be straight forward:
def puts(arg)
if @io.string.split("").length >= MAX_WORDS || @io.string.split("").length =< MIN_WORDS
return MaximumLengthExceeded.new("You've reached the maximum length.")
else
Currency.all.each do |currency|
arg.gsub!(currency.abbreviation, currency.symbol)
end
@io.puts(arg)
end
end
Sure, this could work, but we still have a bit of polution in the Currency
class. What we could do, instead of keeping the currencies as a constant, we
could move them to a method, or better yet, move them to a new class, with a
method that would return all of them:
class DefaultCurrencies
def self.all
[
["USD", "$"],
["GBP", "£"],
["JPY", "¥"],
["CHF", "₣"],
["EUR", "€"],
["INR", "₹"],
["GEN", "¤"]
]
end
end
Then, adding this class as a depenency to the Currency
class will be pretty
easy. But also, uncoupling it will be also easy, if we decide to move these
hardcoded primitive values to a database:
class Currency
def self.all
DefaultCurrencies.all.collect do |abbr, sym|
new(abbr, sym)
end
end
attr_reader :abbreviation, :symbol
def initialize(abbreviation, symbol)
@abbreviation = abbreviation
@symbol = symbol
end
end
With this step we have encapsulated our collection of data, and made it easy to
replace in next iteration of our program. Our Currency
class only depends on
a class method, which would return a two-dimensional array with the currencies
data. With this in place, we can come back to the FinanceIO
class and polish
it up.
Introducing a parameter object #
Let’s look at the state of our FinanceIO
class now. The only leftovers from
our refactoring are the MAX_WORDS
and MIN_WORDS
primitives. Their purpose
is to limit the length of the text that the author can write. Let’s see how we
can refactor it out of the class.
If you look at the FinanceIO#puts
method, you can notice that these constants
are used only as a parameter, to check if the length words are within the
limit.
Having this in mind, we can extract a Parameter Object. Parameter objects
are usually like containers for data that has same or similar role. Our
limitation primitives are a nice example - they do not belong to the
FinanceIO
class, but their only purpose is to validate the input of the user.
Let’s extract them out to a separate class:
class EditorialLimitations
attr_reader :max_words, :min_words
def initialize
@max_words = 10000
@min_words = 2000
end
def within?(text)
text_length = words(text)
text_length >= @min_words && text_length =< @max_words
end
private
def words(txt)
txt.split("").length
end
end
As you can see, by extracting this parameter object, we are even able to extract the logic which will validate the text of the journalist. Now, let’s use our newly created class in the program:
require 'stringio'
class FinanceIO
class MaximumLengthExceeded < StandardError; end
def initialize
@io = StringIO.new
@author = Author.new
@limitations = EditorialLimitations.new
end
def puts(arg)
if @limitations.within?(io.string)
Currency.all.each do |currency|
arg.gsub!(currency.abbreviation, currency.symbol)
end
else
raise TextLengthExceeded.new("The text must be within #{@limitations.min_words} and #{@limitations.max_words} words.")
end
@io.puts(arg)
end
def on_exit
@io.puts(@author.signature)
end
def method_missing(method_name, *args, &block)
if @io.respond_to?(method_name)
@io.send(method_name, *args)
else
super(method_name, *args, &block)
end
end
end
As you can see, although there’s still a lot going on in the FinanceIO#puts
method, it’s much easier to understand what’s going on. Also, if we would write
tests around this code, testing would be much easier.
Outro #
We could do more refactoring here, especially in the FinanceIO#puts
method,
but for the scope of this article we will stop there. I encourage you to carry
on with refactoring the code, and writing some tests - practice makes it
perfect, doesn’t it?
Until next time!