Two Dumb Ruby Mistakes

April 2nd 2016

Coding is like climbing: You need equipment
that will catch you when you make a mistake.
(source: Elke Wetzig via Wikimedia Commons)

Most Ruby blog posts show you examples of code you should write: how to solve a certain problem, how to use some library or gem, how Ruby itself works. But today I decided to write about a few dumb mistakes I’ve made using Ruby recently. Read on to see two things you should not do with the Ruby language, for a change.

The depressing thing about this is that I made these dumb mistakes just in the past few weeks! I’ve been using Ruby professionally every day for eight years, I’ve researched and written about Ruby in my spare time as a hobby, and I still make dumb mistakes with the language all the time!

Coding is like climbing: Sooner or later we all make mistakes and fall. What you need to do is plan on this happening and use the appropriate equipment to avoid disaster. Climbers use carabiners, ropes and harnesses to catch them when they fall. Developers should use a language that will catch them when they make dumb mistakes.

Searching For An Array Element

Let’s start with some test data. Here’s an array of Person objects, each with a first name, last name and an insult count:

Person = Struct.new(:first_name, :last_name, :insults)
candidates = [
  Person.new('Ted', 'Cruz', 432),
  Person.new('Donald', 'Trump', 892),
  Person.new('Marco', 'Rubio', 321)
]

A couple of weeks ago (using different data of course) I wrote this line of code to search for a specific element in the array:
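Here is a reconstruction of that line, inferred from the explanation that follows; the single = inside the block is the mistake. The struct and array are repeated only so the snippet runs on its own:

```ruby
Person = Struct.new(:first_name, :last_name, :insults)
candidates = [
  Person.new('Ted', 'Cruz', 432),
  Person.new('Donald', 'Trump', 892),
  Person.new('Marco', 'Rubio', 321)
]

# BUG: a single = assigns to first_name instead of comparing it.
marco = candidates.find { |person| person.first_name = 'Marco' }
```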

What I wanted was the first person in the array named “Marco.” Instead, when I ran the code, what I got was the first element of the array, but with its first name set to “Marco”:

p marco
=> #<struct Person first_name="Marco", last_name="Cruz", insults=432>

Of course, I should have known better. The proper line of code is:
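A reconstruction of the corrected line, inferred from the surrounding text, with == in place of =. The test data is repeated so the snippet is self-contained:

```ruby
Person = Struct.new(:first_name, :last_name, :insults)
candidates = [
  Person.new('Ted', 'Cruz', 432),
  Person.new('Donald', 'Trump', 892),
  Person.new('Marco', 'Rubio', 321)
]

# == compares and returns true or false; no element is modified.
marco = candidates.find { |person| person.first_name == 'Marco' }
p marco
```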

I should have used == instead of =. What a dumb mistake. I can’t believe I wrote this code; how embarrassing! I’m sure you all saw the problem right away, and maybe a few of you have made the same mistake before. But let’s walk through what happened when I ran the incorrect code, just to be sure we thoroughly understand the problem.

Ruby started with the candidates array, and called the find method on it:

The find method is actually a member of the Enumerable module, which Ruby includes automatically into the Array class. When find ran, it iterated over the elements of the array and called the block I provided, passing in each element. The first element was the “Ted Cruz” person object:
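As a rough illustration of that iteration, this sketch records which elements find passes to the block. The visited array is a name made up for this example, and the block uses the intended == comparison so that nothing gets renamed:

```ruby
Person = Struct.new(:first_name, :last_name, :insults)
candidates = [
  Person.new('Ted', 'Cruz', 432),
  Person.new('Donald', 'Trump', 892),
  Person.new('Marco', 'Rubio', 321)
]

# Array mixes in the Enumerable module.
array_is_enumerable = Array.include?(Enumerable)
p array_is_enumerable  # true

# find yields the elements one at a time, in order, and stops
# as soon as the block returns a truthy value.
visited = []
found = candidates.find do |person|
  visited << person.first_name
  person.first_name == 'Donald'
end

p visited          # ["Ted", "Donald"]
p found.last_name  # "Trump"
```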

Now the block executed. And my dumb mistake came into play. What I intended was for the block to return whether or not the first name of the given person was equal to “Marco.” If the first name was “Marco” then Person#first_name == "Marco" would return true, the block would return true and Enumerable#find would return the target person. In this case, "Ted" is not "Marco" so the block would return false.

But my block didn’t check whether the person is named “Marco;” instead, it called the Person#first_name= method, setting the person’s name to “Marco!”

And now, to make matters worse, the block returned the value returned by Person#first_name=, which was the string “Marco,” the new value of the first name attribute. Because Ruby considered “Marco” to be truthy, Enumerable#find returned the first person, even though that person was originally named Ted Cruz. My surrounding code now thinks it found Marco Rubio, but instead has Ted Cruz, renamed to Marco Cruz. What a mess.
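The first pass through the buggy block can be replayed on its own. This is a minimal sketch; block_result is a name invented here for whatever the block hands back to find:

```ruby
Person = Struct.new(:first_name, :last_name, :insults)
ted = Person.new('Ted', 'Cruz', 432)

# What the buggy block evaluated for the first element:
# an assignment, whose value is the string being assigned.
block_result = (ted.first_name = 'Marco')

p block_result    # "Marco"
p ted.first_name  # "Marco" (Ted has been renamed)

# find only cares whether the result is truthy, and a string is.
treated_as_match = block_result ? true : false
p treated_as_match  # true
```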

Why Didn’t Ruby Tell Me Something Was Wrong?

As a developer, you’re always just one
keystroke away from falling off a cliff.
(source: DecafGrub47393 via Wikimedia Commons)

Think about this for a moment: I used the find method, which called a block and expected that block to return true or false. But my block returned neither true nor false. It returned “Marco.”

Why didn’t Ruby issue some sort of error or warning in this case? Yes, I understand that Ruby considers all values, except for false and nil, to be equivalent to true. In fact, Ruby developers quite often take advantage of this fact to write more concise, readable code: We can write if value instead of spelling out if value != false && value != nil.
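A quick sketch of that rule, using a handful of arbitrary sample values: only false and nil are filtered out when a block's result is treated as a condition.

```ruby
values = [0, '', [], 'Marco', false, nil]

# select keeps every element for which the block result is truthy.
truthy = values.select { |value| value }

p truthy  # [0, "", [], "Marco"]
```

Zero, the empty string and the empty array all count as true in Ruby, which is why the string returned by my block was accepted without complaint.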

But in this case, Ruby’s silence allowed my simple coding mistake to become a serious problem. If Ruby had given me some sort of warning or error the first time I ran this code, I would have found the problem and fixed it in 5 seconds. Instead, this code ran for weeks and failed every single time, and I had no idea.

When I fell, Ruby didn’t catch me; it allowed me to fall off the cliff!

Update: Erik Michaels-Ober pointed out today on Twitter that if you always put the variable on the right and the constant on the left, for example like this:

marco = candidates.find { |person| 'Marco' = person.first_name }

…then Ruby will immediately report a syntax error and tell you where the problem was if you ever confuse = with ==. Joshua Ballanco told us that this style of putting the constant before the variable is known as a Yoda condition.
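One way to see this without crashing a script is to hand each version to eval as a string and watch for SyntaxError. The syntax_error? helper below is hypothetical, written just for this illustration; wrapping the source in proc { } means the code is only parsed, never run:

```ruby
# Hypothetical helper: returns true if Ruby refuses to parse the source.
def syntax_error?(source)
  eval("proc { #{source} }") # parsed but never called
  false
rescue SyntaxError
  true
end

variable_first = "candidates.find { |person| person.first_name = 'Marco' }"
yoda_typo      = "candidates.find { |person| 'Marco' = person.first_name }"
yoda_correct   = "candidates.find { |person| 'Marco' == person.first_name }"

p syntax_error?(variable_first)  # false: the typo parses as a valid assignment
p syntax_error?(yoda_typo)       # true: a string literal cannot be assigned to
p syntax_error?(yoda_correct)    # false
```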

Finding The Maximum Value in an Array

We all have a bad day from time to time. After making that mistake I just continued to work on my project, trying harder not to make any more dumb mistakes. It was my fault, I thought. I just needed to be a better programmer.

But of course, it happened again! I made another dumb Ruby mistake just a few days later. This time I wanted to sort the same array. Specifically, I wanted to find the array element with the maximum value for some attribute. I was using different data, of course, but we can translate the problem to our candidate data set easily.

Suppose I wanted to find the candidate with the maximum number of insults. Easy, right? Here’s the line of code I wrote:
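A reconstruction of that line, inferred from the explanation that follows. The test data is repeated so the snippet is self-contained:

```ruby
Person = Struct.new(:first_name, :last_name, :insults)
candidates = [
  Person.new('Ted', 'Cruz', 432),
  Person.new('Donald', 'Trump', 892),
  Person.new('Marco', 'Rubio', 321)
]

# BUG: max expects a two-argument comparison block,
# not a block that returns an attribute.
most_insulting = candidates.max { |person| person.insults }
```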

Can you spot the problem here? When I run that code I don’t get Donald Trump, who has the most insults. Instead, I get:

p most_insulting
=> #<struct Person first_name="Marco", last_name="Rubio", insults=321>

Again a simple, dumb mistake. I should have called max_by, instead of max. Here’s the correct code:
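A reconstruction of the corrected line, inferred from the surrounding text. The test data is repeated so the snippet is self-contained:

```ruby
Person = Struct.new(:first_name, :last_name, :insults)
candidates = [
  Person.new('Ted', 'Cruz', 432),
  Person.new('Donald', 'Trump', 892),
  Person.new('Marco', 'Rubio', 321)
]

# max_by calls the block once per element and compares the results.
most_insulting = candidates.max_by { |person| person.insults }
p most_insulting
```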

Enumerable#max_by does what I thought Enumerable#max would do: It compares the values returned by the block, and then returns the object corresponding to the maximum value. This is only slightly less embarrassing than my first dumb mistake. Almost all modern programming languages use == and = for equality vs. assignment. There’s no excuse for making that mistake: It was just dumb.

The difference between max and max_by is not quite as obvious. But again, I’ve been using Ruby for 8 years now. I should know better! I’m just a bad Ruby developer. But before we blame this mistake entirely on me, let’s take a closer look at what actually happened when I ran my bad code. Let’s step through what Enumerable#max did, just as we did before with Enumerable#find.

Again Ruby started by calling Enumerable#max on the candidates array:

And again, just like find, max iterates over the array elements. However, instead of passing each person to the block one at a time, it actually passes the array elements in pairs:

Why did Ruby pass two Person objects to my block? Enumerable#max searches for the array element, not the return value of a block, that has the maximum value. It assumes that the values in the array can be compared, that they have a natural sort order. Enumerable#max is perfect for an array of integers or an array of strings. Ruby can compare them automatically and return the largest one, the element that would come last if the array were sorted.
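For values that already know how to compare themselves, no block is needed. A small sketch, using the insult counts and first names from the test data as plain values:

```ruby
insult_counts = [432, 892, 321]
first_names   = %w[Ted Donald Marco]

largest_count = insult_counts.max
last_name_alphabetically = first_names.max

p largest_count             # 892
p last_name_alphabetically  # "Ted"
```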

Additionally, Ruby allows you to use max when the array elements can’t be sorted automatically, when you have an array of objects, like my Person structures. Because Ruby doesn’t know whether one person is greater or less than another, it allows you to pass a block to max that answers that question. The block should accept two arguments and return one of three numeric values: -1, 0 or 1:

-1 if the first value is less than the second (they are in ascending order)
0 if they are the same, at least in terms of their sort order, and
1 if the first value is greater than the second (they are in descending order)
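A block that follows this contract can be written out longhand. This is only a sketch of the idea, with the test data repeated so it runs on its own:

```ruby
Person = Struct.new(:first_name, :last_name, :insults)
candidates = [
  Person.new('Ted', 'Cruz', 432),
  Person.new('Donald', 'Trump', 892),
  Person.new('Marco', 'Rubio', 321)
]

# The block receives two people and reports their relative order.
most_insulting = candidates.max do |first, second|
  if first.insults < second.insults
    -1
  elsif first.insults > second.insults
    1
  else
    0
  end
end

p most_insulting.first_name  # "Donald"
```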

So what happened here was that because I called Enumerable#max and provided a block, Ruby assumed my block was there to determine the sort order of the Person objects, not to return an attribute for each one.

As you probably know, Ruby makes our lives easier by providing the “space ship” operator, <=>, that compares two values and returns this sort order number: -1, 0 or 1. The correct way to find the most insulting candidate using max would be to compare the two values of Person#insults using <=>:

most_insulting = candidates.max { |person1, person2| person1.insults <=> person2.insults }
p most_insulting
=> #<struct Person first_name="Donald", last_name="Trump", insults=892>

Why Didn’t Ruby Tell Me Something Was Wrong?

I knew all about the space ship operator and sort order blocks, but for whatever reason in the moment I typed in my bad code I just forgot. Maybe I was in a rush, maybe I was just tired. Maybe I really thought I typed max_by but somehow the “_by” part just didn’t leave my brain and make it to the keyboard.

But Ruby knew I should have used max_by, or at least that I should have accepted two parameters in my block. Why didn’t it tell me?

That is, my block expected only one argument, not two. I wrote:

{|person| etc…}

and not:

{|person1, person2| etc… }

Why didn’t Ruby complain when it tried to pass two objects, but my block only accepted one? It turns out that when you pass extra arguments to a block, Ruby silently ignores them. Note: Ruby does check the number of arguments when you explicitly create a lambda using lambda {} or ->() and then call it using the Proc#call method. But 99% of the time Ruby developers use blocks in the standard, default manner and don’t create Proc objects explicitly.
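The difference is easy to reproduce. In this sketch, yield_pair is a made-up method that yields two values, the way max does:

```ruby
# Hypothetical method that yields two values to its block.
def yield_pair
  yield :first, :second
end

# A plain block with one parameter silently drops the second value.
received = yield_pair { |only| only }
p received  # :first

# A lambda called with the wrong number of arguments raises instead.
strict = ->(only) { only }
lambda_complained =
  begin
    strict.call(:first, :second)
    false
  rescue ArgumentError
    true
  end
p lambda_complained  # true
```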

Ruby could have told me something was wrong by displaying a warning or an error message, maybe: “wrong number of arguments (2 for 1) (ArgumentError).” But instead, it remained silent. It assumed that I just didn’t need that second block argument, that I wanted to keep my code simpler and easier to read, and conveniently allowed me to leave it out of the block’s argument list. Ruby assumed I was a smart, experienced developer who doesn’t make dumb mistakes like this. Ruby was so wrong!

What happened next? Ruby continued to run my block, and things got really ugly. Take another look at the block’s code:

{|person| person.insults}

It returns the insult count for the given person - a number! Next Ruby interpreted the numerical value my block returned, 432, 892 or 321, as the sort order indicator. That’s right: Ruby will accept any positive value from the sort order block, not just 1, and consider that to mean the two objects are in descending order. Similarly, it will take any negative value to mean the values are in ascending order.
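This can be checked with a block that ignores its arguments entirely. The sketch below relies on max comparing each element against the current maximum in array order, which is consistent with the result I got above; the constants 42 and -42 are arbitrary:

```ruby
numbers = [3, 1, 2]

# Always positive: every element "beats" the current maximum,
# so the last element wins.
always_positive = numbers.max { |_a, _b| 42 }

# Always negative: no element ever beats the first one.
always_negative = numbers.max { |_a, _b| -42 }

p always_positive  # 2
p always_negative  # 3
```

That is exactly what happened with my insult counts: all three were positive, so the last candidate in the array came back as the "maximum."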

Again, Ruby could have told me: “wrong type for block return value (Integer for SortOrder) (TypeError).” But, of course, Ruby isn’t a statically typed language. It doesn't check the types of method and block arguments, or their return values.

Your coding equipment should catch you
when you make a mistake and fall.
(source: Marcin Jahr via Wikimedia Commons)

Once again, Ruby erred on the side of convenience, and assumed I knew what I was doing. It conveniently allowed me to return 321 instead of 1, just in case I really wanted to return 321 without having to convert it to 1.

Our Programming Language Should Catch Our Dumb Mistakes

We actually make dumb mistakes all the time, not just once or twice a week, but probably hundreds of times every day. Every time we misspell a keyword, forget a method argument, or use an API the wrong way we have made a mistake. But we don’t think of these mistakes as mistakes - they are just how we work as humans. When we type, we usually press the backspace key quite often. When we use an API or run shell commands we have to check the documentation or StackOverflow to remind ourselves what arguments or options to use.

And usually our programming language, whether it’s Ruby or something else, finds our mistakes immediately and tells us about them with a syntax error message. We correct the mistake within seconds and continue coding, climbing higher and higher up the cliff. But in my two examples the mistakes, unfortunately, weren’t apparent immediately. This incorrect code ran for weeks before I discovered the problem. You always want to fail fast: The worst mistakes are the ones you never notice until it’s too late.

But why didn’t I discover these mistakes sooner by running tests? Don’t I use TDD? Don’t I at least write tests to check my code after I’ve written it? Yes. But in my actual project, these mistakes were part of my test code. They allowed my tests to pass, but caused them to return a false positive result. My tests were green, but actually weren’t functioning at all. Tests aren't perfect. They are only as good as the code you write to implement them.

Maybe these two dumb Ruby mistakes were exactly that: mistakes Ruby made and not me. I’m only human; it’s normal for me to type in nonsense and garbage all day long into the computer. But Ruby is a programming language. Its job, its most important job, is to tell me when my code is incorrect as soon as possible. In these two examples, it was the Ruby language itself that made the dumb mistake. The bugs weren’t in my code, they were in the language itself.

Of course, I could just switch to a statically typed language, like Java or Go. These languages automatically check the types of arguments and return values for me. If I used Swift I could take advantage of static types and use blocks/closures. I could even use a language like Haskell where the type system is so powerful that merely by allowing my code to run with no errors, the compiler has mathematically proven my code is correct. (If this could only be true!)

But I love Ruby. It’s a joy to use. Ruby code has a very human elegance to it that I haven’t seen in other programming languages. I just wish Ruby would catch me every time I fall.