I recently switched to working with Go as one of the primary languages that I work with on a day to day basis. I’m still getting the hang of it, but so far I’m enjoying it. I love the set of standard tooling that comes with the language. One of the tools, which is usually not included with a language’s standard toolkit is a performance benchmarking tool. Go includes this in it’s go test tool and can be used to go basic performance testing.

I started out with a mini-project to test this out. You can find a basic set of instructions on how to start off with here. However, there is nothing like a real mini-project to really get the hang of it and appreciate it. Hence, I am going to use the benchmarking tools to test the efficiency of an idea I had to speed up a very trivial thing that us programmers do on a daily basis.

The Project

Problem: What’s the most efficient way to get the first element from a list, which could be empty

Seems pretty simple. There are a couple of ways to do this. I can image many people just thinking that there is only 1, which is the 1st one.

  1. Check of the array is empty or null, if not, then simply return the 0 indexed value.
  2. Iterate over the array, return the value. The loop will only run once, returning the first element.

I liked 2 for it’s simplicity. It also looked like there were less operations going on since we’re skipping the check on the length or the fact that it’s null. However, list comprehension in modern languages also make list traversal very simple, but just cause they look simple, doesn’t mean they’re efficient! Hence, what better thing to test than this!

Setting Up

I am assuming you already have you Go development environment setup, with a workspace which the GOPATH environment variable points to. For details on how to do so, see here.

Let’s first start by creating a package that will contain our code that we will be testing and benchmarking. To mimic what I would be doing in reality, I created an arbitrary struct FullName. Hence, we’ll be working with a list of structs rather than primitive kinds. I’ve also implemented two funcitons on FullName to pull out the first element in the two different ways that we discussed before.

File: $GOPATH/src/github.com/danishm/firstelement/firstelement.go

To test and benchmark this code, we will create another file with sample data, test functions and benchmarking functions. This filename must have _test in it. In this way Go uses convention over config to make things easy and get you up and going quicker.

File: $GOPATH/src/github.com/danishm/firstelement/firstelement_test.go

Testing and Benchmarking

Once we have these two files in place, we can use the go test -bench . -v command to test and benchmark. The command will run the unit tests first to help us confirm our implementation and then the benchmarks. Below is the output of the command

This speaks pretty loudly, the range loop based approach is roughly 12x slower! I was suspecting it be slow, but this was significant. The loop may look classy, but it’s slow. On top of that, it is also known to look weird to a lot of developers.

However, I want to keep on digging in. What if we have a traditional for loop?

Re-Factoring The Loop

Although the range based loop looks nice and clean, I wanted to benchmark using a traditional C/C++ style loop with no iterators. Hopefully, this will perform better. I added the following functions to each of our files.

Add to File: $GOPATH/src/github.com/danishm/firstelement/firstelement.go

Add to File: $GOPATH/src/github.com/danishm/firstelement/firstelement_test.go

Running the benchmark using these new functions yeilded some interesting results

The traditional for loop approach is just as fast as the length check and 0-index approach. This just show how much baggage the range keyword comes with. As you can see here, it provides a consistent way to iterate over an array, slice, string, map, or channel. This makes for readable and maintainable code, but it comes at a cost.

I decided to go deeper!

What If We Used Pointers

I created three new versions of the implementation and test function appended with P, which were the pointer based versions of the original functions. You can see the complete code at this github repository. Here are the results of the run

This yields something interesting as well. The pointer version of BenchmarkGetFirstRangeLoop runs around 1.3x faster than the non-pointer version! I believe this is due to the fact that with pointers, you don’t have to copy the whole data structure when you return it. You’re simply passing back the pointer to it, which probably saved some time.

I hope you enjoyed reading this and are encouraged to benchmark and test your code more diligently.