I recently switched to working with Go as one of the primary languages that I work with on a day to day basis. I’m still getting the hang of it, but so far I’m enjoying it. I love the set of standard tooling that comes with the language. One of the tools, which is usually not included with a language’s standard toolkit is a performance benchmarking tool. Go includes this in it’s go test
tool and can be used to go basic performance testing.
I started out with a mini-project to test this out. You can find a basic set of instructions on how to start off with here. However, there is nothing like a real mini-project to really get the hang of it and appreciate it. Hence, I am going to use the benchmarking tools to test the efficiency of an idea I had to speed up a very trivial thing that us programmers do on a daily basis.
The Project
Problem: What’s the most efficient way to get the first element from a list, which could be empty
Seems pretty simple. There are a couple of ways to do this. I can image many people just thinking that there is only 1, which is the 1st one.
- Check of the array is empty or null, if not, then simply return the 0 indexed value.
- Iterate over the array, return the value. The loop will only run once, returning the first element.
I liked 2 for it’s simplicity. It also looked like there were less operations going on since we’re skipping the check on the length or the fact that it’s null. However, list comprehension in modern languages also make list traversal very simple, but just cause they look simple, doesn’t mean they’re efficient! Hence, what better thing to test than this!
Setting Up
I am assuming you already have you Go development environment setup, with a workspace which the GOPATH
environment variable points to. For details on how to do so, see here.
Let’s first start by creating a package that will contain our code that we will be testing and benchmarking. To mimic what I would be doing in reality, I created an arbitrary struct FullName
. Hence, we’ll be working with a list of structs rather than primitive kinds. I’ve also implemented two funcitons on FullName
to pull out the first element in the two different ways that we discussed before.
File: $GOPATH/src/github.com/danishm/firstelement/firstelement.go
To test and benchmark this code, we will create another file with sample data, test functions and benchmarking functions. This filename must have _test
in it. In this way Go uses convention over config to make things easy and get you up and going quicker.
File: $GOPATH/src/github.com/danishm/firstelement/firstelement_test.go
Testing and Benchmarking
Once we have these two files in place, we can use the go test -bench . -v
command to test and benchmark. The command will run the unit tests first to help us confirm our implementation and then the benchmarks. Below is the output of the command
This speaks pretty loudly, the range
loop based approach is roughly 12x slower! I was suspecting it be slow, but this was significant. The loop may look classy, but it’s slow. On top of that, it is also known to look weird to a lot of developers.
However, I want to keep on digging in. What if we have a traditional for
loop?
Re-Factoring The Loop
Although the range
based loop looks nice and clean, I wanted to benchmark using a traditional C/C++ style loop with no iterators. Hopefully, this will perform better. I added the following functions to each of our files.
Add to File: $GOPATH/src/github.com/danishm/firstelement/firstelement.go
Add to File: $GOPATH/src/github.com/danishm/firstelement/firstelement_test.go
Running the benchmark using these new functions yeilded some interesting results
The traditional for
loop approach is just as fast as the length check and 0-index approach. This just show how much baggage the range
keyword comes with. As you can see here, it provides a consistent way to iterate over an array, slice, string, map, or channel. This makes for readable and maintainable code, but it comes at a cost.
I decided to go deeper!
What If We Used Pointers
I created three new versions of the implementation and test function appended with P
, which were the pointer based versions of the original functions. You can see the complete code at this github repository. Here are the results of the run
This yields something interesting as well. The pointer version of BenchmarkGetFirstRangeLoop
runs around 1.3x faster than the non-pointer version! I believe this is due to the fact that with pointers, you don’t have to copy the whole data structure when you return it. You’re simply passing back the pointer to it, which probably saved some time.
I hope you enjoyed reading this and are encouraged to benchmark and test your code more diligently.