Spot The Discrepancies with Dialyzer for Erlang
Dialyzer is a great tool to validate Erlang code, but it can slow down your development process if developers constantly apply it to huge codebases, particularly if that code was never analyzed with it before.
This article is our answer to the big question: how do you start using Dialyzer in a huge project where it was never applied before?
Continuing with our series of articles about the usage of Erlang/OTP to build our real-time bidding platform, we would now like to show you how we added Dialyzer to our CI pipelines.
The main Erlang application for our Real-Time Bidding servers was created way before rebar3 existed. Performing the task equivalent to rebar3 dialyzer was not easy, and it was also very time-consuming.
We recently started using rebar3 to manage our projects and, as part of that process, we decided to finally deal with that bit of technical debt.
Of course, it was not as easy as just running Dialyzer on the code and removing all the warnings. When we started, we had approximately 1800 warnings to deal with, across our main repo and its dependencies. So, we tackled them incrementally. Let me walk you through our process…
Introduction
Before we begin, let’s talk a little bit more about Dialyzer and why it’s so important to use it.
What is Dialyzer?
Dialyzer is a discrepancy analyzer for Erlang/Elixir code. It checks your applications to find discrepancies such as definite type errors, code that has become dead or unreachable because of a programming error, and unnecessary tests, among other things.
How to run Dialyzer?
Dialyzer can be run from the command line (it's one of the several tools shipped with Erlang/OTP), but these days it's far more common to run it with rebar3, i.e. rebar3 dialyzer.
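Here is a minimal sketch of both approaches (the list of OTP applications in the PLT is just an example):

```bash
# Plain dialyzer: build a PLT for the OTP applications you depend on
# (only needed once), then analyze your sources.
dialyzer --build_plt --apps erts kernel stdlib
dialyzer --src src/*.erl

# With rebar3, both steps are handled for you:
rebar3 dialyzer
```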
Why should you use Dialyzer?
Dialyzer will not warn you about every error in your code, but it's never wrong: if Dialyzer emits a warning about something in your code, you can be sure that there is a bug there. You'll see more on that bug-finding power in the paragraphs below.
And now… let’s go back to our story!
Our Goal
To be able to consistently reduce the number of discrepancies over time without altering our development speed.
Metrics
With that goal in mind, we established a plan of attack and, since you can’t improve what you can’t measure, our first step was to instrument the number of dialyzer warnings so we could keep an eye on it and, hopefully, watch it go down to zero eventually.
We use Datadog for our real-time metrics. In what may be considered a severe misuse of this tool, we decided to write a simple bash script that uses the Datadog agent to report the number of warnings found by dialyzer. The idea was to provide a nice way to visualize our progress towards the goal stated above. But an important question popped up: how do we find the number of warnings?
The Warnings File
Luckily for us, rebar3 dialyzer generates a file with the list of warnings, called _build/${REBAR_PROFILE}/${OTP-VERSION}.dialyzer_warnings, that looks like this:
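(The excerpt below is illustrative; the module names and messages are made up, but each warning occupies a single file:line: message line like these.)

```
src/bid_handler.erl:142: Function handle_timeout/2 has no local return
src/bid_handler.erl:157: The pattern {'error', Reason} can never match the type 'ok'
src/stats.erl:23: Function unused_helper/1 will never be called
```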
So, this is what our instrumentation script looks like:
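(The snippet below is a reconstruction based on the description that follows; the exact file path handling, the DogStatsD gauge format, and the use of nc are assumptions.)

```bash
#!/usr/bin/env bash
# Count the dialyzer warnings and report them to the local Datadog agent.

WARNINGS_FILE="_build/${REBAR_PROFILE}/${OTP_VERSION}.dialyzer_warnings"

# Drop empty lines with sed, count the rest with wc,
# and clean up wc's leading spaces with tr.
COUNT=$(sed '/^$/d' "${WARNINGS_FILE}" | wc -l | tr -d ' ')

# The name of the metric, followed by the count (DogStatsD gauge format).
WARNINGS="code_quality.dialyzer.discrepancies:${COUNT}|g"

# The Datadog agent listens on UDP port 8125 on our machine.
echo -n "${WARNINGS}" | nc -u -w1 127.0.0.1 8125
```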
Let's read that backwards: the script reports the number of dialyzer discrepancies by echoing the contents of the environment variable WARNINGS to the Datadog agent listening on port 8125 on our machine. The Datadog agent in turn ships it over to Datadog servers. The contents of the WARNINGS variable are simply the name of the metric (code_quality.dialyzer.discrepancies), followed by the count. The count is determined by reading the .dialyzer_warnings file, using sed to remove empty lines, piping that to wc to count lines, and cleaning up wc's output of spaces with tr.
The Number of Warnings
We added that to our Makefile target for dialyzer, as follows:
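(A sketch of that target; the target name and the script path are assumptions, since the real Makefile is not shown here.)

```make
dialyzer:
	rebar3 dialyzer || true   # rebar3 exits non-zero when it finds warnings
	./scripts/report_dialyzer_warnings.sh
```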
But then, when we started watching that metric, we noticed something odd…
The numbers seemed right in general, but what about those odd spikes every now and then? Turns out, they were generated when running dialyzer for the first time (i.e. on a clean clone of the project). That's because, when rebar3 dialyzer runs for the first time for a project, it generates the Persistent Lookup Table (PLT) including the project dependencies, and those dependencies generate some dialyzer warnings of their own. Those warnings are not generated again once you already have a PLT, so to get the actual number of discrepancies we want for our main project, we need to run rebar3 dialyzer once the PLT is already generated. That's why our Makefile target actually looks like the following.
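(Again a sketch with assumed names. The first run builds the PLT on a clean clone; the second run then reports warnings for our own code only.)

```make
dialyzer:
	rebar3 dialyzer || true   # first run: builds the PLT if needed
	rebar3 dialyzer || true   # second run: only our own warnings remain
	./scripts/report_dialyzer_warnings.sh
```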
Organizing the Work
Once we had that metric in place, we wanted to see it go down and, to achieve that, our first idea was to write tickets to fix the warnings. But removing all 1800 warnings at once was, of course, impossible. Can you imagine reviewing such a massive pull request? That's why, using the dialyzer_warnings file again, we decided to turn it into tickets. We use JIRA to organize our work, and it comes with a handy CLI that we used to write an escript that groups warnings and creates a ticket for each module. I won't paste the whole script here, but the meaty part is this:
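(Reconstructed as a sketch: the grouping logic matches the description above, but the function names and the jira command invocation are assumptions.)

```erlang
main([WarningsFile]) ->
    {ok, Contents} = file:read_file(WarningsFile),
    Lines = [L || L <- binary:split(Contents, <<"\n">>, [global]), L =/= <<>>],
    %% Each warning looks like "path/to/module.erl:Line: Message", so
    %% everything before the first colon identifies the module.
    Grouped = lists:foldl(
        fun(Line, Acc) ->
            [Module | _] = binary:split(Line, <<":">>),
            maps:update_with(Module, fun(Ws) -> [Line | Ws] end, [Line], Acc)
        end, #{}, Lines),
    lists:foreach(
        fun({Module, Warnings}) -> create_ticket(Module, Warnings) end,
        maps:to_list(Grouped)).

%% One JIRA ticket per module, created through the JIRA CLI.
create_ticket(Module, _Warnings) ->
    Summary = io_lib:format("Fix dialyzer warnings in ~s", [Module]),
    Cmd = io_lib:format("jira create --summary \"~s\"", [Summary]),
    os:cmd(lists:flatten(Cmd)).
```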
Keeping Discrepancies at Bay
Of course, we didn’t stop the development process to put everyone to work on the tickets generated with that script.
Well, actually…
HackWeek!
One of the things that all rollers unquestionably enjoy about working here is HackWeek. And during the last one of 2018, a team of 4 developers decided to remove as many dialyzer warnings as possible from our code.
And removing warnings… we did! From the original 1800 we went down to a whopping 300 without affecting performance or functionality in the slightest!!
Oh, and in that process, we eliminated tons of dead code and fixed 11 bugs that had gone undetected by our tests!!
But after that week, we wanted to ensure that we didn’t start adding new discrepancies as we moved on.
CI Additions
We didn't want to require our developers to run dialyzer each time (although we strongly recommended it), since our original goal explicitly included not altering our development speed. That's why we added dialyzer to our CI instead!
But, of course, we still had 300 warnings. We couldn't require a clean run of dialyzer for each pull request. What we decided instead was to reject PRs if they included new dialyzer warnings.
We use Buildkite for CI, so (using *.dialyzer_warnings once more) we added a new pipeline that generates a normalized warnings list each time anything is pushed to master:
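(A sketch of that pipeline; the step label, script path, and artifact name are assumptions.)

```yaml
steps:
  - label: "dialyzer warnings baseline"
    command:
      - make dialyzer
      - ./scripts/normalize_warnings.sh _build/default/*.dialyzer_warnings > master.dialyzer_warnings
      # Keep the normalized list around for PR builds to compare against.
      - buildkite-agent artifact upload master.dialyzer_warnings
```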
And that normalize_warnings.sh script? It looks like this:
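(Reconstructed as a sketch from the three steps described below.)

```bash
#!/usr/bin/env bash
# Normalize a dialyzer warnings file (passed as $1) so that lists
# generated on different checkouts can be compared line by line.
sed -e "s|$(pwd)/||g" -e '/^$/d' "$1" | sort
```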
Basically, three steps:
- Remove the current folder from the paths in the file
- Remove empty lines
- Sort them
Once we had that in place, we only needed to update our existing PR verification pipeline to generate, normalize and compare its list of warnings with the one from master. We did that by extending our already existing make dialyzer as follows, and verifying that new.dialyzer_warnings is empty on Buildkite.
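(Another sketch with assumed names. Since both lists are normalized and sorted, comm -13 leaves only the lines that are unique to the branch.)

```make
dialyzer:
	rebar3 dialyzer || true
	./scripts/normalize_warnings.sh _build/default/*.dialyzer_warnings > branch.dialyzer_warnings
	buildkite-agent artifact download master.dialyzer_warnings .
	comm -13 master.dialyzer_warnings branch.dialyzer_warnings > new.dialyzer_warnings
```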
Caveat
Keen-eyed folks may notice that, since we're comparing files with line numbers in them, we're not exactly detecting just new warnings. If your changes alter the line numbers of lines with existing warnings, those will be reported as new ones. When we noticed that, we decided it was fair. The rule was: if you modify a module with dialyzer warnings, you should fix them as part of your PR, too.
Caching PLTs
Now that we had added dialyzer to our CI, we had not slowed down our development time when working on our own computers, but we had still slowed down our CI times considerably. So, to regain that lost speed, we needed to avoid rebuilding the PLTs on each run. Luckily for us, rebar3 places the PLTs in a very convenient and configurable location. So, we adjusted our Buildkite pipelines like you can see below…
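(A sketch, assuming an S3 bucket name and a plts/ folder configured through the plt_location option in rebar.config; neither appears in the original.)

```yaml
steps:
  - label: "dialyzer"
    command:
      # Pull the cached PLTs (if any) before the run…
      - aws s3 sync s3://our-ci-cache/plts/ plts/
      - make dialyzer
      # …and push them back afterwards, so the PLTs are only
      # rebuilt when the dependencies actually change.
      - aws s3 sync plts/ s3://our-ci-cache/plts/
```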
We basically synced our PLTs with S3 each time. And that was it! CI runs as fast as usual, detects new warnings, and we all keep improving our code constantly until we eventually get to 0 warnings.
Look, this is our progress last week…
We went from 300 to 225 in just a week! And the most important part is that we effortlessly included dialyzer as part of our development process forever, thus increasing the quality of our code significantly.
I hope this story inspires you to do the same in your project and be as happy and as proud of your code as we are of ours :)
Do you enjoy building high-quality large-scale systems? Roll with Us!