The new code formatter for Erlang: rebar3 format

Written by Brujo Benavides, February 25, 2020

In recent years, many language ecosystems have developed automatic code formatters to reduce the mental overhead for readers and make code easier to share. These tools work by ensuring that all code written in the same language looks the same. Examples include gofmt for Go and mix format for Elixir. The Erlang community was lacking a tool like this, so we created a rebar3 plugin just to automatically format code.

In this article we’ll discuss the history of the Erlang parsing and formatting tools, the challenges of developing a formatter and the resulting tool that we created. Learn how you can use it and customize it to your needs.

10-15 minute read

Bugs Bunny - The Rabbit of Seville.

Introduction

NextRoll’s RTB Team devotes quite a bit of effort to making our codebase as mature and maintainable as we can. For our Erlang code, we started working on this task years ago. We trim dead code using xref. We remove discrepancies with the help of dialyzer. We make sure our code is well behaved using PEST. We let Elvis find stylistic anomalies…

But there was one tool that was missing: a code formatter. We were using code formatters for Go, Elixir, and Python. But there weren’t any (or barely any) for Erlang. So, we decided to use one of our HackWeeks to create one.

A Bit of History

Other Code Formatters

Code formatting is certainly nothing new: it has been around for ages, and several very interesting papers have been written about it. But recently, there’s been a tendency in all modern languages to include one (and only one) formatter, like gofmt and, of course, the one that influenced us the most: mix format.

We’re by no means experts in this area and therefore we wanted to rely on existing efforts as much as we could. We took inspiration from all of them, but we tried to use as many already written components as possible, and (as usual) there are a bunch of them already baked into OTP…

Parsing and Formatting Erlang Code

Our first initiative was to try to find existing tools to format Erlang code.

We found a ready-to-use solution: rebar3 fmt. The problem is that, as it clearly states in its description, it requires Emacs, which is something most of us don’t use. But it pointed us to what is generally recognized as the de facto standard for Erlang formatters: erlang-mode for Emacs. That’s what the OTP Team considers the standard way of formatting Erlang code and what the other tools included in OTP are loosely based on.

What are these tools, you ask? That was our follow-up question as well! And these are the ones we found:

erl_tidy: The closest thing to an automatic code formatter for Erlang. It uses many of the other modules in this list to parse and rewrite Erlang code. It’s a bit old and it has a bunch of well-known deficiencies, including but not limited to its lack of proper support for macros and comments in code.
erl_prettypr: This is a pretty printer – it takes an AST (Abstract Syntax Tree) as an input and it prints it out in a pretty way. It’s the standard Erlang pretty printer. Our original intention was to use it, but its extensibility support is complex and poorly documented at best. So, like many others before us (e.g. wrangler, erlang-ls, etc.), we just copied it into our project and started from there.
epp: The other side of the story. This is the Erlang parser that comes with OTP. It also has some limitations, most importantly when it comes to macros and comments, since it’s intended to be used primarily by the compiler.
epp_dodger: A module explicitly created to work like epp but bypassing macros and preprocessor directives. It’s also a bit buggy and it has limited support for extensibility. Luckily, Juan Facorro already copied it and improved it in katana-code.

So, what we ended up using is ktn_dodger (from katana-code) to parse the code and turn it into an AST and then our version(s) of erl_prettypr (more on this below) to output the formatted code.
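To make that pipeline concrete, here is a minimal sketch of the same parse-then-print idea. It is not the plugin's code: it swaps in OTP's own epp_dodger and erl_prettypr (from syntax_tools) for ktn_dodger and our modified pretty printer, and the module name format_sketch is made up for illustration.

```erlang
%% A sketch only: uses OTP's epp_dodger and erl_prettypr (syntax_tools),
%% not the plugin's own ktn_dodger or its modified pretty printer.
%% The module name is hypothetical.
-module(format_sketch).
-export([format_file/1]).

%% Parse Path without expanding macros, reattach its comments,
%% and return the pretty-printed source as a string.
-spec format_file(file:filename()) -> string().
format_file(Path) ->
    %% epp_dodger keeps macros and preprocessor directives in the tree.
    {ok, Forms} = epp_dodger:parse_file(Path),
    %% The parser drops comments, so scan them separately and
    %% attach them back to the forms they belong to.
    Comments = erl_comment_scan:file(Path),
    Tree = erl_recomment:recomment_forms(Forms, Comments),
    %% paper is the maximum line width the printer aims for.
    erl_prettypr:format(Tree, [{paper, 100}]).
```

Calling format_sketch:format_file("src/my_module.erl") (any path to an Erlang source file) returns the formatted text, which you can print with io:put_chars/1.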

Choosing the Right Format

Once we had the tools, we developed our very first version of the formatter (you can still find it on hex.pm as 0.0.1). That version simply passed the provided Erlang code through the tools and generated the code formatted with erl_prettypr.

For a while, we considered that to be the canonical formatting since, you know, it comes with OTP, right? But we (and many others like us) didn’t want our code formatted strictly as erl_prettypr outputs it.

So, we started adding configuration options to be able to adjust the formatter to our tastes. After a while, though, we realized that what we wanted was not an extension of erl_prettypr, it was a different formatter. That’s when we moved from one formatter with lots of options, to a formatting behavior with multiple implementations.

Our favorite way of formatting Erlang code is now encoded in the default_formatter, but we kept the OTP-approved way alive as the otp_formatter. And now you can define your own, too.

We’re not Alone

While doing our research, we found that we’re not the only ones who saw the need for an Erlang formatter. As a matter of fact, several people are working on different code formatting tools for Erlang these days:

As we mentioned before, if you’re an Emacs user you already have a rebar3 plugin that you can use: rebar3 fmt.
A while back, Pierre Krafft tried to just improve erl_tidy and wrote a PR for that.
In that thread, we learned that Michał Muskała was also working on a formatter, probably called erlfmt.
Later on, we found steamroller, by Daniel Tipping.

Both steamroller and erlfmt are much more opinionated than our formatter, mostly because their authors aim at having consistent formatting across all Erlang codebases in the world, much like the goal of mix format.

We see that as a great goal, but we have a somewhat smaller one: What we want is consistent formatting within each Erlang codebase in the world. In other words: We want all modules in each project to be consistently formatted, even if they don’t share the same formatting rules with other projects.

After all, at this point it’s hard to even define a canonical format for all the Erlang code in the wild. And if we achieve our goal, a different style stops being a problem: when other developers format their code with rebar3 format using a formatter you’re not used to (e.g. a comma-first ROK-style formatter), all you need to do is switch an option in the rebar.config, run rebar3 format and magically… that ugly code now looks your way.

But enough with the history, let’s see what you can do with this tool.

How to Use `rebar3 format`

Quick Start

Just add this to your rebar.config (either in your project or globally in ~/.config/rebar3/rebar.config):

{plugins, [rebar3_format]}.

Then run

rebar3 format

and enjoy.

Configuration

If you don’t really like the default formatting as-is, rebar3 format can be configured using the format section of the rebar.config. There are three main options you can specify.

What to Format

To determine what files the formatter should format, you use the files parameter:

{format, [{files, ["src/*.erl", "test/*.erl"]}]}.

What Formatter to Use

Unless you specify otherwise, rebar3 format will always use the default formatter that’s baked into it. But if you want, you can use the otp_formatter or your own, like this:

{format, [
    {files, ["src/*.erl", "test/*.erl"]},
    {formatter, otp_formatter}
]}.

How to Configure the Formatter

Finally, you can also set up individual options for the formatter you want to use. For instance, for the otp_formatter you can change paper (i.e. the expected max width of the formatted code):

{format, [
    {files, ["src/*.erl", "test/*.erl"]},
    {formatter, otp_formatter},
    {options, #{paper => 150}}
]}.

To find out the options for each formatter, check out the docs that are available online.

A Proposed Workflow

Drawing from my experience with the Smalltalk formatter (where the code was formatted only when presenting it to the developer but not when stored in the image itself), I want to propose a workflow for teams where each member has their own preferred style for code formatting. The idea is to take advantage of rebar3 profiles and write the following in your rebar.config file:

%% The canonical format used when pushing code to the central repository
{format, [
    {files, ["src/*.erl", "include/*.hrl", "test/*.erl"]},
    {formatter, default_formatter},
    {options, #{paper => 100}}
]}.
{profiles, [
    {brujo, [
        {format, [
            {files, ["src/*.erl", "include/*.hrl", "test/*.erl"]},
            {formatter, rok_formatter}, % I prefer comma-first formatting
            {options, #{paper => 100}}
        ]}
    ]},
    {miriam, [
        {format, [
            {files, ["src/*.erl", "include/*.hrl", "test/*.erl"]},
            {formatter, default_formatter},
            {options, #{
                inline_clause_bodies => false, % she doesn't like one-liners
                inline_items => all % but she doesn't like long lists of items
            }}
        ]}
    ]}
]}.

Then whenever you’re about to work on something, follow this ritual:

git checkout master
git checkout -b my-branch
rebar3 as brujo format
# I work on my code normally
# Run tests and what-not
# Until I'm ready to commit
rebar3 format
git commit -am "Apply my changes"
git push origin my-branch --set-upstream

Miriam does the same but using as miriam instead of as brujo.

That way each one of us can read code in the way we understand it best and write it exactly how we like to write it, and then publish it in a consistent way that matches the style of the rest of the project.

Examples

If you want to see what the formatter can do to your code, the best place to go is the sample project on the repo itself. All sorts of examples are there with as many edge cases as we could create and find. If you know of others, please contribute by adding them there or writing issues so we can add them.

Even though we’re still in the process of testing and improving our tool, we have already started using it in several of our repositories. You can see the formatter in action in spillway and mero.

As the best code reviewers out there may notice, those PRs required some manual adjustments after processing the code with the formatter. We consider that a feature: That’s the formatter allowing us to spot very ugly pieces of code (e.g., too deeply nested structures) that we should refactor.

To be clear: the formatter didn’t break our code (it has built-in verification for that), it just made it look extremely ugly, therefore prompting us to beautify it.
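As a rough illustration of what "not breaking the code" can mean, the sketch below checks that two versions of a source string differ only in whitespace and comments by comparing their token streams. This is an assumption made for illustration, not a description of how the plugin's verification actually works, and the module name same_tokens_sketch is hypothetical.

```erlang
%% A sketch only: one possible whitespace-insensitive comparison,
%% not the plugin's actual verification. The module name is hypothetical.
-module(same_tokens_sketch).
-export([same_tokens/2]).

%% True when both source strings scan to the same token sequence,
%% ignoring positions, whitespace, and comments.
-spec same_tokens(string(), string()) -> boolean().
same_tokens(Before, After) ->
    tokens(Before) =:= tokens(After).

tokens(Source) ->
    %% By default erl_scan skips whitespace and comments.
    {ok, Tokens, _EndLocation} = erl_scan:string(Source),
    %% Each token is {Category, Anno} or {Category, Anno, Symbol};
    %% dropping element 2 removes the position information.
    [erlang:delete_element(2, Token) || Token <- Tokens].
```

A check like this is stricter than a real formatter needs, since a formatter may legitimately add or remove tokens such as redundant parentheses; comparing parsed trees would be the more forgiving alternative.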

What Now?

We’re releasing the formatter as early as we can to catch as many bugs and nuances as possible. Please try it in your code and report any bugs and new ideas you have here.

We plan to keep using this ourselves, but if anybody feels like making this tool official (if you’re a member of the OTP Team, this message is 100% for you 🙄), that would be amazing.

Appendix A: Beautiful Code

As a bonus track, I wanted to know what my favorite piece of code looks like when formatted by rebar3 format. Let’s see…

-module(in).

-author(john).
-author(paul).
-author(george).
-author(ringo).

-export([my_life/1]).

my_life(NewPlaces) ->
    Places = db:get_all(places),
    UpdatedPlaces = [NewPlace
                     || NewPlace <- NewPlaces,
                        lists:member(NewPlace, Places)],
    lists:foreach(fun (Place) ->
                          db:insert(places, Place)
                  end,
                  UpdatedPlaces),
    DeletedPlaces = [Place
                     || Place <- Places,
                        not lists:member(Place, NewPlaces)],
    db:delete(places, DeletedPlaces),
    Moments = [Moment
               || Place <- Places,
                  Moment <- places:moments(Place)],
    People = [Person
              || Moment <- Moments,
                 Person
                     <- moments:lovers(Moment) ++
                          moments:friends(Moment)],
    {Dead, Living} = lists:partition(fun person:is_dead/1,
                                     People),
    lists:foreach(fun person:love/1, Dead ++ Living),

    You = db:get_first(people),
    [] = [Person
          || Person <- People,
             person:comparable(Person, You)],
    ok = love:update(),
    UpdatedMemories = [moments:meaning(Moment, null)
                       || Moment <- Moments],
    db:update(moments, UpdatedMemories),

    my_life(You, People, UpdatedMemories).

my_life(You, People, Things) ->
    case rand:uniform(5) of
      1 ->
          timer:sleep(rand:uniform(100) + 100),
          person:think_about(People);
      2 ->
          timer:sleep(rand:uniform(100) + 100),
          moments:think_about(Things);
      _ ->
          dont_stop_now
    end,
    person:love(You),
    my_life(You, People, Things).

Not bad, huh?
