Easy (and slightly crazy) way of writing bash scripts
I’ve always been interested in creating and improving our developer tools. Things like git hooks, browser extensions or command line scripts.
When I work on such tools, one of my goals is to make them easy but also pleasant to use. “Pleasantness” typically comes in the form of being no more verbose than necessary, displaying nice and meaningful colors, or making informed assumptions.
In this post, I talk about a `bash` “command runner”. It allows me to easily write informative scripts for complicated tasks in a way that satisfies my “pleasantness” requirement.
To keep you reading, here it is in action:
The problem
I recently worked on streamlining some steps of our internationalization (a.k.a “i18n”) process.
For the most part, our process is as described by LingoHub in this post, i.e. we have a branch named `i18n` that starts as a copy of `master`. All feature branches are merged into `i18n` when they are ready to be merged into `master`. That lets us get translations started for all the feature branches being reviewed.
This process has a few recurring tasks for developers, like merging our current branch into the `i18n` branch or pushing translation files to an external API. The two common themes with these tasks are that:
- They have a significant number of steps.
- They involve git operations: checking a branch out, pulling from the remote, merging, etc.
I’ll use the `push` example: our new strings are ready to be sent to translators, so we need to compile our messages into a POT file and push it to Smartling, the online software we use to manage work with our translators.
For that one task we need to run the following commands:
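It is approximately the following (the POT compilation and Smartling upload commands are hypothetical stand-ins for our internal tooling):

```bash
git stash            # set aside whatever we were working on
git checkout i18n
git pull
./compile_pot.sh     # hypothetical helper: compile our messages into a POT file
./push_pot.sh        # hypothetical helper: upload the POT file to Smartling
git checkout -       # back to the branch we started from
git stash pop        # restore our work in progress
```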
Obviously, this is too many steps to type by hand every time. We’re programmers, let’s automate this!
The naive solution
The first option would be to write a naive script: take all the above commands in a file, prepend `#!/bin/bash -e`, `chmod +x` it, and you’re good to go!
Well… until you realize that you first need to check that you did stash something, otherwise you’d be popping an unrelated stash. You also need to be confident that all the `git` commands will succeed. In reality, they can each fail in many different ways. For example, maybe your local `i18n` branch is ahead of your `origin`.
With the `-e` flag, `bash` will stop at the first problem, but then you need to pick up where you left off, and that naive script won’t let you do that.
The (trying-to-be-)smarter solution
A second option then is to write a smarter script: any time there’s ambiguity, you check the exit code and keep track of the state of the commands. Something like this:
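(A sketch of the idea; the individual checks are illustrative:)

```bash
#!/bin/bash
# The "smarter" approach: check every exit code by hand and track state.

stashed=0
if ! git diff --quiet; then
    git stash || exit 1
    stashed=1
fi

if ! git checkout i18n; then
    echo "checkout failed... now what? unstash? give up?" >&2
    exit 1
fi

if ! git pull; then
    # Do we check out the previous branch again? Pop the stash?
    echo "pull failed" >&2
    exit 1
fi

# ...and so on, for every step, for every task.
```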
This gets very hairy very fast. It’s not even clear how that script is smarter: what should it do with each error code, and how should it handle failures gracefully? It still won’t let you pick up where you left off. And that’s just one task; I would need to do the same work for all of them.
Another thing I dislike about the solutions above is that they’re very verbose. These tasks should disrupt the developers’ workflow as little as possible. They’re tasks that need to be done, yes. But they’re secondary, and if all goes well, I’d rather not see a screenful of output from the various commands.
Enter `command_runner`
Instead, I wrote a script that gives me the following:
- an easy way to create new tasks, without having to manage the various exit codes of each step
- if everything goes well, as little output as possible
- if anything goes wrong, a way to fix the issue and pick up where I left off
It defines a `run_commands` function which takes a function name as an argument: the name of a function that prints out all the commands, one per line. E.g.
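(A hypothetical `foo`, consistent with the step numbering used below:)

```bash
foo() {
    cat <<EOF
touch foo
cat foo
EOF
}
```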
Then you’d call that as `run_commands foo`.
You can also pass it a step number as a second argument to start at that step instead, allowing you to pick up where you left off after an error.
E.g. in the above example, the commands would be numbered this way:
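(Sticking with the hypothetical `foo` above:)

```
1. touch foo
2. cat foo
```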
So, calling `run_commands foo 2` would only call `cat foo`.
The meat of `run_commands` can be summed up in two parts. The first is `<(echo "$($1)")`: it calls the function whose name was given to `run_commands`, captures its output, and feeds it into the `while` loop. In that loop, each line (i.e. each command to run) can be handled.
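A simplified sketch of that structure, assuming the loop reads each command into a `line` variable:

```bash
run_commands() {
    while read -r line; do
        # each line is one command: print it, run it, check its exit code...
        echo "$line"
    done < <(echo "$($1)")
}
```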
The second part is the handling of `$line` in the loop. The barebones version is:
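(Reconstructed from the description that follows:)

```bash
echo "$line"        # print the command as a string
output=$($line)     # expand the same string to run it, capturing its output
```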
In other words, it’s printing the command as a string first, then using that string in a command substitution to run it.
This is where readers become split between those who think it’s pretty cool and those who think I’m a `bash` heretic for using a variable as both a string and a command.
It can work but it’s limited and a bit unpredictable because of the way shell expansion works.
Take this example:
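(Along these lines:)

```bash
a='echo "foo"'
echo $a      # prints: echo "foo"
$a           # prints: "foo"
echo "foo"   # prints: foo
```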
`echo $a` outputs exactly the first command, i.e. `echo "foo"`. But running `$a` and running `echo "foo"` directly don’t end with the same result: when `$a` is expanded, the quotes are treated as literal characters rather than as quoting syntax, so `$a` prints `"foo"` with the quotes included. This is due to how the command is parsed and expanded. In this case it’s very benign, but it illustrates the difficulties you can run into.
For the same reason, some commands won’t work or will need some tweaking or further wrapping in a function. E.g. `git commit -m "This will fail"` should be wrapped into something like this:
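(One possible wrapper:)

```bash
commit() {
    # Inside the function, the quotes are parsed normally, so the message is
    # a single argument. Expanding the raw line via $line would instead split
    # it into words with literal quote characters, and git would choke on it.
    git commit -m "This will fail"
}
```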
Then you’d use `commit` in your list of commands rather than the raw `git commit -m "This will fail"` line.
Beyond that, there’s more code to handle the exit code, stop if necessary, start at a specific step, add nice colors, etc. You can check the full script out in this gist.
Here is how to use `command_runner`:
Let’s work through it with a simple example. I don’t use `git rebase` much and instead I “merge forward”:
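(Reconstructed from the short version mentioned just below:)

```bash
git checkout master
git pull
git checkout -     # back to the feature branch
git merge master
```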
(Of course, you can also just do `git merge origin/master` from your feature branch, but let’s use the long version for this example.)
1. Take the list of commands and wrap it in `cat`, inside a function:
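(Using a hypothetical function name, `mfw`:)

```bash
mfw() {
    cat <<EOF
git checkout master
git pull
git checkout -
git merge master
EOF
}
```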
2. You can then wrap the call to `run_commands` into your own script (let’s call it `git-mfw`) like so:
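A sketch, assuming the gist is saved as `command_runner.sh` next to the script; the handling of the `list` argument is an assumption as well:

```bash
#!/bin/bash
source "$(dirname "$0")/command_runner.sh"    # provides run_commands

mfw() {
    cat <<EOF
git checkout master
git pull
git checkout -
git merge master
EOF
}

if [ "$1" = "list" ]; then
    mfw | cat -n              # print the numbered steps without running anything
else
    run_commands mfw "$@"     # run all steps, or resume at step "$1" if given
fi
```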
3. Call `git-mfw list` to list the steps that are going to happen:
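With the sketch above, the listing would look something like:

```
1  git checkout master
2  git pull
3  git checkout -
4  git merge master
```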
4. Imagine you have a local change to some file and run the script:
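The runner has its own output format; illustratively, the first step fails and everything stops (the git error is real, the runner’s framing lines are made up):

```
$ git-mfw
1. git checkout master
error: Your local changes to the following files would be overwritten by checkout:
        some_file.txt
Please commit your changes or stash them before you switch branches.
Aborting
Failed at step 1.
```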
5. You would then correct the issue and start where you left off by giving the step number to start at:
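(For example:)

```bash
git stash     # or otherwise deal with the local change
git-mfw 1     # pick up again at the step that failed
```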
6. It’s trivial to extend your script by adding one function per task you need to accomplish. In the `i18n` example, we ended up with `push`, `pull` and `merge`.
Run, commands, run!
While working on this, I came across a lot of webpages that warned against trying to both print a command and execute it. And for good reasons! It’s obscure, prone to errors, and can potentially harm your system if you’re not careful. (To be fair, a lot of things happening in your terminal can.)
Like anything, it has its limitations, but overall it’s been incredibly useful! You can write a nice-looking script that benefits your whole team very, very fast, and you’ll be a hero.