Runtime errors: Come again? Rust macros to the rescue.

Written by Petar Dambovaliev, June 21, 2022

We admit that this article is blatant Rust propaganda. If you are Rust-curious, this is the place for you. We want to show you how the benefits of using Rust go beyond the language itself and trickle down to your whole ecosystem.

Dynamic languages can undoubtedly be fun and convenient. However, always being dynamic is not necessarily a good thing. The word on the street is that unit testing is driven by dynamic languages precisely because they do not have a static type system. A statically typed language eliminates whole classes of bugs at compile time, so that is what we will use.

Rust’s metaprogramming features allow us to be dynamic while preserving stability and safety. If you are unfamiliar with Rust macros and carry a macro-phobia from C++, rest assured that they are not the same. You should check them out here.

Background

We are using Aerospike, a distributed key-value store. It has a feature that lets you register and execute UDFs (user-defined functions) written in Lua. The caveat is that Aerospike places some restrictions on the language, such as reserved keywords and a ban on globals.

So even with a modern IDE and linter, you do not get proper validation until you have registered the code in your Aerospike node. You might say, “Hold on, what does Lua have to do with Rust?”. Well, if you happen to use it from a Rust application, you can write a little macro that enforces the Aerospike rules on Lua and runs the Lua interpreter to catch the relevant errors at compile time.

What we want to achieve

With this approach, we do not even need separate .lua files.

Here is how it would look. Notice the Lua source code inside the macro invocation.

use aerospike_code_gen::define;

fn main() {
    define! {
      function my_func(record)
      end
    };
}

Since record is a reserved Aerospike identifier, we get a compilation error.

error: local name reserved: `record`
 --> tests/main.rs:5:24
  |
5 |       function my_func(record)
  |                        ^^^^^^
  |
  = note: this error originates in the macro `define` (in Nightly builds, run with -Z macro-backtrace for more info)

Development

How can we make this work? We need to create a Rust proc macro project and define our dependencies.

[dependencies]
quote = "1"
proc-macro2 = "1.0"
syn = "1.0"
rlua = "0.19.1"
luaparse = "0.2.0"

We need rlua to run the code through the Lua interpreter and luaparse to parse it so that we can enforce the Aerospike requirements.

The other dependencies are the default libraries one would use to build macros.

Now, it is only a matter of parsing the macro input via the luaparse crate, recursively checking the definitions, and we are done.

As Linus Torvalds famously put it: “Talk is cheap. Show me the code.”

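use proc_macro::TokenStream;
use proc_macro2::Span;
use rlua::Lua;
use syn::Error as SynError;

#[proc_macro]
pub fn define(input: TokenStream) -> TokenStream {
    // Turn the macro input back into Lua source text.
    let s = input.to_string();
    let mut lua_err = None;

    // Run the source through the Lua interpreter and keep any error.
    Lua::new().context(|lua| {
        let chunk = lua.load(&s);
        let r = chunk.exec();
        if let Err(err) = r {
            lua_err = Some(err.to_string());
        }
    });

    // Report an interpreter error as a compile error at the call site.
    if let Some(err) = lua_err {
        return SynError::new(Span::call_site(), err)
            .into_compile_error()
            .into();
    }

    // Simplified for this example: nothing is emitted on success.
    TokenStream::new()
}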

This was easy enough, and it will give us the errors from the Lua interpreter.

The next step involves a little more work: validating the Aerospike requirements.

First, we create a function that will be called from the macro.

luaparse::parse is going to give us the entire AST.

fn validate_aerospike(s: &str) -> Vec<String> {
    let mut errs = vec![];

    match luaparse::parse(s) {
        Ok(block) => {
            loop_statements(&block.statements, &mut errs, true);
        }

        Err(e) => panic!("{:#}", LError::new(e.span(), e).with_buffer(s)),
    }

    errs
}
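
The snippets here do not show how the collected messages travel from validate_aerospike back to the compiler. One possible way, sketched below with only the standard library, is to fold them into a single report that the macro could then pass to SynError::new, just as define does with the interpreter error. The helper name combine_errors is hypothetical and is not taken from the original project.

```rust
/// Joins validation messages into one report, or returns `None` when
/// there is nothing to report. `combine_errors` is a hypothetical helper.
fn combine_errors(errs: &[String]) -> Option<String> {
    if errs.is_empty() {
        return None;
    }
    // One message per line keeps the compiler output readable.
    Some(errs.join("\n"))
}

fn main() {
    let errs = vec![
        "aerospike reserved identifier: `record`".to_string(),
        "global variables are not allowed: `x`".to_string(),
    ];
    if let Some(report) = combine_errors(&errs) {
        eprintln!("{}", report);
    }
}
```

Returning an Option keeps the success path explicit: the macro only needs to build a compile error when a report is present.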

loop_statements is where our real work begins.

fn loop_statements(stmts: &Vec<Statement>, errors: &mut Vec<String>, is_global: bool) {
    for statement in stmts {
        recurse(statement, errors, is_global);
    }
}

We need to go through the entire tree. For your sanity, the match statement in this function is shortened for this example.

fn recurse(stmt: &Statement, errors: &mut Vec<String>, mut is_global: bool) {
    let mut allow_vars = true;

    if is_global {
        is_global = false;
        allow_vars = false;
    }

    match stmt {
        Statement::FunctionDeclaration(func) => match func {
            FunctionDeclarationStat::Local { body, .. } => {
                validate_func(&body, errors, is_global);
            }

            FunctionDeclarationStat::Nonlocal { body, .. } => {
                validate_func(&body, errors, is_global);
            }
        },

        Statement::LocalDeclaration(ld) => {
            let names = ld.names.pairs.iter().map(|a| a.0.clone()).collect();
            validate_names(&names, errors, allow_vars);
        }

        _ => {}
    }
}
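
To see the traversal idea in isolation, here is a minimal sketch that runs without luaparse. The Stmt enum is a hypothetical, heavily reduced stand-in for the real AST, and walk and check are illustrative names. It only demonstrates how the top-level flag is dropped once the walk enters a function body.

```rust
/// A toy stand-in for the luaparse AST; these types are hypothetical.
enum Stmt {
    /// A function declaration, reduced to the statements in its body.
    Function(Vec<Stmt>),
    /// A `local a, b` declaration, reduced to the declared names.
    Local(Vec<String>),
}

/// Walks the statements and reports every `local` declared at the top level.
fn walk(stmts: &[Stmt], top_level: bool, errors: &mut Vec<String>) {
    for stmt in stmts {
        match stmt {
            // Statements inside a function body are never top-level.
            Stmt::Function(body) => walk(body, false, errors),
            Stmt::Local(names) if top_level => {
                for name in names {
                    errors.push(format!("global variables are not allowed: `{}`", name));
                }
            }
            // Locals inside a function are fine.
            Stmt::Local(_) => {}
        }
    }
}

/// Convenience wrapper that starts the walk at the top level.
fn check(stmts: &[Stmt]) -> Vec<String> {
    let mut errors = Vec::new();
    walk(stmts, true, &mut errors);
    errors
}

fn main() {
    let program = vec![
        Stmt::Local(vec!["counter".to_string()]),
        Stmt::Function(vec![Stmt::Local(vec!["total".to_string()])]),
    ];
    for e in check(&program) {
        println!("{}", e);
    }
}
```

In this sketch only counter is reported, because total is declared inside a function body.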

Here is where we check for the reserved Aerospike names.

const AEROSPIKE_NAMES: [&str; 8] = [
    "record",
    "map",
    "list",
    "aerospike",
    "bytes",
    "geojson",
    "iterator",
    "stream",
];

fn is_reserved(s: &str) -> bool {
    AEROSPIKE_NAMES.contains(&s)
}

fn validate_names(names: &Vec<Name>, errors: &mut Vec<String>, allow_vars: bool) {
    for param in names {
        let name = param.to_string();

        if is_reserved(&name) {
            errors.push(format!(
                "aerospike reserved identifier: `{}`. consider renaming your variable",
                name
            ));
        }

        if !allow_vars {
            errors.push(format!("global variables are not allowed: `{}`", name));
        }
    }
}

fn validate_func(body: &FunctionBody, errors: &mut Vec<String>, is_global: bool) {
    let names = body.params.list.pairs.iter().map(|a| a.0.clone()).collect();
    validate_names(&names, errors, true);
    loop_statements(&body.block.statements, errors, is_global);
}
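
To try the name checks without pulling in luaparse, the following sketch mirrors validate_names over plain string slices. The constant RESERVED holds the distinct reserved names from the list above, and check_names is a hypothetical name chosen for this illustration.

```rust
/// The distinct reserved names from the article's list.
const RESERVED: [&str; 8] = [
    "record", "map", "list", "aerospike", "bytes", "geojson", "iterator", "stream",
];

fn is_reserved(name: &str) -> bool {
    RESERVED.iter().any(|r| *r == name)
}

/// Mirrors `validate_names`, but takes plain strings instead of AST names.
fn check_names(names: &[&str], allow_vars: bool) -> Vec<String> {
    let mut errors = Vec::new();
    for name in names {
        if is_reserved(name) {
            errors.push(format!(
                "aerospike reserved identifier: `{}`. consider renaming your variable",
                name
            ));
        }
        if !allow_vars {
            errors.push(format!("global variables are not allowed: `{}`", name));
        }
    }
    errors
}

fn main() {
    // The parameters of a hypothetical `function my_func(record, bin)`.
    for e in check_names(&["record", "bin"], true) {
        println!("{}", e);
    }
}
```

Function parameters are checked with allow_vars set to true, so only the reserved name is reported here.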

You can view a complete working example here.

This tremendously improves the development process: we no longer need to register the code on an Aerospike cluster just to see an error that should have been caught at compile time.

Conclusion

We all know the Rust trifecta - safe, fast and concurrent.

Rust allows us to write the code we want while preserving those three inherent qualities.

However, there is no free lunch in this world.

Compilation speed has improved a lot, but Rust builds can still be slow, and every additional macro makes them slower, especially in a large project.