Writing Rust the Elixir way

November 28, 2020

It’s not a secret that I’m a big fan of Elixir, so when I started doing Rust development I tried to bring some ideas from Elixir to the world of Rust. This post describes some of the tools I’m building to bring the power of Elixir to Rust.

What makes Elixir so great?

It’s hard to pick just a few things, but I believe that Elixir’s biggest advantage comes from running on the Erlang virtual machine, especially from these two properties:

Massive concurrency
Fault tolerance

Massive concurrency

This is something hard to explain until you experience it yourself. I learned early in my career that you should never create a thread while handling a request. Threads are heavy, expensive and too many of them can bring your whole machine down. In most cases it’s enough to use a thread pool, but this approach fails once the number of concurrent tasks outgrows the number of threads in the pool.
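To make the thread pool limitation concrete, here is a minimal sketch using only the Rust standard library. The `run_on_pool` helper and its parameters are made up for this illustration and don't come from any particular pool crate. Each task just blocks for a fixed time, standing in for a request that waits on IO.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: runs `tasks` blocking jobs (each sleeping `task_ms`)
/// on a fixed pool of `workers` threads and returns how long the batch took.
fn run_on_pool(workers: usize, tasks: usize, task_ms: u64) -> Duration {
    let (tx, rx) = mpsc::channel::<Box<dyn FnOnce() + Send>>();
    let rx = Arc::new(Mutex::new(rx));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let rx = Arc::clone(&rx);
            thread::spawn(move || loop {
                // Hold the lock only long enough to pull the next job.
                let job = rx.lock().unwrap().recv();
                match job {
                    Ok(job) => job(),
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();

    let start = Instant::now();
    for _ in 0..tasks {
        tx.send(Box::new(move || thread::sleep(Duration::from_millis(task_ms))))
            .unwrap();
    }
    drop(tx); // close the queue so workers exit once it is empty
    for handle in handles {
        handle.join().unwrap();
    }
    start.elapsed()
}

fn main() {
    // 4 workers, 4 tasks: every task gets a thread right away.
    let fits = run_on_pool(4, 4, 50);
    // 4 workers, 16 tasks: 12 tasks sit in the queue waiting for a free worker.
    let overflows = run_on_pool(4, 16, 50);
    println!("4 tasks: {:?}, 16 tasks: {:?}", fits, overflows);
}
```

Once there are more blocked tasks than workers, the extra tasks make no progress at all until a worker frees up, even though the machine is mostly idle.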

Let’s look at an example: imagine a Rust application that just creates 2000 threads that wake up every 100 ms and go right back to sleep.

use std::thread;
use std::time::Duration;

fn main() {
    for _ in 0..2_000 {
        thread::spawn(|| loop {
            thread::sleep(Duration::from_millis(100));
        });
    }
    thread::sleep(Duration::from_secs(1_000));
}

Even though the threads don’t do anything, just running this on my MacBook forces it to reboot after a few seconds. This makes it impractical to have massive concurrency with threads. There are many solutions to this problem. The one chosen by Elixir is to abstract concurrent tasks with something called Processes. They are extremely lightweight, so even running 2 million of them doesn’t present a challenge.

Massive concurrency in Rust

You can achieve amazing concurrency and performance using async Rust, but working with async Rust is not as simple as writing regular Rust code, and it just doesn’t give you the same features as Elixir Processes do.

After thinking for a long time about how I could make something that resembles Elixir Processes in Rust, I came up with the idea to introduce an intermediate step: WebAssembly. WebAssembly is a low-level bytecode specification that Rust can target. The idea was simple: instead of compiling Rust for x86-64, you would compile it to the Wasm target. From there I would build a set of libraries and a WebAssembly runtime that exposes the concept of Rust Processes. Contrary to operating system processes or threads, they are lightweight with small memory footprints, fast to create and terminate, and the scheduling overhead is low. In other languages they are also known as green threads and goroutines, but I will call them processes to stay close to Elixir’s naming convention.

That was the first step towards Lunatic.

Let’s look at the same Rust example, but now implemented with Lunatic. At the same time we will crank up the number of concurrent processes to 20k.

use lunatic::{Channel, Process};

fn main() {
    let channel: Channel<()> = Channel::new(0);

    for _ in 0..20_000 {
        Process::spawn((), process).unwrap();
    }

    channel.receive();
}

fn process(_: ()) {
    loop {
        Process::sleep(100);
    }
}

To run this you will need to compile this Rust code to a .wasm file first:

○ →  cargo build --release --target=wasm32-wasi

Then run it with:

○ →  lunaticvm example.wasm

Contrary to the previous example, this runs without hiccups on my Late 2013 MacBook and the CPU utilisation is minimal, even though we are using 10x more concurrent tasks. Let’s examine what exactly is happening here.

The processes spawned by Lunatic are actually taking full advantage of the power provided by async Rust. They are scheduled on top of a work-stealing async executor, the same one used by async-std. Calling Process::sleep(100) will actually invoke smol’s at function.

Wait a second! How does this work without the .await keyword, you may ask yourself. Lunatic takes the same approach as Go, Erlang and the earlier implementation of Rust based on green threads. It creates a tiny stack for executing the process and grows it when your application needs more. This is a bit less efficient than calculating the exact stack size at compile time, as async Rust does, but a reasonable tradeoff I would say.

Now you can write regular blocking code, but the executor will take care of moving your process off the execution thread if you are waiting, so you never block a thread.

As we saw earlier, scheduling threads is a hard task for the operating system. To replace one thread that’s being executed with another one, a lot of work needs to be done (including saving all the registers and some thread state). However, switching between Lunatic Processes does only the minimal amount of work possible. With an idea pioneered by the libfringe library and using some asm! macro magic, Lunatic lets the Rust compiler figure out the minimal number of registers to be preserved during context switches. This makes switching between Lunatic processes almost free: on my machine it usually takes about 1 ns, comparable to a function call.

Another benefit of scheduling the Processes in user space instead of using threads is that other applications will continue running normally on your machine, even if your app misbehaves.

Now that we saw how Lunatic allows you to create applications with massive concurrency, let’s look at fault tolerance.

Fault tolerance

Maybe the best-known Erlang/Elixir philosophy is “let it crash”. If you are building complex systems it’s impossible to predict all failure scenarios. Inevitably something is going to fail in your application, but this failure should not bring down the whole thing.

Elixir Processes are completely isolated and can only communicate through messages with each other. This allows you to design your application in a way that failure stays contained inside one process and doesn’t affect the rest of them.
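As a rough analogy in plain Rust, the sketch below uses standard library threads and channels. This is not Elixir’s or Lunatic’s API, the `worker` and `supervise` names are invented for the example, and OS threads are far heavier than Elixir Processes. Each worker shares nothing with the others and reports back only through a message, so a worker that panics on bad input doesn’t affect its siblings or the supervisor. It only contains Rust panics, nothing stronger.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical worker: doubles a number, panics on malformed input.
fn worker(input: &str) -> i32 {
    let n: i32 = input.parse().expect("input was not a number");
    n * 2
}

/// Hypothetical supervisor: runs one isolated worker per input and collects
/// the results as messages. A crashed worker shows up as an `Err`.
fn supervise(inputs: Vec<&'static str>) -> Vec<Result<i32, String>> {
    let (tx, rx) = mpsc::channel::<(usize, i32)>();

    let handles: Vec<_> = inputs
        .into_iter()
        .enumerate()
        .map(|(id, input)| {
            let tx = tx.clone();
            thread::spawn(move || {
                // The only way a result leaves the worker is as a message.
                tx.send((id, worker(input))).unwrap();
            })
        })
        .collect();
    drop(tx); // keep only the workers' senders alive

    // Assume every worker crashed until its message proves otherwise.
    let mut results = Vec::new();
    for (id, handle) in handles.into_iter().enumerate() {
        let _ = handle.join(); // a panic surfaces here as an Err we can ignore
        results.push(Err(format!("worker {} crashed", id)));
    }
    for (id, value) in rx {
        results[id] = Ok(value);
    }
    results
}

fn main() {
    // The second worker panics; the first and third still deliver results.
    for result in supervise(vec!["21", "oops", "4"]) {
        println!("{:?}", result);
    }
}
```

The panic message from the failing worker is printed to stderr, but the program keeps running and the supervisor decides what to do about the failure.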

Lunatic provides even stronger guarantees than Erlang here. Each Lunatic process gets its own heap, stack and syscalls.

Let’s look at an example of a simple TCP echo server in Lunatic:

use lunatic::{Process, net}; // Once WASI gets networking support you will be able to use Rust's `std::net::TcpStream` instead.
use std::io::{BufRead, Write, BufReader};

fn main() {
    let listener = net::TcpListener::bind("127.0.0.1:1337").unwrap();
    while let Ok(tcp_stream) = listener.accept() {
        Process::spawn(tcp_stream, handle).unwrap();
    }
}

fn handle(mut tcp_stream: net::TcpStream) {
    let mut buf_reader = BufReader::new(tcp_stream.clone());
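    loop {
        let mut buffer = String::new();
        // `read_line` returns Ok(0) at end of stream, i.e. the client disconnected.
        if buf_reader.read_line(&mut buffer).unwrap() == 0 {
            break;
        }
        // `write_all` keeps writing until the whole line has been sent.
        tcp_stream.write_all(buffer.as_bytes()).unwrap();
    }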
}

This application listens on localhost:1337 for TCP connections, spawns a process to handle each incoming connection and just echoes incoming lines.

You can test it using telnet:

○ → telnet 127.0.0.1 1337
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
Hello world
Hello world

The first thing you will notice is that we don’t use any async or .await keywords, even though this application will fully utilise Rust’s async IO under the hood.

Also, the TCP connection becomes fully encapsulated in the Process, even if we call into unsafe C code that crashes:

fn handle(mut tcp_stream: net::TcpStream) {
    ...
    unsafe { crashing_c_function() };
    ...
}

The crash is contained to a single connection in this case. It’s not possible to implement something like this in Elixir, because if a call to a C function crashes it will take the whole virtual machine with it.

Another feature exclusive to Lunatic is the possibility to limit processes’ syscall access. If we replaced the previous spawn call with:

// Process::spawn_without_fs is not implemented yet.
Process::spawn_without_fs(tcp_stream, handle).unwrap();

any code called from inside the handle function would be forbidden from using syscalls for filesystem access. This also works for C dependencies, because the enforcement happens at such a low level. It allows you to express the sandboxing requirements of a Process and to use any dependency without fear. I’m not aware of any other runtime that allows you to do this.

The future

This is just a teaser of the capabilities that Lunatic will provide. There are many more features coming. Once you have this foundation, a new world of possibilities opens up. Some of the features I’m excited about:

The ability to transparently move Processes from one machine to another. The programming model relies on processes communicating through messages, and whether these messages are sent locally or between different computers on a network doesn’t really matter.
Hot reloading. Now that we have the Wasm bytecode as an in-between step it becomes possible to just generate new JIT machine code from it and replace it while the whole system is still running.
Running complete applications compiled to Wasm as a process. One example would be redirecting file reads/writes from the application to TCP streams, as we are in complete charge of syscalls. The advantage here is that you are modelling the execution environment with code.
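To illustrate the first idea in the list, here is a toy sketch of why message passing makes the location of a process irrelevant to the code that talks to it. Everything here is hypothetical and none of it is Lunatic’s API: the `Mailbox` trait, the `Local` and `Remote` types and the `remote_node` helper are invented for the example, and the "network" is just an in-memory channel carrying raw bytes.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical abstraction: anything a process can send a message to.
trait Mailbox {
    fn send(&self, message: &str);
}

/// Delivers messages directly to a receiver on the same machine.
struct Local(mpsc::Sender<String>);

impl Mailbox for Local {
    fn send(&self, message: &str) {
        self.0.send(message.to_string()).unwrap();
    }
}

/// Stands in for a network hop: the message leaves as raw bytes. A real
/// implementation would write them to a socket instead of a channel.
struct Remote(mpsc::Sender<Vec<u8>>);

impl Mailbox for Remote {
    fn send(&self, message: &str) {
        self.0.send(message.as_bytes().to_vec()).unwrap();
    }
}

/// Application code: it cannot tell where the target lives.
fn greet(target: &dyn Mailbox, name: &str) {
    target.send(&format!("hello {}", name));
}

/// Spawns a pretend remote node that decodes bytes and delivers the text.
fn remote_node(deliver: mpsc::Sender<String>) -> Remote {
    let (wire_tx, wire_rx) = mpsc::channel::<Vec<u8>>();
    thread::spawn(move || {
        for bytes in wire_rx {
            let text = String::from_utf8(bytes).unwrap();
            if deliver.send(text).is_err() {
                break; // nobody is listening anymore
            }
        }
    });
    Remote(wire_tx)
}

fn main() {
    let (tx, inbox) = mpsc::channel();
    let local = Local(tx.clone());
    let remote = remote_node(tx);

    // The same call works for both targets.
    greet(&local, "neighbour");
    println!("{}", inbox.recv().unwrap());
    greet(&remote, "stranger");
    println!("{}", inbox.recv().unwrap());
}
```

Because `greet` depends only on the ability to send a message, a process behind a `Mailbox` could be moved elsewhere without the sender changing.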

Lunatic is still in its early days, so there is a lot of development left to do. If you are excited about it or have some ideas you would like to use Lunatic for, reach out to me by email at me@kolobara.com or on Twitter @bkolobara.

I also want to use this opportunity to say a big thank you to the teams working on Rust, Wasmer, Wasmtime, Lucet and waSCC. It would be impossible to build Lunatic without all the hard work put into these projects.

P.S. If you would like to learn more about the magic of Erlang and Elixir, this is one of my favorite talks about it by Saša Jurić: The Soul of Erlang and Elixir. Seriously, go and watch it!