I've been working on a program called seamstress for the past couple of months. I think it's in a pretty good spot, even if it's early days. The program is heavily inspired by monome's norns, specifically its matron program, which embeds the Lua programming language to speak with monome's grid and arc controllers, MIDI, OSC, a screen, the norns' keys and encoders, and so forth. One big difference between seamstress and matron, besides the fact that seamstress is designed for desktops/laptops rather than a purpose-built device, is that seamstress is written in Zig, a young C-like programming language. The purpose of this post is to describe seamstress's event loop.

By the way, since Zig is likely to change, I'm writing this in June 2023, with a development version of Zig 0.11.0.

Table of Contents

1. Roll your own linked list
   1.1. Nodes
   1.2. Queues
   1.3. Failing to push to a queue
       1.3.1. Testing our failing code
       1.3.2. The stack vs. the heap
   1.4. Allocating memory in Zig
       1.4.1. Correctly pushing to a queue
2. An event loop
3. Seamstress's event loop

1. Roll your own linked list

1.1. Nodes

A linked list is a standard data structure where elements of the list know who comes before and after them and the list itself knows who is at the start and the end. It's easy enough to roll your own linked list, so let's do it. Strictly speaking, one does not need to do this; Zig's standard library has things like std.TailQueue which work great. But it's not too hard (she says having spent an afternoon hashing out precisely how to do it), and it feels good to know what goes into things like this. So let's define a node in a linked list.

const Node = struct {
    next: ?*Node,
    prev: ?*Node,
    data: *Data,
};

Okay hi, hello, welcome to Zig. Let's talk through it. So we're declaring a new struct type called Node. Because we don't plan on altering the type itself, we mark it const (as opposed to var). This struct has three fields, next, prev and data. The data field is a pointer to a struct with type Data. We haven't yet defined Data, but that's okay. Okay and pointers? It's funny, I grew up reading my dad's C programming language books, but many powerful programming languages don't have a concept of a pointer; pointers are messy. The idea is that the pointer is not the object itself, but a little thing that says "hi, you can find what you're looking for at this memory address." Zig has pointers.

The usefulness of pointers becomes clear when we consider the next and prev fields: forgetting about the ? for a moment, notice that they refer to the type Node itself! So we have to use a pointer here, otherwise a Node would have to contain a Node which contains a Node etc. etc. Okay anyway, the ? says that our next and prev fields are allowed to be null, that is, not to point to something valid. (Notice that this implies that the data field must always be valid! That's amazing!)

1.2. Queues

Cool, so far so good. Let's put nodes into a queue… literally:

const Queue = struct {
  const Node = struct {
      next: ?*Node,
      prev: ?*Node,
      data: *Data,
  };
  head: ?*Node,
  tail: ?*Node,
  size: usize,
};

If you know a little C, we did something a bit surprising: we put the definition of a Node inside the definition of Queue. Zig lets you do this; it effectively "namespaces" Node, which is maybe neither here nor there, so that's all I'll say about it.

1.3. Failing to push to a queue

Okay, so a queue has a head and a tail, which may be null, and a size, which is a usize (an unsigned integer type appropriate for, well, sizes). How should we interact with it? Let's write some functions. Zig lets us put them inside of Queue too, but I'll present them by themselves. Okay, warning: for, like, exposition purposes, I'm giving you incorrect code here.

fn push(self: *Queue, data: Data) void {
  var new_node = Node{
      .next = null,
      .prev = null,
      .data = data,
  };
  if (self.tail) |n| {
      self.tail = &new_node;
      n.next = &new_node;
      new_node.prev = n;
  } else {
      std.debug.assert(self.size == 0);
      self.tail = &new_node;
      self.head = &new_node;
  }
  self.size += 1;
}
  

In fact, while this looks beguilingly like correct code, it won't compile, and not just because we haven't defined the Data type! I'll get to that in a moment. First let's make sense of more Zig syntax.

We declare a function with the fn keyword. This function is named push and it takes two arguments: self, which is a pointer to a Queue, and data, which is some Data. The function returns void, i.e. it gives you nothing back. We declare variables with the var keyword. We're making new_node a struct object of type Node and assigning to its fields by accessing them with the dot syntax. (Shortly we'll meet another way to declare elements of struct type.)

The if statement has an interesting |n| bit after it. The point is that self.tail is allowed to be null. We say: if it's not null, give us what it is, and capture that in a new (const) variable n. Since self.tail expects a pointer to a node and not a node, we assign to it the address of new_node using the & operator, like in C. So you should read the if statement as saying "if self.tail is not null, we assign the address of new_node to it, we set the next field of the old tail, i.e. n, to new_node (or rather its address), and set the prev field of the new node to n." The else branch says "well, first of all, the size had better be zero or we're bailing; if it is, we set the head and tail of self to our new node." Finally, in either case we add one to the size of our queue.
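
If the payload-capture syntax is new to you, here's a tiny self-contained example of the same pattern, separate from the queue (the function and values here are made up purely for illustration):

const std = @import("std");

fn describe(maybe_byte: ?*const u8) void {
  // if maybe_byte is not null, |ptr| captures the unwrapped (non-optional) pointer
  if (maybe_byte) |ptr| {
      std.debug.print("got {d}\n", .{ptr.*});
  } else {
      std.debug.print("got null\n", .{});
  }
}

test "optional capture" {
  const x: u8 = 7;
  describe(&x);
  describe(null);
}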

1.3.1. Testing our failing code

So what goes wrong? The immediate problem the compiler will complain about is that the function argument data and the struct field .data have different types: one is a Data and the other is a *Data. You can try and sneak around this in various ways, but there's really a problem lurking in the background that the following test should reveal.

test {
  var queue = Queue{
      .head = null,
      .tail = null,
      .size = 0,
  };
  var i: u8 = 0;
  while (i < 100) : (i += 1) {
      queue.push(i);
  }
  try std.testing.expect(queue.size == 100);
  i = 0;
  var node = queue.head;
  while (i < 100) : (i += 1) {
      try std.testing.expect(node.?.data.* == i);
      node = node.?.next;
  }
}

(To make this test work on your code you'll need to find some way to fix the compiler error above and replace Data by the u8 type.) First of all, it's neat that Zig builds in a testing framework like this. Anyway, I won't explain this test in detail, beyond the line that will fail without bigger fixes:

try std.testing.expect(node.?.data.* == i);

Don't worry about the try; we'll meet it later. The .? "unwraps" the possibility for node to be null, crashing (with a safety-checked panic) if it is null. It's similar to the if we saw above, but, like, more compact, and probably best reserved for cases where a null really should be fatal. The .* "dereferences" the pointer and gives us the value of what it points to, rather than the pointer itself. So we're essentially expecting that the ith element of our linked list should have .data field i, which is reasonable, given that that's what we pushed previously! But… it's not. This test fails. I'd like to tell you my hypothesis about why.
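
(Before we get to that hypothesis: if .? and .* are new to you, here's a tiny self-contained illustration, unrelated to the queue.)

const std = @import("std");

test ".? and .*" {
  const x: u8 = 42;
  var maybe: ?*const u8 = &x;
  // .? unwraps the optional; .* then dereferences the resulting pointer
  try std.testing.expect(maybe.?.* == 42);
  maybe = null;
  // writing maybe.? now would trip a safety check and crash the test
}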

1.3.2. The stack vs. the heap

One way to imagine how a program executes is the "call stack". A program pushes function calls and so forth onto the stack and the CPU pops things off of the stack and executes them. Declaring variables can work like this too; the CPU assigns some memory to the variable and it lives until it goes out of "scope". For example, in our test, the variable i lives during the whole test block. If we declared a const or var inside of the while loop, it would be born anew each iteration and die each time we reach the closing brace. And this is really the problem: when we call queue.push(i) (which is, to my knowledge, equivalent to calling Queue.push(&queue, i)) the variable new_node dies at the end of the function call. Its memory is no longer guaranteed to represent what we want it to.

So we need a way of creating Node objects that live longer than the scope of push. This means that we have to ask the computer for memory that we're going to hold onto! Memory like this is often referred to as "heap allocated" (as opposed to "stack allocated"). Memory from the heap continues to belong to the program until it exits, so it's the responsibility of the program to let go of it when it's no longer needed; otherwise we say the program "leaks" memory.
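
To see the problem in isolation, here's a deliberately broken sketch (not from seamstress) of the same mistake:

const Thing = struct { value: u8 };

// BROKEN, on purpose: thing lives on this function's stack frame, so the
// pointer we return is dangling as soon as the function returns. This is
// the same mistake our push made with new_node.
fn makeThing(value: u8) *Thing {
  var thing = Thing{ .value = value };
  return &thing;
}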

1.4. Allocating memory in Zig

In C, it's very easy to leak memory; just call malloc (short for memory allocation, but I like to imagine a little devil, "mal" meaning bad, named malloc) without later calling free. In Zig, it's still possible to leak memory, but the language will try to help you out. First of all, we need to choose an allocator, which is of type std.mem.Allocator. Zig gives you several choices; maybe the best one for most purposes is the "general purpose allocator". Let's grab it.

var allocator: std.mem.Allocator = undefined;

pub fn main() void {
  var general_allocator = std.heap.GeneralPurposeAllocator(.{}){};
  allocator = general_allocator.allocator();
  defer _ = general_allocator.deinit();
}

It's possible that this is an anti-pattern: I've declared my allocator as if it were a static variable in C. It's not initialized until we enter main, where we grab it in two stages. In seamstress, I then tend to pass the allocator to other source files in their init functions.
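
That pattern looks roughly like this in some other file (a sketch; the file name and body are made up, not seamstress's actual code):

// in a hypothetical midi.zig
const std = @import("std");

var allocator: std.mem.Allocator = undefined;

pub fn init(alloc: std.mem.Allocator) void {
  allocator = alloc;
  // ... from here on, this file can call allocator.create / allocator.alloc freely
}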

Anyway, so much for the allocator. We'll often call allocator.alloc() to allocate space for many items at once (Zig calls the result a "slice"), or allocator.create() to create single objects.
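
Here's a small illustration of the difference, using the testing allocator (which we'll also meet in the tests below):

const std = @import("std");

test "create vs. alloc" {
  const a = std.testing.allocator;

  // create: one item; destroy frees it
  const one = try a.create(u8);
  defer a.destroy(one);
  one.* = 7;
  try std.testing.expect(one.* == 7);

  // alloc: a slice of many items; free releases the whole slice
  const many = try a.alloc(u8, 16);
  defer a.free(many);
  many[0] = 1;
  try std.testing.expect(many[0] == 1 and many.len == 16);
}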

1.4.1. Correctly pushing to a queue

Let's correct our push function so that it allocates a Node from the heap instead.

fn push(self: *Queue, data: *u8) !void {
  var new_node = try allocator.create(Queue.Node);
  new_node.next = null;
  new_node.prev = null;
  new_node.data = data;
  if (self.tail) |n| {
      self.tail = new_node;
      n.next = new_node;
      new_node.prev = n;
  } else {
      std.debug.assert(self.size == 0);
      self.tail = new_node;
      self.head = new_node;
  }
  self.size += 1;
}

test {
  allocator = std.testing.allocator;
  var queue = Queue{
      .head = null,
      .tail = null,
      .size = 0,
  };
  var i: u8 = 0;
  while (i < 100) : (i += 1) {
      var j = try allocator.create(u8);
      j.* = i;
      try queue.push(j);
  }
  try std.testing.expect(queue.size == 100);
  i = 0;
  var node = queue.head;
  while (i < 100) : (i += 1) {
      try std.testing.expect(node.?.data.* == i);
      node = node.?.next;
  }
}

Now technically this test passes (provided you replace references to the Data type with u8), but it yells at us for leaking memory right and left, and rightly so: we have 100 little u8 variables and 100 bigger Node variables just sort of hanging out on the heap when the program ends.

Notice as well that we had to make a change to our function's signature: by adding the ! we're acknowledging that our function may return an error: that's also why we have try in front of allocator.create: memory allocation may fail! That's a fact of life, and Zig rightly forces you to acknowledge it. The try statement says "attempt to do the following; if we get an error back, just chuck it further up the chain." Instead of try we could write catch and handle the error in the function itself. I'd rather seamstress just fail if it can't allocate memory, so I don't do this.
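
For completeness, here's a sketch of what the catch route could look like: handling the failure on the spot (here by just giving up entirely) rather than passing it up the chain. The function name is made up, not something from seamstress.

const std = @import("std");

// handle allocation failure in place with catch instead of propagating it
// with try; the function's return type stays error-free
fn createByteOrDie(allocator: std.mem.Allocator) *u8 {
  return allocator.create(u8) catch {
      std.debug.print("out of memory, giving up\n", .{});
      std.process.exit(1);
  };
}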

So let's correct the leak by providing a pop function. I'll return to acting as though we've defined a Data type that maybe comes with new and deinit methods.

fn pop(self: *Queue) ?*Data {
  if (self.head) |n| {
      const data = n.data;
      self.head = n.next;
      allocator.destroy(n);
      if (self.size == 1) self.tail = null;
      self.size -= 1;
      return data;
  } else {
      std.debug.assert(self.size == 0);
      return null;
  }
}

test {
  var queue = Queue{
      .head = null,
      .tail = null,
      .size = 0,
  };
  allocator = std.testing.allocator;
  var i: u8 = 0;
  while (i < 100) : (i += 1) {
      var datum = Data.new(i);
      try queue.push(datum);
  }
  i = 0;
  while (i < 100) : (i += 1) {
      const datum = queue.pop();
      try std.testing.expect(datum != null);
      try std.testing.expect(datum.?.i == i);
      datum.?.deinit();
  }
}

Obviously this "test" is missing some scaffolding to make it actually work, but hopefully it gets across the idea. The pop() function removes from the head of the queue provided the queue has size at least one. Once it's done, it calls destroy() on the node that we got, so we don't leak memory from zombie Node objects.

2. An event loop

Seamstress, like many programs that interface with devices from the outside world, spends a lot of its time waiting for something to happen. The main loop might at first look something like this:

pub fn loop() !void {
  while (!quit) {
      if (queue.size == 0) continue;
      const data = queue.pop();
      if (data) |d| try handle(d);
  }
}

There are two things wrong with this. The first is that seamstress will run at 100% CPU with this model even when there are no events in the queue; it's burning those cycles repeatedly checking the size of the queue. The other thing is that in order for this event handling model to make sense, we need something to deliver events to us. If we're stuck in the event loop, the only way for something else to happen is if it happens in another thread. Multi-threading is very fun, but it also creates the possibility for bad behavior: what should happen, for example, if multiple threads try to push to the queue at the same time? Or what if I try to pop the only event from the queue while someone else tries to push to it: which thread's update to the size field wins?

Zig's standard library helps us solve both of those problems: we'll go back to our Queue struct and add two new fields. I'll also take this time to put our push and pop methods inside the Queue struct.

var quit = false;
var allocator: std.mem.Allocator = undefined;
var queue: Queue = undefined;

const Queue = struct {
  const Node = struct {
      next: ?*Node,
      prev: ?*Node,
      data: *Data,
  };
  head: ?*Node,
  tail: ?*Node,
  size: usize,
  lock: std.Thread.Mutex,
  cond: std.Thread.Condition,
  fn push(self: *Queue, data: *Data) !void {
      var new_node = try allocator.create(Node);
      new_node.prev = null;
      new_node.next = null;
      new_node.data = data;
      if (self.tail) |n| {
          self.tail = new_node;
          n.next = new_node;
          new_node.prev = n;
      } else {
          std.debug.assert(self.size == 0);
          self.tail = new_node;
          self.head = new_node;
      }
      self.size += 1;
  }
  fn pop(self: *Queue) ?*Data {
      if (self.head) |n| {
          const data = n.data;
          self.head = n.next;
          allocator.destroy(n);
          if (self.size == 1) self.tail = null;
          self.size -= 1;
          return data;
      } else {
          std.debug.assert(self.size == 0);
          return null;
      }
  }
};

pub fn post(data: *Data) !void {
  queue.lock.lock();
  try queue.push(data);
  queue.cond.signal();
  queue.lock.unlock();
}

pub fn loop() !void {    
  while (!quit) {
      queue.lock.lock();
      while (queue.size == 0) {
          if (quit) break;
          queue.cond.wait(&queue.lock);
      }
      const data = queue.pop();
      queue.lock.unlock();
      if (data) |d| try handle(d);
  }
}

This code is better for threading and better for our CPU. Now while the queue has size zero, we wait(), releasing the lock and sleeping until some other thread calls signal() on our cond. Code in other files doesn't call push() or pop() directly, but instead should add to the queue with post() and our main thread will handle events in loop().
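
To make that concrete, here's a sketch of what a producer might look like: a made-up thread that wakes up every 10 ms, builds a Data, and posts it. The makeTick() helper is hypothetical; in reality the producers are whatever threads are listening for input and building the appropriate Data.

// a sketch of a producer thread; makeTick() is a made-up helper
fn tickThread() void {
  while (!quit) {
      std.time.sleep(10 * std.time.ns_per_ms);
      const data = makeTick() catch continue;
      post(data) catch continue; // if posting fails, just drop the event
  }
}

// spawned once during startup, something like:
// var producer = try std.Thread.spawn(.{}, tickThread, .{});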

3. Seamstress's event loop

The above works fine, but it kind of bothered me that push(), and hence post(), required a try, and it reminded me that allocating memory can be slow. Moreover, if the event loop falls behind to such an extent that the queue is bigger than some reasonably small number, it's likely that the program is doing something pretty bad; we should try to avoid this.

The solution I hit upon was to preallocate a pool of 1000 Node and Data objects and keep them recirculating while the program runs. This also has a nice side-effect: rather than having to define a new() method for creating pointers to Data objects, post() can simply accept a Data value (one that lives on the caller's stack is fine), and push() copies that value into the Data that the pool node already points to. The post() and loop() functions above work just the same, although we can remove the ! and try from post(), since it no longer allocates memory and so no longer returns an error. The code around the Queue struct now looks like this:

const Queue = struct {
  const Node = struct {
      next: ?*Node,
      prev: ?*Node,
      data: *Data,
  };
  read_head: ?*Node,
  read_tail: ?*Node,
  read_size: usize,
  write_head: ?*Node,
  write_tail: ?*Node,
  write_size: usize,
  lock: std.Thread.Mutex,
  cond: std.Thread.Condition,
  inline fn get_new(self: *Queue) *Node {
      var node = self.write_head orelse {
          @setCold(true);
          std.debug.assert(self.write_size == 0);
          std.debug.print("no nodes free!\n", .{});
          unreachable;
      };
      self.write_head = node.next;
      node.next = null;
      node.prev = null;
      self.write_size -= 1;
      return node;
  }
  inline fn return_to_pool(self: *Queue, node: *Node) void {
      if (self.write_tail) |n| {
          self.write_tail = node;
          n.next = node;
          node.prev = n;
      } else {
          @setCold(true);
          std.debug.assert(self.write_size == 0);
          self.write_head = node;
          self.write_tail = node;
      }
      self.write_size += 1;
  }
  fn push(self: *Queue, data: Data) void {
      var new_node = self.get_new();
      new_node.data.* = data;
      if (self.read_tail) |n| {
          self.read_tail = new_node;
          n.next = new_node;
          new_node.prev = n;
      } else {
          std.debug.assert(self.read_size == 0);
          self.read_tail = new_node;
          self.read_head = new_node;
      }
      self.read_size += 1;
  }
  fn pop(self: *Queue) ?*Data {
      if (self.read_head) |n| {
          const data = n.data;
          self.read_head = n.next;
          self.return_to_pool(n);
          if (self.read_size == 1) self.read_tail = null;
          self.read_size -= 1;
          return data;
      } else {
          std.debug.assert(self.read_size == 0);
          return null;
      }
  }
  fn deinit(self: *Queue) void {
      // assumes the read queue is empty
      var node = self.write_head;
      while (node) |n| {
          node = n.next;
          allocator.destroy(n.data);
          allocator.destroy(n);
      }
  }
};

pub fn init(alloc_ptr: std.mem.Allocator) !void {
  allocator = alloc_ptr;
  queue = .{
      .read_head = null,
      .read_tail = null,
      .read_size = 0,
      .write_tail = null,
      .write_head = null,
      .write_size = 0,
      .cond = .{},
      .lock = .{},
  };
  var i: u16 = 0;
  while (i < 1000) : (i += 1) {
      var node = try allocator.create(Queue.Node);
      var data = try allocator.create(Data);
      data.* = undefined;
      node.* = .{ .data = data, .next = null, .prev = null };
      queue.return_to_pool(node);
  }
}

Let me explain two pieces: @setCold is, as I understand it, essentially a hint to the optimizer that the block of code containing it should be executed very rarely. Indeed, if we're out of Node objects, I want the program to fail, hence the unreachable keyword there. Similarly, adding the first node to the pool of available nodes should happen only once in a correctly functioning run of seamstress, so telling the optimizer that branch is rarely taken feels right.

I'm very happy with how this structure is working for me in seamstress at the moment. You can check out the full contents of events.zig on GitHub.