a simple semi-space collector

Good day, hackfolk. Today’s article is about semi-space collectors. Many of you know what these are, but perhaps not so many have seen an annotated implementation, so let’s do that.

Just to recap, the big picture here is that a semi-space collector divides a chunk of memory into two equal halves or spaces, called the fromspace and the tospace. Allocation proceeds linearly across tospace, from one end to the other. When the tospace is full, we flip the spaces: the tospace becomes the fromspace, and the fromspace becomes the tospace. The collector copies out all live data from the fromspace to the tospace (hence the names), starting from some set of root objects. Once the copy is done, allocation then proceeds in the new tospace.

In practice when you build a GC, it’s parameterized in a few ways, one of them being how the user of the GC will represent objects. Let’s take as an example a simple tag-in-the-first-word scheme:

struct gc_obj {
  union {
    uintptr_t tag;
    struct gc_obj *forwarded; // for GC
  };
  uintptr_t payload[0];
};

We’ll divide all the code in the system into GC code and user code. Users of the GC define how objects are represented. When user code wants to know what the type of an object is, it looks at the first word to check the tag. But, you see that GC has a say in what the representation of user objects needs to be: there’s a forwarded member too.

static const uintptr_t NOT_FORWARDED_BIT = 1;
int is_forwarded(struct gc_obj *obj) {
  return (obj->tag & NOT_FORWARDED_BIT) == 1;
}
void* forwarded(struct gc_obj *obj) { 
  return obj->forwarded;
}
void forward(struct gc_obj *from, struct gc_obj *to) {
  from->forwarded = to;
}

forwarded is a forwarding pointer. When GC copies an object from fromspace to tospace, it clobbers the first word of the old copy in fromspace, writing the new address there. It’s like when you move to a new flat and have your mail forwarded from your old to your new address.

There is a contract between the GC and the user in which the user agrees to always set the NOT_FORWARDED_BIT in the first word of its objects. That bit is a way for the GC to check if an object is forwarded or not: a forwarded pointer will never have its low bit set, because allocations are aligned on some power-of-two boundary, for example 8 bytes.

struct gc_heap;

// To implement by the user:
size_t heap_object_size(struct gc_obj *obj);
size_t trace_heap_object(struct gc_obj *obj, struct gc_heap *heap,
                         void (*visit)(struct gc_obj **field,
                                       struct gc_heap *heap));
size_t trace_roots(struct gc_heap *heap,
                   void (*visit)(struct gc_obj **field,
                                 struct gc_heap *heap));

The contract between GC and user is in practice one of the most important details of a memory management system. As a GC author, you want to expose the absolute minimum interface, to preserve your freedom to change implementations. The GC-user interface does need to have some minimum surface area, though, for example to enable inlining of the hot path for object allocation. Also, as we see here, there are some operations needed by the GC which are usually implemented by the user: computing the size of an object, tracing its references, and tracing the root references. If this aspect of GC design interests you, I would strongly recommend having a look at MMTk, which has been fruitfully exploring this space over the last two decades.

struct gc_heap {
  uintptr_t hp;
  uintptr_t limit;
  uintptr_t from_space;
  uintptr_t to_space;
  size_t size;
};

Now we get to the implementation of the GC. With the exception of how to inline the allocation hot-path, none of this needs to be exposed to the user. We start with a basic definition of what a semi-space heap is, above, and below we will implement collection and allocation.

static uintptr_t align(uintptr_t val, uintptr_t alignment) {
  return (val + alignment - 1) & ~(alignment - 1);
}
static uintptr_t align_size(uintptr_t size) {
  return align(size, sizeof(uintptr_t));
}

All allocators have some minimum alignment, which is usually a power of two at least as large as the target language’s ABI alignment. Usually it’s a word or two; here we just use one word (4 or 8 bytes).

struct gc_heap* make_heap(size_t size) {
  size = align(size, getpagesize());
  struct gc_heap *heap = malloc(sizeof(struct gc_heap));
  void *mem = mmap(NULL, size, PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
  heap->to_space = heap->hp = (uintptr_t) mem;
  heap->from_space = heap->limit = space->hp + size / 2;
  heap->size = size;
  return heap;
}

Making a heap is just requesting a bunch of memory and dividing it in two. How you get that space differs depending on your platform; here we use mmap and also the platform malloc for the struct gc_heap metadata. Of course you will want to check that both the mmap and the malloc succeed :)

struct gc_obj* copy(struct gc_heap *heap, struct gc_obj *obj) {
  size_t size = heap_object_size(obj);
  struct gc_obj *new_obj = (struct gc_obj*)heap->hp;
  memcpy(new_obj, obj, size);
  forward(obj, new_obj);
  heap->hp += align_size(size);
  return new_obj;
}

void flip(struct gc_heap *heap) {
  heap->hp = heap->from_space;
  heap->from_space = heap->to_space;
  heap->to_space = heap->hp;
  heap->limit = heap->hp + heap->size / 2;
}  

void visit_field(struct gc_obj **field, struct gc_heap *heap) {
  struct gc_obj *from = *field;
  struct gc_obj *to =
    is_forwarded(from) ? forwarded(from) : copy(heap, from);
  *field = to;
}

void collect(struct gc_heap *heap) {
  flip(heap);
  uintptr_t scan = heap->hp;
  trace_roots(heap, visit_field);
  while(scan < heap->hp) {
    struct gc_obj *obj = scan;
    scan += align_size(trace_heap_object(obj, heap, visit_field));
  }
}

Here we have the actual semi-space collection algorithm! It’s a tiny bit of code about which people have written reams of prose, and to be fair there are many things to say—too many for here.

Personally I think the most interesting aspect of a semi-space collector is the so-called “Cheney scanning algorithm”: when we see an object that’s not yet traced, in visit_field, we copy() it to tospace, but don’t actually look at its fields. Instead collect keeps track of the partition of tospace that contains copied objects which have not yet been traced, which are those in [scan, heap->hp). The Cheney scan sweeps through this space, advancing scan, possibly copying more objects and extending heap->hp, until such a time as the needs-tracing partition is empty. It’s quite a neat solution that requires no additional memory.

inline struct gc_obj* allocate(struct gc_heap *heap, size_t size) {
retry:
  uintptr_t addr = heap->hp;
  uintptr_t new_hp = align_size(addr + size);
  if (heap->limit < new_hp) {
    collect(heap);
    if (heap->limit - heap->hp < size) {
      fprintf(stderr, "out of memory\n");
      abort();
    }
    goto retry;
  }
  heap->hp = new_hp;
  return (struct gc_obj*)addr;
}

Finally, we have the allocator: the reason we have the GC in the first place. The fast path just returns heap->hp, and arranges for the next allocation to return heap->hp + size. The slow path calls collect() and then retries.

Welp, that’s a semi-space collector. Until next time for some notes on ephemerons again. Until then, have a garbage holiday season!

Comments are closed.