Using Handles and Intrusive Reference Counters in Multithreaded Environment in C Language

Vladimir Pavluk

5.00/5 (4 votes)

Sep 15, 2015

CPOL

11 min read

11992

Smart pointers implementation in C

Introduction

Accessing one piece of data in multiple threads is considered bad practice, but in many cases it is inevitable, and it is not the question that is discussed here. The question discussed here is how to organize such an access the safest way.

In a multithreaded environment when using an object or a data structure, one of the main concerns besides other things is making sure that the object is still alive and the memory allocated to the structure is not freed. This can be done in multiple ways, the ones we're going to talk about are handles and reference counters.

Handles are small structures that contain the pointer to the data object and the ancillary data to ensure that the object is still alive. Usually, there are two functions to operate handles: lock_handle and unlock_handle (the names are chosen arbitrarily just to show the functionality).

Lock_handle checks the availability of the object, increments the atomic reference count and returns the pointer to the data object, if it's available. If not, it returns NULL pointer, or uses some other way to signal that the object is not available anymore. Accordingly to its name, unlock_handle atomically decrements reference count and whenever it reaches 0 deletes the object. Intrusive reference counters are atomic integers embedded inside data object which tell how many times in the program the object is being used. As soon as the reference counter reaches value 0, the object is deleted.

Now let's take a look at the benefits and common pitfalls of both strategies and determine a way to choose which one is better to use in particular cases. Let's take a look at the intrusive reference counters first.

Intrusive Reference Counters

Intrusive reference counters as evident from the name need an intrusion to the counted data structure: a modification, which adds the counter. On a simple structure, let's call it data_packet_s, the modification may look like that:

struct data_packet_s {
  void *data_buffer;
  size_t data_buffer_size;
};

struct data_packet_s {
  void *data_buffer;
  size_t data_buffer_size;
  volatile int reference_count;
};

So, this is a downside of this approach: it needs object modification, therefore we can use it only for the data structures that we can modify. We won't be able to use it on some arbitrary library structures.

It's curious, but this same fact will be a benefit too. The benefit is that we do not need any additional structures, nor do we need any additional memory allocations for such structures.

Another downside, or rather a specifics of this approach is the following: the accessibility of the object must be guaranteed when incrementing reference counter. In other words, we can't simply store the pointer to the object and wait for the moment when we need to access it, then just try to increment the reference counter and work with the object, because between the moment we've stored the pointer and the moment we start using it, object destruction may have occurred. Let's demonstrate an easiest case of such condition using a simple conventional language.

Thread 1	Thread 2
Create `object1`
Increment reference to `object1`
Start thread 2, passing `object1` as an argument	Start
Decrement reference to `object1`, refcount reaches 0	Store `object1`
Delete `object1` along with intrusive reference counter	Wait
	Increment reference counter => memory access violation

In this case, to ensure the object is alive, you would have to increment reference count in Thread 1, and pass the ownership of this particular instance to Thread 2 (decrement the refcount in Thread 2 after the object is no longer needed). In other words, this scheme can work pretty well if you have to process the object once in another thread, and forget about it. This can be demonstrated by the following use case:

Thread 1	Thread 2
Create `object1`
Increment reference to `object1`
Start thread 2, passing `object1` as an argument	Start
	Store `object1`
	Process `object1`
	Decrement reference counter, `refcount` reaches `0`
	Delete `object1`

This scheme can be used with as many threads as necessary, providing that each thread that increments reference counter, owns at least one instance (have incremented reference counter and didn't pass the ownership of the instance, or received the ownership of an instance from another thread, like Thread 2 in the last demonstrated case). So in the last case, Thread 1 can increment reference several times (one per thread) and only after that, it may start several threads, per number of times the refcount was incremented.

So, there comes the second downside: you increment and decrement the reference counter in different threads, which is extremely error-prone. It can lead to different kinds of errors, starting from memory leaks when we fail to release counter, to possible double release in some of the threads, leading to falsely releasing the object and thus deleting it when it's still in use by other thread(s).

This scheme introduces so called 'shared ownership', when the threads collectively own the object, and the last thread decrementing the reference counter will delete the object.

And if you want to store the pointer to the object in another thread and use it when necessary, you will see the memory usage of your program growing up with each object stored, because the thread storing reference will never free it because it has to be sure that the object is alive. Another pitfall in this approach is accessing indirectly passed objects, i.e., objects that are passed by pointer contained in one of other directly or indirectly passed objects, that reference but do not own the object. All such objects should be carefully noticed and reference counted, and owned by each working thread that passes or receives such indirect reference. So, let's summarize the downsides and benefits of the intrusive reference counter approach:

Downsides	Benefits
It's intrusive, i.e., it needs object modifications	Doesn't need additional structures, increasing number of memory operations
A reference has to be held to ensure object and its memory are still valid
Ownership of references has to be passed across threads
Needs careful tracking of indirectly referenced objects

Handles

Handles are light-weight structures that are passed by value, which reference object, manage references and ensure object integrity.

Obviously, handles that reference one object will share same reference counter object. That generally means that such reference counter has to be allocated on heap. A simple handle structure that first comes to mind is the following:

struct handle_s {
  volatile int *reference_count;
  void *object;
};

Where reference_count is allocated when the first handle is created. The first downside of this approach is obvious. We need to manage one more additional structure, allocate another memory chunk for the reference counter. But it pays off when you want to use reference counting with structures you don't have access to.

Typical usage of such handle will be the following:

struct some_struct_s *object = lock_handle(hdata);
if(object) {
  use(object);
  release_handle(hdata);
}

Let's see what happens when a last reference to the object is gone. First of all, we obviously want to delete the managed object. If the object is a simple memory region, we'd just want to free() the memory used by the object. But in most cases, it is not so. Like in aforementioned data_packet_s example, where we would want to free() the data_buffer memory as well. If we use handle for just one type of object, that doesn't make a big problem. But if we want to handle different types, that introduces another question: how to delete the handled object?

Now here we can add another piece of information to the handle: the function pointer to use for destroying/freeing the handled object. Now the handle looks like the following:

struct handle_s {
  volatile int *reference_count;
  void *object;
  void (*destroy_object)(void*);
};

Now release_handle doesn't have to know the object specifics to delete it, it just has to use a right function, which we already know from the handle, and let's return back to what happens when the last reference is released.

Then we'd want to release the memory that is taken by reference_count. But that would be terribly wrong. If other instances of the handle are still stored, they will reference reference_count that is already deleted. And on the next try to lock handle we will access memory that's already freed.

What is a solution that would allow not to loose memory constantly and still maintain validity of all accesses to shared reference counters? There is one. It's an object pool that will manage free reference counters. Although here comes another problem, known as ABA problem. Imagine a situation where you have a handle to an object. Then one of the threads deletes the object. Then another object is created and acquired a handle to manage. What happens?

When the object is destroyed, the reference_count associated with this object (let's call it object1) is freed back to the object pool with the value of 0. Everything is fine up to this point. But when another handle for some new object (let's call it object2) is allocated, the reference_count that will be associated with this object is taken from the object pool, and given reference_count of 1. Now imagine that the thread storing a handle to the object1 tries to acquire the pointer. It will succeed, because the reference_count is not 0, although it belongs now to the object2. The lock function will return invalid pointer, the program will (luckily) crash or (unluckily) corrupt the memory contents.

There obviously is a solution, or I would not write all this.

We need to keep the handle_s structure as light as possible to be able to pass it by value rather than by pointer, so let's do the following: create two structures, one of which will be a 'weak' handle, i.e., not restricted to handling a specific object, and the other one will be a 'strong' handle, i.e., one that will strongly bound to specific object and will return NULL if the 'weak' handle bound to it does not reference the same object anymore.

Let's define them like that:

struct weak_handle_s {
  volatile int version;
  volatile int reference_count;
  void object;
  void (*destroy_object)(void*);
};

struct strong_handle_s {
  struct weak_handle_s *handle;
  int version;
};

So as you see, now both handles have a 'version' field, and strong_handle_s doesn't even have the object pointer, as it is now stored in shared weak_handle_s.

Let's see how it protects us from the ABA problem shown above.

The strong_handle_s and the weak_handle_s that reference the same object have the version field equal to each other.

Whenever the handle is freed and weak_handle_s is put back to the object pool, increment the version number in the weak_handle_s.

Next time, if the weak_handle_s is acquired to handle another object, it will have version number different from the object that was freed. Now in the lock_handle function, comparing the version fields in both weak and strong handle, we can say if the weak_handle_s still references the object we mean by the strong_handle_s, and return NULL if it is not so.

So, as we can see, the handle approach brings in some quite challenging problems, but it also has its bright sides: a handle can be stored and forgotten about until we need the object that is referenced by it; a handle is non-intrusive, which means that we can use it with virtually any data types.

Also passing handle does not necessarily mean passing ownership of the object, so it can be used instead of a pointer as a safe reference to the object everywhere where it does not own the object (i.e., in data structures, etc.).

So handles are a lot more complicated to manage, but a lot more powerful.

Downsides	Benefits
Requires additional memory to manage handles	Allows storing handles and accessing objects as necessary
Requires more CPU cycles	Does not require instance ownership to ensure that an instance of a reference can be requested
Brings in some inevident problems

Conclusion

Intrusive reference counter implements a shared strong reference that has to always hold a lock until the object is guaranteed not to be needed any more. Until all references are unlocked, the object cannot be deleted. If a reference lock is unlocked by a thread that held only one reference lock, it is not safe to acquire a lock anymore, hence it is not guaranteed that the memory occupied by the object along with the reference counter it contained, is not freed.

Handle implements a shared weak reference. If the object is alive, lock_handle function will return the pointer to acquired object, and the object is guaranteed not to be deleted until the handle is unlocked. It is safe to lock and unlock handle however many times, since the handle, as a separate object, is guaranteed to be a valid memory object. Appropriate measures should be taken to ensure that the shared reference count memory is not freed whenever object reaches reference count 0, and that the re-requested reference count is not used to check reference counts in freed objects.