|The problem with my custom threadpool (fun activity, not commercial) is this:
I have, say 5 threads running, a global counter of how many pieces of work there are, and a work event which should be unsignaled (reset) when there is no work, and signaled (set) when there is work, because when there is no work, the threads simply wait on this event. I have two methods PushWork() and PopWork(). PushWork() increments the global counter by 1 using an interlocked operation, and checks "if I just incremented the global counter to 1, SetEvent()". PopWork() decrements the global counter by 1 using an interlocked operation, and checks "if I just decremented the global counter to 0, ResetEvent() and wait on the event".
The problem is a race condition. Imagine if the work queue is currently empty.
PushWork() gets called, and increments the counter to 1.
Then, PopWork() gets called and decrements the counter to 0.
PopWork(), running faster than PushWork() (for whatever reason) resets the event.
PushWork(), chugging along, then sets the event.
Now, we have a no more work in the queue, but the event is set, so the threads keep polling the queue for work. There is a similar problem for the other way around. Imagine if the work queue currently has one item.
PopWork() gets called, and decrements the counter to 0.
Then, PushWork() gets called and increments the counter to 1.
PushWork() sets the event.
PopWork() resets the event.
Now, there are work items in the queue, but the threads will stop running as soon as it hits the event.
The reason why this can occur is because modifying the counter and setting/resetting the event are not atomic. I could simply wrap both of these with a critical section, but I think the overhead is too big, since it means that for every piece of work, there are two EnterCriticalSection() calls and two LeaveCriticalSection() calls, one for each of push and pop. I've been trying to find a way to do this without locks. Does anyone have any ideas?
INFO: I use a combination of global work queues and local work queues (one per thread) as per modern threadpooling implementations with work-stealing queues. I do, however, have one global counter for all the work. I've entertained the idea of using different counters but don't see how that would solve my problem.