Hi. Please bear with me, as this is a serious question. I cannot figure out the popularity of searching for the ultimate "lock free" algorithm for queues, etc. I just finished reading the article Yet another implementation of a lock-free circular array queue, and followed up on some of the references and Googled some other articles going back almost 20 years. For reference, see An Optimistic Approach to Lock-Free FIFO Queues.
Here's my confusion: all of the articles that address the N-producer, M-consumer case end up creating some sort of loop, a
while(true){}
or
do{}while (CAS...)
block to wrap a set of instructions. Both referenced articles demonstrate this. Effectively, they replace a blocking operation with a "spin loop" that retries until the operation succeeds without contention from other threads.
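To make the shape of that loop concrete, here is a minimal sketch of the classic do-while-CAS pattern, a lock-free stack push/pop (a Treiber stack) using C++ `std::atomic`. The names (`push`, `pop_or`, `Node`) are mine, not from either article; it is an illustration of the retry shape, not a production queue (in particular, reclaiming nodes under real concurrency needs extra care):

```cpp
#include <atomic>

struct Node {
    int value;
    Node* next;
};

std::atomic<Node*> head{nullptr};

// The do{...}while(!CAS) shape: build the new state privately, then
// publish it only if no other thread changed `head` in the meantime.
void push(int v) {
    Node* n = new Node{v, nullptr};
    Node* old = head.load(std::memory_order_relaxed);
    do {
        n->next = old;  // re-link against the latest observed head
    } while (!head.compare_exchange_weak(old, n,
                                         std::memory_order_release,
                                         std::memory_order_relaxed));
}

int pop_or(int fallback) {
    Node* old = head.load(std::memory_order_acquire);
    while (old && !head.compare_exchange_weak(old, old->next,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed))
        ;  // CAS failed: someone else won; retry with the new head
    if (!old) return fallback;
    int v = old->value;
    delete old;  // safe only in a single-threaded demo, not under contention
    return v;
}
```

Each failed CAS means another thread got in first, so the loop re-reads and tries again; this is exactly the "retry until no contention" behavior the articles rely on.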
Back in the '60s, we used to call this a "spin lock": protecting a block of code using a "read / modify / write" hardware instruction to "test / set" a condition. Early IBM/360s had "TS" (Test and Set). PDP-10s had "AOSE" (Add One and Skip if Equal to zero), allowing similar logic.
Many of the algorithms I've seen use complicated methods to detect an inconsistent queue state or the "ABA" problem: basically, discovering that a pointer has been reused and no longer points to what you think it does. Object serial numbers are usually introduced to alleviate this problem.
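For readers unfamiliar with the serial-number trick: the value is paired with a monotonically increasing tag, and the CAS compares both together, so even if the value field returns to an old bit pattern, the tag will differ and a stale CAS fails. A minimal sketch (the names `Tagged`, `cell`, `tagged_store` are illustrative, and real implementations typically pack a pointer plus counter, or use a double-width CAS):

```cpp
#include <atomic>
#include <cstdint>

struct Tagged {
    uint32_t value;
    uint32_t tag;    // serial number, incremented on every update
};

std::atomic<uint64_t> cell{0};

static uint64_t pack(Tagged t) {
    return (uint64_t(t.tag) << 32) | t.value;
}
static Tagged unpack(uint64_t raw) {
    return Tagged{ uint32_t(raw & 0xffffffffu), uint32_t(raw >> 32) };
}

// Install `v` over the state we observed; the bumped tag ensures that a
// thread holding a stale snapshot cannot succeed even if the value field
// happens to match (the ABA case).
bool tagged_store(uint32_t v) {
    uint64_t old = cell.load();
    Tagged t = unpack(old);
    Tagged next{ v, t.tag + 1 };
    return cell.compare_exchange_strong(old, pack(next));
}
```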
Now my question. Why go through all these convoluted steps? Why isn't the simplest solution used instead? I mean, I get it: you want to avoid system calls for locking (mutexes, condition variables, et al.) in the "low contention" cases, when collisions on the queue (or whatever) are rare. But if folks are willing to use a compute-bound spin loop in the contention cases, why not just
...
while (!CAS(&m_lock_taken, 0, 1));   // spin until we atomically take the lock
/* ... critical section: manipulate the queue ... */
m_lock_taken = 0;                    // release the lock
...
A simple spin, no complex detection or unwinding. It has all the drawbacks of a spin loop: it chews up CPU time doing nothing until the lock is released, it probably causes memory contention on the m_lock_taken variable, and worse, when N+M exceeds the number of processors / cores in the system, a spinning thread is burning execution cycles that could be given to the thread that owns the lock and was preempted by an event or the scheduler. I simply do not understand the need for all the complexity.
And if you care about chewing up CPU time, then what you want is the scheme where the lock is taken quickly when there is no contention and a waitable object (mutex, condition variable) is used when the rare contention happens.
It has always been my experience that when multiple threads need to access a structure / object that needs more than one data item to be in a consistent state before the object is "valid", that sequence needs to be protected by an "interlock", blocking or looping.
All the algorithms I've seen so far end up implementing just this simple case: loop until the complex set of operations yields a consistent object state. Yet there are far simpler ways of doing exactly that.
Discuss?