Hello guys (and girls of course...),

We often have code structures with two or more huge arrays, like:

C++
for (int i = 0; i < size1; i++) 
{
	for (int j = 0; j < size2; j++)
	{
		// compare members of array1[i] with members of array2[j] and do something
		// ...
		// if done break
	}
}


This can be a very time-consuming process.

How is it possible to split these routines on a multiprocessor system
in a smart way, I mean in a way that processor1 runs only this specific part
and processor2 runs another one?

Or how can the routines be split into several threads, with thread1 explicitly
assigned to processor1 and thread2 to processor2, for example?

Some years ago I read an article about this, but I have forgotten it all.

Perhaps an idea or some examples?

Thanks in advance and best regards.
Posted
Comments
KarstenK 5-Jun-13 9:55am    
You need to "slice" the operation, so they dont interfere!!!
You need to identify the time consuming part of your loop => only thread that part "deeply in" the loop.

For instance if you sort the array it is a No-go. So it is really important what you do in the loop.

the assigning to the processors will do the operating system BETTER than you in code.

The only way to get code to run on more than one processor is to run it in more than one thread; the two are the same thing. I don't think it's guaranteed this will happen, but farming each outer loop iteration out to a thread is the way to increase your odds. Beyond that, I think it comes down to how the app is configured and how the threads are created; I don't know enough about C++ threading to give more details on that front. I'd make it use a lot of threads, then test it to see if multiple cores get busy. If they don't, work out why, but multiple threads is the only way to get many cores running, so that's where you need to start.
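For example, a minimal sketch of that idea, using std::thread from C++11 and one worker per hardware thread (the slice function and the work inside the loops are placeholders, not code from the question), might look like this:

C++
#include <thread>
#include <vector>

// Sketch: each worker handles one slice of the outer loop.
// The comparison work inside the inner loop is a placeholder.
void process_slice(int begin, int end, int size2)
{
	for (int i = begin; i < end; i++)
	{
		for (int j = 0; j < size2; j++)
		{
			// compare element i with element j and do something
		}
	}
}

void process_parallel(int size1, int size2)
{
	unsigned n = std::thread::hardware_concurrency();
	if (n == 0) n = 2;                            // fallback if the count is unknown

	int chunk = (size1 + (int)n - 1) / (int)n;    // ceiling division
	std::vector<std::thread> workers;

	for (unsigned t = 0; t < n; t++)
	{
		int begin = (int)t * chunk;
		int end = (begin + chunk < size1) ? begin + chunk : size1;
		if (begin >= end) break;
		workers.emplace_back(process_slice, begin, end, size2);
	}

	for (auto& w : workers)
		w.join();                             // wait for all slices to finish
}

Whether the slices actually land on different cores is then up to the scheduler, as said above.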
 
Comments
Sergey Alexandrovich Kryukov 5-Jun-13 11:56am    
This is quite reasonable, but not 100% accurate.

In fact, controlling the usage of a particular CPU/core is possible. Please see my answer.

In practice, however, it should almost never be used. Nevertheless, I know of cases where it can help. One hint: working with a "bad" external hardware system (such as in SCADA applications) that requires permanent polling. The idea is to sacrifice one core to the "bad" task, making the usage of the other cores more stable.

—SA
nv3 5-Jun-13 14:15pm    
Christian, your key point is: splitting the job up into multiple threads is a requirement; otherwise, only a single processor can work on it. And that is basically true, I guess. Hence my 5. Who distributes the threads to the available processors or cores is secondary in most cases. Yes, it can be done by hand, but mostly the operating system does a pretty good job of that.

The biggest difficulty in many cases is how to split the job into multiple non-interdependent threads. And explaining how to do that can fill a book.
Sergey Alexandrovich Kryukov 5-Jun-13 23:19pm    
Good point, I must say.
—SA
OpenMP[^] is another way to go with this. It's supported by Visual C++ and GCC. With OpenMP, the compiler handles the details of spawning and managing threads.

Keep in mind that you still have to write your code in such a way that race conditions, locking, etc. are handled gracefully, as others have already mentioned.

Also, in multi-level loops, make an effort to only split your outermost loops out into threads. Creating threads takes some overhead. Once a thread has been created, make sure it has plenty to do.

Spawning many threads for small operations will likely slow your code down.
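As a minimal sketch of that advice (array1, array2, size1 and size2 stand in for the question's data; compile with /openmp on Visual C++ or -fopenmp on GCC), parallelizing only the outer loop might look like:

C++
// Sketch: only the outermost loop is split across threads; OpenMP divides
// the iterations of i among the threads it creates.
#pragma omp parallel for
for (int i = 0; i < size1; i++)
{
	for (int j = 0; j < size2; j++)
	{
		// compare members of array1[i] with array2[j] and do something
		// (any shared writes here would need synchronization or a reduction)
	}
}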
 
Comments
Sergey Alexandrovich Kryukov 5-Jun-13 23:20pm    
Good one, my 5.
—SA
After work I found the article I mentioned, about the new C++11 multithreading features.

In an example, the author suggests the use of std::async like this:

C++
// Sequential processing
int func1(int start, int end)
{
	int result = 0;
	for (int i = start; i < end; i++)
	{
		// processing...
	}
	return result;
}

// Multi-threaded processing: split the range into N chunks and
// run func1 on each chunk as an asynchronous task (requires <future>)
int N = GetCPUNum();

int func2(int start, int end)
{
	int chunk = (end - start) / N;
	auto part1 = std::async(func1, start, start + chunk);
	auto part2 = std::async(func1, start + chunk, start + 2 * chunk);
	// ... one task per chunk, up to N ...
	return part1.get() + part2.get(); // + ...
}


i.e., the division of one sequential function call into N asynchronous function calls.

The article is written by Rainer Grimm; one can find an interesting C++11
overview on the net as a PDF file.

Best regards
 
Solution 1 is not quite accurate. In fact, you actually can control which physical CPU/core runs your code. To do so, use the thread-affinity or process-affinity feature. This is how:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686247%28v=vs.85%29.aspx[^],
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686223%28v=vs.85%29.aspx[^].
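For illustration only, a minimal Win32 sketch of the thread-affinity variant (the mask chosen here is an assumption; pick one that suits your machine):

C++
#include <windows.h>

// Sketch: pin the calling thread to logical processor 0 (bit 0 of the affinity mask).
void PinCurrentThreadToCpu0()
{
	DWORD_PTR mask = 1;   // bit 0 => logical processor 0
	if (SetThreadAffinityMask(GetCurrentThread(), mask) == 0)
	{
		// the call failed; check GetLastError()
	}
}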

—SA
 