Hello guys (and girls of course...),

We often have code structures with two or more huge arrays, like:

C++
for (int i = 0; i < size1; i++) 
{
	for (int j = 0; j < size2; j++)
	{
		// compare members of array1[i] with members of array2[j] and do something
		// ...
		// if done break
	}
}


This can be a very time-consuming process.

How is it possible to split these routines on a multiprocessor system
in a smart way, I mean in a way that processor1 runs only this specific part
and processor2 runs another one?

Or how can the routines be split into several threads, with thread1 explicitly
assigned to processor1 and thread2 to processor2, for example?

Some years ago I read an article about this, but I have forgotten it all.

Perhaps an idea or some examples?

Thanks in advance and best regards.
Posted
Comments
KarstenK 5-Jun-13 9:55am    
You need to "slice" the operation, so they dont interfere!!!
You need to identify the time consuming part of your loop => only thread that part "deeply in" the loop.

For instance if you sort the array it is a No-go. So it is really important what you do in the loop.

the assigning to the processors will do the operating system BETTER than you in code.

The only way to get code to run on more than one processor is to run it in more than one thread; the two are the same thing. I don't think it's guaranteed this will happen, but farming each outer loop iteration out to a thread is the way to increase your odds. Beyond that, I think it comes down to how the app is configured and how the threads are created; I don't know enough about C++ threading to give more details on that front. I'd make it use a lot of threads, then test it to see if multiple cores get busy. If they don't, work out why, but multiple threads is the only way to get many cores running, so that's where you need to start.
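For example, a minimal sketch of that idea, using std::thread from C++11 and one worker per hardware thread (the slice function and the work inside the loops are placeholders, not code from the question), might look like this:

C++
#include <thread>
#include <vector>

// Sketch: each worker handles one slice of the outer loop.
// The comparison work inside the inner loop is a placeholder.
void process_slice(int begin, int end, int size2)
{
	for (int i = begin; i < end; i++)
	{
		for (int j = 0; j < size2; j++)
		{
			// compare element i with element j and do something
		}
	}
}

void process_parallel(int size1, int size2)
{
	unsigned n = std::thread::hardware_concurrency();
	if (n == 0) n = 2;                            // fallback if the count is unknown

	int chunk = (size1 + (int)n - 1) / (int)n;    // ceiling division
	std::vector<std::thread> workers;

	for (unsigned t = 0; t < n; t++)
	{
		int begin = (int)t * chunk;
		int end = (begin + chunk < size1) ? begin + chunk : size1;
		if (begin >= end) break;
		workers.emplace_back(process_slice, begin, end, size2);
	}

	for (auto& w : workers)
		w.join();                             // wait for all slices to finish
}

Whether the slices actually land on different cores is then up to the scheduler, as said above.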
 
Comments
Sergey Alexandrovich Kryukov 5-Jun-13 11:56am    
This is quite reasonable, but not 100% accurate.

In fact, controlling the usage of a particular CPU/core is possible. Please see my answer.

In practice, however, it should almost never be used. Nevertheless, I know of cases where it can help. One hint: working with a "bad" external hardware system (such as in SCADA applications) that requires permanent polling. The idea is to sacrifice one core to the "bad" task, making the usage of the other cores more stable.

—SA
nv3 5-Jun-13 14:15pm    
Christian, your key point is: splitting the job up into multiple threads is a requirement; otherwise, only a single processor can work on it. And that is basically true, I guess. Hence my 5. Who distributes the threads to the available processors or cores is secondary in most cases. Yes, it can be done by hand, but mostly the operating system does a pretty good job of that.

The biggest difficulty in many cases is how to split the job into multiple non-interdependent threads. And explaining how to do that can fill a book.
Sergey Alexandrovich Kryukov 5-Jun-13 23:19pm    
Good point, I must say.
—SA
OpenMP[^] is another way to go with this. It's supported by Visual C++ and GCC. With OpenMP, the compiler handles the details of spawning and managing threads.

Keep in mind that you still have to write your code in such a way that race conditions, locking, etc. are handled gracefully, as others have already mentioned.

Also, in multi-level loops, make an effort to only split your outermost loops out into threads. Creating threads takes some overhead. Once a thread has been created, make sure it has plenty to do.

Spawning many threads for small operations will likely slow your code down.
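As a minimal sketch of that advice (array1, array2, size1 and size2 stand in for the question's data; compile with /openmp on Visual C++ or -fopenmp on GCC), parallelizing only the outer loop might look like:

C++
// Sketch: only the outermost loop is split across threads; OpenMP divides
// the iterations of i among the threads it creates.
#pragma omp parallel for
for (int i = 0; i < size1; i++)
{
	for (int j = 0; j < size2; j++)
	{
		// compare members of array1[i] with array2[j] and do something
		// (any shared writes here would need synchronization or a reduction)
	}
}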
 
Comments
Sergey Alexandrovich Kryukov 5-Jun-13 23:20pm    
Good one, my 5.
—SA
After work I found the article I mentioned, about the new C++11 multithreading features.

In an example, the author suggests the use of std::async like this:

C++
// Sequential processing
int func1(int start, int end)
{
	int result = 0;
	for (int i = start; i < end; i++)
	{
		// processing...
	}
	return result;
}

// Multi-threaded processing: split the range into N chunks and
// run func1 on each chunk as an asynchronous task (requires <future>)
int N = GetCPUNum();

int func2(int start, int end)
{
	int chunk = (end - start) / N;
	auto part1 = std::async(func1, start, start + chunk);
	auto part2 = std::async(func1, start + chunk, start + 2 * chunk);
	// ... one task per chunk, up to N ...
	return part1.get() + part2.get(); // + ...
}


i.e., the division of one sequential function call into N asynchronous function calls.

The article is written by Rainer Grimm; one can find an interesting C++11
overview on the net as a PDF file.

Best regards
 
Solution 1 is not quite accurate. In fact, you actually can control which physical CPU/core runs your code. To do so, use the thread-affinity or process-affinity feature. This is how:
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686247%28v=vs.85%29.aspx[^],
http://msdn.microsoft.com/en-us/library/windows/desktop/ms686223%28v=vs.85%29.aspx[^].
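For illustration only, a minimal Win32 sketch of the thread-affinity variant (the mask chosen here is an assumption; pick one that suits your machine):

C++
#include <windows.h>

// Sketch: pin the calling thread to logical processor 0 (bit 0 of the affinity mask).
void PinCurrentThreadToCpu0()
{
	DWORD_PTR mask = 1;   // bit 0 => logical processor 0
	if (SetThreadAffinityMask(GetCurrentThread(), mask) == 0)
	{
		// the call failed; check GetLastError()
	}
}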

—SA
 