|
Slightly off-topic but I was working on a little additive synthesis hobby application recently where I needed to sum together hundreds of sine waves 48,000 times a second. I spent quite a while getting it as fast as I could.
First observation: computers are really bloody quick these days.
Obviously Math.Sin was too slow, so I used a wavetable.
Floats were too slow, so I used 32-bit signed integers (this seems to vary by processor quite a lot).
Arrays have bounds checking, so I moved into the unsafe domain.
I repeated the actual business code inside the loop (copy and paste) so fewer iterations were needed (a better work-to-loop-overhead ratio, and fewer pipeline flushes on the processor).
But the killer improvement I got was not to use member variables in the loops. By copying member data to local variables, doing all the calculations with those, then updating the member variables from the locals at the end, it sped up severalfold. I think this is because the locals were assigned to registers and didn't need to be written back and forth to memory continually.
I can't remember what the stats were, but they were outstanding; perhaps something in the order of 50,000 oscillators at 48 kHz.
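For what it's worth, the technique described above can be sketched roughly like this. This is a hypothetical reconstruction, not the original app's code: the class and member names (WavetableOscillator, Render, TableBits) are made up for illustration, and it uses a plain float wavetable rather than the integer/unsafe tricks.

```csharp
using System;

// Sketch of a wavetable oscillator that copies its fields into locals
// before the hot loop, then writes them back once at the end.
class WavetableOscillator
{
    const int TableBits = 12;                // 4096-entry sine table
    const int TableSize = 1 << TableBits;
    static readonly float[] Table = BuildTable();

    // 32-bit fixed-point phase accumulator: the top bits index the table,
    // and wraparound at 2^32 corresponds to one full cycle.
    uint _phase;
    uint _increment;

    static float[] BuildTable()
    {
        var t = new float[TableSize];
        for (int i = 0; i < TableSize; i++)
            t[i] = (float)Math.Sin(2.0 * Math.PI * i / TableSize);
        return t;
    }

    public WavetableOscillator(double frequency, double sampleRate)
    {
        // Phase increment per sample, as a 32-bit fraction of a cycle.
        _increment = (uint)(frequency / sampleRate * 4294967296.0);
    }

    public void Render(float[] buffer)
    {
        // Copy members to locals so the JIT can keep them in registers.
        uint phase = _phase;
        uint inc = _increment;
        var table = Table;
        for (int n = 0; n < buffer.Length; n++)
        {
            buffer[n] += table[phase >> (32 - TableBits)];
            phase += inc;    // wraps naturally at one full cycle
        }
        _phase = phase;      // single write-back at the end
    }
}
```

Summing hundreds of these per sample is then just calling Render for each oscillator into the same buffer.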
Regards,
Rob Philpott.
|
Yes local variables can make a big difference - but Release optimisations should do that for you anyway, in theory.
Computers are indeed damn quick these days, but it's still true that a good, experienced coder can beat the compiler sometimes!
Those who fail to learn history are doomed to repeat it. --- George Santayana (December 16, 1863 – September 26, 1952)
Those who fail to clear history are doomed to explain it. --- OriginalGriff (February 24, 1959 – ∞)
|
The whole release/debug thing is pretty blurred in my mind in .NET. I'm not sure that optimization can be made, due to the hideously complex business of multiple cores with caching at various levels.
I actually need to get to the bottom of the exact differences between debug and release. In C++ it was very clear, but it is much more confusing with two-stage compilation. Debug code may have NOPs in it, but the JITter can simply not bother doing anything with these. I think. Really, apart from the odd optimization, the MSIL should be the same in either case.
There might be an article in there somewhere..
Regards,
Rob Philpott.
|
Rob Philpott wrote: apart from the odd optimization the MSIL should be the same in either case.
Nope, not even close: http://www.hanselman.com/blog/ReleaseISNOTDebug64bitOptimizationsAndCMethodInliningInReleaseBuildCallStacks.aspx[^] and that's just an investigation into inlining! The loop optimization is reportedly very good, and so is the localization of variables.
This is one reason why it's important to do any performance timing / tuning against Release builds rather than Debug - because the compiler can easily remove two days' work spent shaving a couple of seconds off!
Those who fail to learn history are doomed to repeat it. --- George Santayana (December 16, 1863 – September 26, 1952)
Those who fail to clear history are doomed to explain it. --- OriginalGriff (February 24, 1959 – ∞)
|
Interesting read - thanks.
>An anti-pattern is "a pattern that tells how to go from a problem to a bad solution."
I like that definition.
Yes, I need to do my homework here. I may share it one day!
Regards,
Rob Philpott.
|
OriginalGriff wrote: it's still true that a good, experienced coder can beat the compiler sometimes
Whilst I'm rambling: it became clear recently that assembly can considerably beat compilers. I was playing around with the ARM toolchain after a 20-year absence from the ARM world. In that time, calling conventions (the name had escaped me) have been invented to preserve registers when branching from one function to another, to allow for compiler interoperability.
The net effect of this is that before a branch the compiler pushes registers to the stack, in case the called function does anything with them, then pops them on the way back. If a register isn't actually used, that's a needless operation. So my little blinking LED flashed about five times as quickly with hand-written ARM assembly as with the compiler's output.
I was all for writing everything in assembler for at least a quarter of an hour after that!
Regards,
Rob Philpott.
|
Oh gawd yes! IMO a good assembler programmer will get smaller, faster code than any compiler - but it will take considerably longer to write it!
That's the problem of course, and why I try to only use assembler when I have to: time-critical code (mostly interrupts) and space-critical applications. It just takes too long to write and maintain otherwise.
Those who fail to learn history are doomed to repeat it. --- George Santayana (December 16, 1863 – September 26, 1952)
Those who fail to clear history are doomed to explain it. --- OriginalGriff (February 24, 1959 – ∞)
|
Rob Philpott wrote: By copying member data to local variables, doing all the calculations with this then updating member variables from the locals at the end it speeded up several fold. Wow, that's a great tip! Thanks!!
Whether I think I can, or think I can't, I am always bloody right!
|
Can you say what the nature of the work you're doing in the loop is?
Regards,
Rob Philpott.
|
Actually, I haven't written those modules; I have only been asked to refactor them. From what I can see, it's similar to heavily pipelined processing, where processing order is important. I don't see anything fancy here at all, but I've yet to debug it thoroughly, so I can't say much.
Whether I think I can, or think I can't, I am always bloody right!
|
Fair enough. Normally when I do this sort of stuff I just call the 'business' part of the operation in a thundering great loop (1,000x, 1,000,000x operations etc.) and time it with Diagnostics.Stopwatch. You can then tinker away, try different things out and see the difference each makes.
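The harness I mean is nothing more elaborate than this (a minimal sketch; the helper name TimeIt and the iteration counts are made up, and remember it only means anything against a Release build):

```csharp
using System;
using System.Diagnostics;

// Run the 'business' code in a thundering great loop and time it.
static long TimeIt(Action body, int iterations)
{
    body();                           // warm-up call so the JIT compiles it first
    var sw = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++)
        body();
    sw.Stop();
    return sw.ElapsedMilliseconds;
}
```

Then you can compare variants side by side, e.g. `TimeIt(() => { var _ = Math.Sqrt(1234.5); }, 1000000)` versus a wavetable lookup, and see the difference directly.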
Regards,
Rob Philpott.
|
Sure, thank you!
Whether I think I can, or think I can't, I am always bloody right!
|
One question that I would look to answer: are there any items that could be offloaded to the GPU? I've had a lot of success offloading mathematically complex work that way in the past.
|
I hope there is some scope for that; I'm currently debugging the code to understand exactly what it does. I am not the one who wrote those modules, I have no idea whatsoever about the project, and still they are forcing me to refactor and optimize.
Thanks for your suggestion.
Whether I think I can, or think I can't, I am always bloody right!
|
Before you do ANY refactoring, make sure that the code is covered with meaningful unit tests. It will make your refactoring a lot easier if you can test each change against a repeatable, meaningful set of control values.
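The kind of test I mean is a characterization test: record what the code produces now for a known input, and assert it never changes as you refactor. A hypothetical sketch (Pipeline and ProcessBlock are stand-in names, and the trivial doubling body stands in for whatever the real module does):

```csharp
using System;
using System.Linq;

// Stand-in for the real module under refactor.
static class Pipeline
{
    public static double[] ProcessBlock(double[] input) =>
        input.Select(x => x * 2.0).ToArray();
}

static void CharacterizationTest()
{
    var output = Pipeline.ProcessBlock(new[] { 1.0, 2.0, 3.0 });
    // Control values recorded from the code BEFORE any refactoring began.
    var control = new[] { 2.0, 4.0, 6.0 };
    if (!output.SequenceEqual(control))
        throw new Exception("Refactoring changed behaviour!");
}
```

In a real project this would live in a test framework (NUnit, xUnit, etc.) and be run after every change.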
|
Sure, thank you.
Whether I think I can, or think I can't, I am always bloody right!
|
Step one is to examine the algorithm itself and consider completely rewriting it. If performance is absolutely critical, consider creating a native DLL. It goes without saying that you should examine the use of the heap in all of this.
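On the heap point: the classic trap in a hot loop is allocating per iteration, which creates garbage and GC pressure. A small illustration (names and the trivial workload are made up; both functions compute the same result):

```csharp
using System;

// Allocates a fresh scratch buffer on every pass - garbage each iteration.
static double SumSlow(int iterations)
{
    double total = 0;
    for (int i = 0; i < iterations; i++)
    {
        var scratch = new double[256];   // per-iteration heap allocation
        scratch[0] = i;
        total += scratch[0];
    }
    return total;
}

// One allocation, hoisted out of the loop and reused - no per-pass garbage.
static double SumFast(int iterations)
{
    double total = 0;
    var scratch = new double[256];       // single allocation, reused
    for (int i = 0; i < iterations; i++)
    {
        scratch[0] = i;
        total += scratch[0];
    }
    return total;
}
```

Timing the two with a Stopwatch over a few million iterations makes the allocation cost visible directly.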
|
Hello from Mexico; apologies, my English is not very good.
I have an X628-C device. I was doing some testing and deleted my punches with my own ClearGLog function, and then when I wanted to retrieve the information I did recover it, but with all my personal IDs set to 0.
I would like to ask if there is any other way to remove records from the X628-C device, and whether there is a way to remove records by date.
Thank you very much in advance for reading this.
I am using the zkemkeeper DLL and programming in C#.
|
I'm sorry, but that means absolutely nothing to me.
Perhaps if you give us an example, it might help a little?
Those who fail to learn history are doomed to repeat it. --- George Santayana (December 16, 1863 – September 26, 1952)
Those who fail to clear history are doomed to explain it. --- OriginalGriff (February 24, 1959 – ∞)
|
What's a "x628c"??
You're saying all of this assuming everyone knows what you're talking about and what you're doing. If you don't describe everything, we have no context to give any advice with.
You really need to read the links in my signature.
|
When we are using WinForms apps to index data through Lucene.Net, say 10 of the same Windows apps are running in an office and they all index data. If the index segment files are created on every PC, that will be a problem; or we can create them on one centralized PC, but problems occur when that PC suddenly becomes unavailable on the network. So I would like to know: can we store a Lucene.Net index in SQL Server for centralized access? If possible, please guide me. I have searched a lot for storing a Lucene.Net index in SQL Server but found nothing. I did find an article on Java where I saw it is possible; here is the link: http://kalanir.blogspot.in/2008/06/creating-search-index-in-database.html
They said Lucene contains the JdbcDirectory interface for this purpose, but I am working with Lucene.Net & C#, so please can anyone guide me on how to make it possible to store a Lucene index in SQL Server?
thanks
tbhattacharjee
|
Tridip Bhattacharjee wrote: can create in any centralize pc but problem occur when that pc suddenly being unavailable from network. ..you have the same problem with SQL Server. Simply install it on a centralized machine and make sure it doesn't become unavailable, just like your SQL Server.
Bastard Programmer from Hell
If you can't read my code, try converting it here[^]
|
hello
C# sql server column count read & sql server code
|
And what exactly is your question?
|
Can you please explain it more clearly?
|