|
Totally correct!
But these days, with average CPU clock speeds around 2 GHz, it is not really necessary to 'bitfuck'. The problem comes when you program for external devices, though today most PDAs and cell phones are short on memory rather than CPU power.
It gets tough when you program microcontrollers, but then you shouldn't be using C#, but assembler!
|
|
|
|
|
Some forms of "bitfucking" are becoming more important though: with the widening CPU-RAM speed gap it becomes increasingly important not to touch more memory in inner loops than fits in the L2 cache (needing less is of course even better). If "bitfucks" are needed to accomplish that, then so be it.
And since the conditions for store-forwarding are very restrictive, extracting a smaller type from within a larger type at a non-aligned position should always be done with a "bitfuck" - performance will suck if you write the value to memory and read a smaller, unaligned part of it back. The reverse, inserting a small type into a large type, is even worse: it is never store-forwarded, so "bitfucking" is always needed unless the code is not in a performance-critical section.
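A minimal sketch of what such a "bitfuck" could look like in C#, assuming the large type is a 64-bit integer and the small type is a byte. The class and method names are made up for illustration; the point is only that both directions stay in registers (shift and mask) instead of going through a store followed by a narrower or wider load.

```csharp
public static class BitPacking
{
    // Extracts the byte at byteIndex (0 = least significant) using a shift
    // and a mask, without writing the value to memory first.
    public static byte ExtractByte(ulong value, int byteIndex)
    {
        int shift = byteIndex * 8;
        return (byte)((value >> shift) & 0xFF);
    }

    // Inserts a byte at byteIndex by clearing the old bits and OR-ing in
    // the new ones, again purely with register operations.
    public static ulong InsertByte(ulong value, int byteIndex, byte b)
    {
        int shift = byteIndex * 8;
        ulong mask = 0xFFUL << shift;
        return (value & ~mask) | ((ulong)b << shift);
    }
}
```

Whether the JIT compiler actually keeps everything in registers depends on the surrounding code, so this is something to measure rather than assume.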
|
|
|
|
|
You are right, and I agree that you should bitfuck wherever you can. Maybe it's because I like it, maybe because I started with microcontrollers.
But if we do the calculation, we will see that it doesn't really matter which types you use.
An average to small cache size is 1 Gb. Let's say your program uses half of it, so this leaves you 500 million bytes. If you're an efficient programmer, your program will never use the whole 500 M. Even if you store numbers that would fit in a byte in an Int128, it takes an array of (500 M / 16 =~) 30 million elements to fill it.
The only way to fill your RAM is when you're busy with graphics. And even then it's not necessary to bitfuck, because Microsoft made some good libraries which take care of the memory problem.
In conclusion, I think we can say that bitfucking is fun, but not really necessary if you're just writing efficient code.
|
|
|
|
|
Deresen wrote: average to small cache size is 1 Gb
Where did you get that information? The biggest cache size I've seen so far is 12 MB (as 2x6 MB) of L2. That is not so much, and every so often another program will come along and trash it (if you don't do it yourself).
|
|
|
|
|
My mistake, I was thinking about RAM. <shame on>
|
|
|
|
|
Well, then 1 GB is indeed small.
But RAM is slow (compared to the CPU): a cache miss can easily cost 100 cycles - long enough to justify doing complex calculations just to avoid the cache miss, and plenty of time for 150 to 250 instructions.
That makes me wonder what the theoretical maximum number of instructions in 100 cycles is (on Core2).
500 if looking only at the predecoder specs: macro fusion can fuse 2 instructions but can only be done once per cycle, so that gives 5 instructions per cycle (3 of which must be 1 µop). The size of such a "block" should satisfy N*size = 16 (to never cross a boundary), and no 66H or 67H prefixes should occur anywhere.
But then looking at the rest: the sequence must not have any dependency chains, not all instructions are perfectly pipelined, there are only 3 "normal" ports (0, 1 and 5), and even register reading can be a bottleneck.
Only 6 µops per cycle are allowed, but that includes memory reads/writes (bringing us down to 400, except for µop fusion).
The predecoder throughput would be less important (it matters only for the first iteration) if we were executing a small loop (less than 4 times 16 bytes).
And I'm not even going to mention the rest.
The best throughput of any one instruction is 3/cycle (a stream of NOPs, for example), so it should be possible to do (slightly) better than that, right?
This is too complex, I'll leave it to the pros.
|
|
|
|
|
I don't know about faster, but constants would increase maintainability.
While there is no speed decrease, it is poor practice to use strings as a switch variable.
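To illustrate the maintainability point, here is a small sketch that switches on named constants (an enum) instead of raw strings. The `Command` enum and the `Describe` method are hypothetical names, not taken from the code being discussed.

```csharp
using System;

// Hypothetical set of commands; a typo in a member name is a compile error,
// whereas a typo in a string case label would only show up at run time.
public enum Command { Start, Stop, Pause }

public static class CommandHandler
{
    public static string Describe(Command command)
    {
        switch (command)
        {
            case Command.Start: return "starting";
            case Command.Stop: return "stopping";
            case Command.Pause: return "pausing";
            default: throw new ArgumentOutOfRangeException("command");
        }
    }
}
```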
|
|
|
|
|
Downcasting is often a 0-step operation (no operation at all, actually: you just start using fewer bits of the register).
It won't become faster though; even operations on 128 bits at a time have the same latency and throughput. On Core2 at least.
Note: the following part is based on speculation.
So what code would the JIT compiler generate for a switch instruction in MSIL? With a bit of bad luck it will generate something like this (assume the switch variable is in eax):
mov edx,SwitchTabelBase ;or any other register that can be used as base
jmp [edx+4*eax]
Bad luck, because that would mean it needs an extra operation if it sees a downcast to a byte first. It can't just assume that the cast is a nop: unless it does some expensive analysis, it can't know that the value won't exceed 255 anyway. So it would have to do:
movzx eax,al
mov edx,SwitchTabelBase ;or any other register that can be used as base
jmp [edx+4*eax]
Or possibly:
and eax,0xFF
mov edx,SwitchTabelBase ;or any other register that can be used as base
jmp [edx+4*eax]
Or something else? Who knows? How can you disassemble the code after it's JIT-compiled anyway?
However! (speculations end here)
A switch instruction (in MSIL) is only generated when the resulting table would be dense. When it isn't generated, a 'tree' of ifs is generated instead (or a chain of ifs as a special case, which the .NET Reflector doesn't understand). It isn't really a tree though, it's a mess of ifs and labels (laid out linearly), but the data flow is as though it were a tree of ifs. It does the expected thing - split the range of values in 2 every time. Obviously this is an O(log n) algorithm, so beware: switch doesn't always perform in O(1).
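As a rough illustration of the "tree of ifs" idea, the sketch below puts a switch with sparse case values next to a hand-written version that halves the range of values first. The names and case values are invented, and the hand-written version only shows the shape of the data flow; what the compiler actually emits for a given switch may differ.

```csharp
public static class SwitchShapes
{
    // Sparse case values: a jump table covering 1..10000 would be mostly
    // empty, so a dense table is not an attractive lowering here.
    public static string Sparse(int x)
    {
        switch (x)
        {
            case 1: return "a";
            case 100: return "b";
            case 1000: return "c";
            case 10000: return "d";
            default: return "?";
        }
    }

    // Hand-written equivalent: split the range in two, then compare
    // within the remaining half.
    public static string AsIfTree(int x)
    {
        if (x <= 100)
        {
            if (x == 1) return "a";
            if (x == 100) return "b";
        }
        else
        {
            if (x == 1000) return "c";
            if (x == 10000) return "d";
        }
        return "?";
    }
}
```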
edit: the whole point of this was to note that:
The size of the operand doesn't matter for speed unless it's bigger than 64 bits (32 bits if running in 32-bit mode), because these are comparisons, and a 64-bit comparison can be done as fast as any smaller comparison. The 128-bit thing only works with SSE, which the .NET JIT compiler doesn't use, except for FISTTP, which is technically an SSE3 instruction but works with the regular floating point stack.
For switches on strings it's a whole different story - if a real switch is used, all possible values are put into a dictionary every time again; the dictionary is not saved. Otherwise a chain of ifs is generated, using op_Equality (aka ==) between the value and every case. Both algorithms are O(n) in the number of cases, but the chain of ifs at least has an early exit and doesn't create a (possibly big) dictionary - the downside is that the equality test may be slow, depending on the situation.
Last modified: 1hr 46mins after originally posted -- I forgot to include The Point, lol
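The two strategies for strings can be written out by hand, which also makes it possible to measure them. This is only a sketch with made-up names and case values: `ViaDictionary` keeps its dictionary in a static field so it is built once, which is one way to sidestep the rebuild cost described above, and `ViaIfChain` is the plain chain of `==` comparisons with an early exit.

```csharp
using System.Collections.Generic;

public static class StringDispatch
{
    // Built once when the class is first used, not on every call.
    private static readonly Dictionary<string, int> Codes =
        new Dictionary<string, int>
        {
            { "red", 1 },
            { "green", 2 },
            { "blue", 3 }
        };

    // Hash lookup; returns 0 for null or unknown strings.
    public static int ViaDictionary(string s)
    {
        int code;
        return (s != null && Codes.TryGetValue(s, out code)) ? code : 0;
    }

    // Chain of string equality tests; exits at the first match.
    public static int ViaIfChain(string s)
    {
        if (s == "red") return 1;
        if (s == "green") return 2;
        if (s == "blue") return 3;
        return 0;
    }
}
```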
|
|
|
|
|
Aw no comments? It took me quite a while to do the required research..
|
|
|
|
|
Minor optimisation tricks like that are something that your compiler should be (and probably is) handling for you.
Write clear and readable code.
|
|
|
|
|
|
Alas, it isn't (but it should!).
|
|
|
|
|
Hi
I need help.
I have used drag-and-drop for the components, e.g. dataset, bindingsource and table adapters.
There is now a default connection string in my app.config file.
The user can now, using FileOpenDialog on the main form, choose a different database if they want. I simply take this path as a public static string and assign it as the new connection string. Obviously the dataset must now change accordingly, which it does on this main form.
But as soon as I load a child form and still want to use this new dataset, the connection string is changed back to the default connection string from the app.config file in the designer code of the tableAdapter.fill() method.
Declaring my connection string as a public static variable worked perfectly in another project where I created my dataset, bindingsource and table adapters programmatically.
Is this one of the limitations one has when using drag-and-drop components?
Please, if someone could help me..
|
|
|
|
|
Hi,
I am building a checkers game and I would like to have an option for saving and loading the game. I need to save the game state, meaning pictureboxes, arrays, stacks etc. I would be glad to be directed to an article that explains how this could be done.
Thank you
|
|
|
|
|
See, for instance [^].
If the Lord God Almighty had consulted me before embarking upon the Creation, I would have recommended something simpler.
-- Alfonso the Wise, 13th Century King of Castile.
This is going on my arrogant assumptions. You may have a superb reason why I'm completely wrong.
-- Iain Clarke
|
|
|
|
|
Hi Bar3000,
I would generally go down the route of serialization and deserialization of the object(s) that hold your game's details.
These can be implemented using the ISerializable interface.
Cheers,
Paul
|
|
|
|
|
Perhaps better than implementing ISerializable, which is extra work that really is not necessary: store your game state in a separate serializable object that is used by your game engine. That way, you put all your game state (i.e. a matrix that maintains the state of each cell on the board, etc.) in a single class. That class can then be serialized without having to worry about serializing the rest of the game.
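A minimal sketch of that approach, under a few assumptions: the state class, its fields and the `GameStateStore` helper are hypothetical names, the board is stored as a flat array of 50 squares, and `XmlSerializer` is used here simply as one serializer that ships with .NET (it needs a public class with a parameterless constructor and handles flat arrays, but not rectangular `[,]` arrays).

```csharp
using System.IO;
using System.Xml.Serialization;

public enum Square { Empty, White, Black, WhiteQueen, BlackQueen }

// All game state lives in this one class; UI controls such as
// PictureBoxes are rebuilt from it and never serialized themselves.
public class GameState
{
    public Square[] Board = new Square[50];
    public int Ply;
}

public static class GameStateStore
{
    private static readonly XmlSerializer Serializer =
        new XmlSerializer(typeof(GameState));

    // Writes the state to the given file, overwriting any existing file.
    public static void Save(GameState state, string path)
    {
        using (FileStream stream = File.Create(path))
        {
            Serializer.Serialize(stream, state);
        }
    }

    // Reads a state previously written by Save.
    public static GameState Load(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            return (GameState)Serializer.Deserialize(stream);
        }
    }
}
```

Saving is then `GameStateStore.Save(state, fileName)` and loading is `GameStateStore.Load(fileName)`, after which the board display can be redrawn from the loaded object.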
|
|
|
|
|
Hi,
I reckon checkers state is rather limited: to me it is what you see on the board (50 squares, each being empty or taken by white, black, white queen or black queen) plus the ply (= half move) number.
Internally I would be inclined to store the board state in bitmaps (not the GDI ones!), being long integers where each bit represents one square. Possible variables might be: white, black, queen, and empty (which would be redundant but might speed things up).
The advantage of bitmaps is they allow you to generate possible moves in parallel by:
- performing shifts, e.g. (newState = oldState << 1) could be a forward move right
- then mask or AND to discard moves to non-empty squares, etc.
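The shift-and-mask steps above could be sketched as follows. Everything here is an assumption made for illustration: the class and method names, the 50-square mask, and especially the shift amount, since which shift corresponds to "forward right" depends on how the squares are numbered. The `allowedSources` parameter is there so that pieces on an edge can be excluded before shifting.

```csharp
public static class Bitboards
{
    // Assumed layout: one bit per playable square, 50 squares in total.
    public const ulong BoardMask = (1UL << 50) - 1;

    // Squares taken by neither side.
    public static ulong Empty(ulong white, ulong black)
    {
        return ~(white | black) & BoardMask;
    }

    // Moves every piece in 'movers' one step at once by shifting, then
    // ANDs with 'empty' to discard moves onto occupied squares.
    public static ulong StepTargets(ulong movers, ulong empty,
                                    ulong allowedSources, int shift)
    {
        return ((movers & allowedSources) << shift) & empty;
    }
}
```

With white pieces on bits 0 and 2 and a black piece on bit 1, a shift of 1 leaves only bit 3 as a target, because bit 1 is occupied.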
So what is keeping you from saving/restoring those few longs?
BTW: I would NEVER use PictureBoxes to render the board of a board game. I'd rather use a single Panel and paint the squares and pieces myself in its Paint handler.
Luc Pattyn
- before you ask a question here, search CodeProject, then Google
- the quality and detail of your question reflects on the effectiveness of the help you are likely to get
- use the code block button (PRE tags) to preserve formatting when showing multi-line code snippets
modified on Sunday, June 12, 2011 8:36 AM
|
|
|
|
|
Luc Pattyn wrote: I'd rather use a single Panel
I just use buttons, but I don't need to move anything around.
|
|
|
|
|
Hey, where can I get tutorials on developing VSTO add-ins for Outlook in C# using Visual Studio 2005 Professional? I have been searching lately and couldn't find any. Are there any tutorials on the internet for a beginner?
Thanks
|
|
|
|
|
|
I'm not looking to do a specific thing, rather I would like an introductory tutorial to Outlook VSTO developing.
|
|
|
|
|
Hello all,
I'm in the process of moving the system I work on from C++/MFC/COM to the .NET world. As usual, a lot of legacy code must remain as is client-wise ...
The dilemma I'm facing is the following:
I'm using a sort of home-made IoC framework to instantiate and hold a singleton .NET component. For performance reasons (to avoid context switching) I would like to expose it to a native C++ client, which until now instantiated the 'good old' COM component (now replaced by the .NET component) using CoCreate...
I cannot touch the client's code, and it needs to interact with the .NET component already instantiated.
Any ideas?
Will registering with the ROT do the trick?
Thanks,
Omer
|
|
|
|
|
Hi all,
I want to be much more specific when replacing something in a string. The thing is that the .NET version of String.Replace does not let one specify a position.
So is there a String method that can do this?
Example:
string s = "They say he carved it himself...from a BIGGER spoon";
string s2 = "find your soul-mate, Homer.";
s.replace( 32, s2.length(), s2 );
cout << s << endl;
many thanks in advance
Kind regards,
The only programmers that are better C# programmers, are those who look like this -> |
Programm3r
My Blog: ^_^
|
|
|
|
|
.Replace replaces every occurrence of one substring of a string s1 with another string s2.
Example:
string s1 = "Hello Dumb World";
string s2 = "Beautiful";
s2 = s1.Replace("Dumb", s2);
s2 now contains "Hello Beautiful World".
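For the position-based replacement asked about in the question, one possible sketch combines `String.Remove` and `String.Insert`. The helper name `ReplaceAt` is made up. The count is clamped to the end of the string on the assumption that this is the behaviour wanted, mirroring what the C++ example in the question does.

```csharp
using System;

public static class StringPositions
{
    // Replaces 'count' characters of 'source', starting at 'start', with
    // 'replacement'. A count that runs past the end is clamped.
    public static string ReplaceAt(string source, int start, int count,
                                   string replacement)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (replacement == null) throw new ArgumentNullException("replacement");
        if (start < 0 || start > source.Length)
            throw new ArgumentOutOfRangeException("start");
        if (count < 0) throw new ArgumentOutOfRangeException("count");

        int clamped = Math.Min(count, source.Length - start);
        // Strings are immutable: Remove and Insert each return a new string.
        return source.Remove(start, clamped).Insert(start, replacement);
    }
}
```

Called as `StringPositions.ReplaceAt(s, 32, s2.Length, s2)` with the two strings from the question, this gives "They say he carved it himself...find your soul-mate, Homer."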
|
|
|
|
|