|
What kind of "bitmap" do you mean?
The fastest way (excluding straight table lookup of course, but that only works well if the table is in cache) to count bits in an integer (without using the popcnt instruction, which is not commonly supported) is this: http://stackoverflow.com/posts/1511920/revisions[^]
It's based on this: http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel[^]
edit: ok to clear the mess I made here up a bit..
There are many ways, including:
- popcnt: not supported by enough CPU's yet
- table lookup: only works well if you can keep the table in cache until you don't have to count any bits anymore, a cache miss is many time more terrible even than using the "standard" way (so if you have to count a lots of bits in a tight loop, go for it)
- count the bits one by one[^], works well if you expect few bits to be set (or reset - just take the complement and subtract the count from the length)
- count the bits in parallel (see links in the first part of my post) - it has no bad case, making it a safe choice. It simply uses a fixed number of steps, without needing big tables.
modified on Thursday, June 24, 2010 11:22 AM
|
|
|
|
|
Good links, thanks.
All bitmaps are mono.
I was thinking of taking the bits on a boolean swim to upper left of bm using bitblt()and then binary look for first zero row, etc. I don't have a swim algorihm though.
Does that get anyone's wheels turning?
*********START EDIT
Um, assume theoretical mono bitmaps so we avoid platform dependency and speed discussion. I can always xlate to MS at code time.
*********END EDIT
Tadeusz Westawic
Sum quid sum.
-- Modified Thursday, June 24, 2010 12:28 PM
|
|
|
|
|
Sorry, I have no idea what a swim algorithm is, google isn't being very helpful either..
|
|
|
|
|
A "theoretical mono bitmap"? How does it work? Just a grid of bools of unknown format - so tricks are out?
In that case you can't do any better than just test each pixel and increment a counter if the pixel is "true", that makes the question irrelevant so that probably isn't what you meant.
|
|
|
|
|
OK. "Swim" is my term to describe the motion of the bits in the bitmap (my wheel unsticks as I write).
In my actual code for other bitmap projects tricks are IN but only generic boolean blitting, no graphics languages per se but certainly take what a device interface gives. All need know are bm dims.
"Swim" would blit the lower portion of bm against the upper in binary shrinking row count.
Take OR blit to 'swim' all the bottom half bits to upper half bitmap.
Take AND on same two bitmaps to preserve the bits that could not swim due to collision with bits already in upper half.
.
.
Too fuzzy yet.
Tadeusz Westawic
Sum quid sum.
|
|
|
|
|
Well, that does not make sense to me. And you still haven't told why you would need a bit count.
|
|
|
|
|
What and why I count could be for any reason that yields meaningful results upon well-defined processing.
Can you do parallel swimming adders?
The generic term is 'collision counting' it is first related to 'collision detection'. Although the old raster arcade games employed these techniques extensively my work is not about games and accomodation in Windows is paltry; so lets say analysis where you want to count hits by vectors against a surface (in simplest form but my work is more mundane).
So, this problem implies we have some loci in discrete 3-space: the surface under test, and one each also for each projectile path, lets say 16 path locii.
Immediately map the discrete loci to congruent 2-space bitmaps, call the size (w x h).
Actuate three new bitmaps bmSRC, bmTGT and bmRSLT each (16w x 16h).
Tile bmSRC with 16 identical copies of the surface bm.
Tile bmTGT as mosaic of the 16 paths.
AND the bms into bmRSLT
The number of bits set in bmRSLT represents the number of paths intersecting the surface.
******************************
That looks like I solved 16 sets of simultaneous equations in parallel but if one needs to know "WHERE?" the collisions occurred then one is in need of an algorithm. So one does not employ thes techniques if one needs to know "WHERE?", but when the yes/no answer to "IF" will suffice.
You would also normally avoid "HOW MANY?", but in the case of a running model, one needs to benchmark and perform some "reality check" routines and count the actual number of bits to compare against a statistical projection.
Blit happens.
Tadeusz Westawic
Sum quid sum.
|
|
|
|
|
OK. It sounds like your bits are sparse then, so I'd go for a sequential count, probably like this (C#):
public int CountBits(uint[] data) {
int count=0;
foreach (uint dd in data) {
uint d=dd;
while(d!=0) {
count++;
d&=d-1;
}
}
return count;
}
or a table lookup, which works fine for bytes and maybe shorts (for 32-bit data you'd have to split in two or four bitgroups to keep the table small and cachable):
int[] bitCountInByte=new int[256];
public int CountBits(byte[] data) {
int count=0;
foreach (byte d in data) {
if (d!=0) count+=bitCountInByte[d];
}
return count;
}
|
|
|
|
|
I see.. I think.
If you also add NOT then you could build an adder out of that, I suppose..
That's a little weird though
|
|
|
|
|
|
No, not that way, although you should not lose your idea, it is applicable elsewhere especially using "color" bitmaps.
The room proctor wishes I explain myself so I have to go qualify for next level or some such. Patience,
****BEGIN EDIT
Consider 1-dimensional bm 16 bits wide as {0011 1011 0100 0110} nb 8 bits are set
OR low-order on high-order to get {0111 1111 dont care low-order 8 bits} nb 7 bits are set
AND low on hi to get {0000 0010} nb 1 bit is set, the one missing from the OR
The OR yields an undercount which is tallied by the AND.
****END EDIT
Tadeusz Westawic
Sum quid sum.
-- Modified Thursday, June 24, 2010 4:40 PM
|
|
|
|
|
Ok, but then it doesn't help at all - you then have two pieces that are both half the original length, but together they're just as many bits to count as you had before.
|
|
|
|
|
Yah, that's why its fuzzy and I call swim because the lower half now has to shift and try again, or something.
I truly dk, this is one that I never mastered. I thought my post would generate interest but I hoped there might be someone else out there active with this sort of thing. More specifically, someone active in this particular forum who is doing the same nonsense I am. I notice here there dwell engineers, most sites are full of k-12 folks but my math is weak for formally describing what is going-on in my work.
I have parallel rotate xforms for discrete 8x8x8 manifolds, parallel collision and collision detectors for the same but no bit counter.
Tadeusz Westawic
Sum quid sum.
|
|
|
|
|
Hi,
the AND and OR stuff doesn't bring you a step closer; consider two bits A and B, their AND and OR:
A B AND OR
0 0 0 0
1 1 1 1
0 1 0 1
1 0 0 1
So, yes, the number of bits set remains the same when you replace A and B by (A AND B) and (A OR B). In fact there is only one out of four lines that shows a difference. But so what? it still is two bits that need to be counted.
There are several ways to count bits in a word; I have shown you the obvious ones; Harold has provided a complex mask-and-shift one; there are variations on those, especially when not all bits in a word can be set (say you have a guaranteed zero in every odd position (0a0b0c...). But none of those will be remarkably faster overall.
|
|
|
|
|
I think I have failed to impress the nature of the problem and the constraint that it be a bitmap. Too bad.
There are very good links from aptroot. And his links may indeed take me away.
Your supposition that the OR and AND at the top of a swim loop buy nothing is wrong.
No one has yet suggested a CA solution.
Oh yeah, my notion of "swim" came from feeding fish in a tank: if you sprinkle the food in a corner they all travel from random locations in the tank to the corner with the food.
Tadeusz Westawic
Sum quid sum.
|
|
|
|
|
Harold,
Seems that SSE instructions (PUNPCCKLBW) could be used to split up BYTES to WORDS, then the WORDS (actually LoByte of the word) used to table lookup to convert to a count, then packed add could be used to accumulate these counts, then further split to DWORDS and repeat. It is amazing what speedups can be seen with SSE conversions of simple things like string compare and strchr.
Dave.
|
|
|
|
|
Probably, but SSE has POPCNT as well.. ok that's SSE4, not just SSE2 like PUNPCKLBW (which I assume you mean)
I didn't 1-vote you, btw.
|
|
|
|
|
Harold,
Yes, I was referring to SSE2. Almost everyone has it. And, I just never worry about the voting, but it seems seems other Helpful Harrys have bumped it up.
Dave.
|
|
|
|
|
in my experience, a table lookup can go a long way, using either an 8-bit or a 16-bit index, and repeating for each 8/16-bit fraction of the data.
And then, are you sure you need the count? and nothing else?
chances are counting the bits is bus bandwidth limited already, so combining it with another operation may pay off big time.
|
|
|
|
|
Anyone know a desktop calculator that takes mixed hex and decimal inputs?
Tadeusz Westawic
An ounce of Clever is worth a pound of Experience.
|
|
|
|
|
Windows 7 calculator does this, and I know there are other freebies out there. Google will tell you - ask for programmer calculator.
It's time for a new signature.
|
|
|
|
|
new sig
Tadeusz Westawic
Caveat juris excretorium.
|
|
|
|
|
Tadeusz Westawic wrote: mind is failing in old age
Sounds like me!
It's time for a new signature.
|
|
|
|
|
In my opinion Google is a good way to study!
|
|
|
|
|
where is the algorithm question?
|
|
|
|