[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: Dr. Dobbs Dev. Update 1/5 July 94 & Schneier
> From: tim werner <[email protected]>
> The loop was really a waste when you consider that it could
> have been done in 1 instruction.
You can do better than a bit-serial loop -- though not down to
one instruction! There are a lot of very cool approaches, only
one of which I remember.
Look at the problem as that of finding the sum of n 1-bit blocks.
Well, we can easily find the sum of a single n-bit block. The
intermediate conversions are the magic part.
Let's look at an 8-bit word. How shall we get, for example, from a
sum of 4 2-bit blocks to a sum of 2 4-bit blocks? What we do is add
adjacent blocks. The block-pair sums will actually fit in three
bits, so they'll certainly fit in four without overflowing. And all
of this can be done bit-parallel using logic ops.
In C, this looks like:
int byte_ones(int a)
// hope this is correct...
{
a = (a & 0x55) + (a & 0xAA)/2; // 0x55 == 01010101b
a = (a & 0x33) + (a & 0xCC)/4; // 0x33 == 00110011b
a = (a & 0x0F) + (a & 0xF0)/16; // 0x0F == 00001111b
return a;
}
Oh, and one AND in the third line is superfluous. This is not the
fastest algorithm for this, but it's the only one I understand and
remember.
Eli [email protected]
(I won't ask why you needed a one-hot encoding in the first place...)