
fast modular reduction



During the Crypto '95 Rump Session, Josh Benaloh of Microsoft Corp. 
presented a new modular reduction algorithm that he and I developed.  It 
is faster than the Montgomery method by about 10 to 15%, and is more 
general and easier to understand.  The central idea is that it is easy to 
reduce a number to an equivalent one that's just one "block" (machine 
word) longer than the modulus, by repeatedly subtracting off the highest 
block and adding back something that's congruent to it modulo the 
modulus, but smaller.

In the following pseudocode, B is the radix in which the numbers are 
represented (2^32 for a 32-bit machine), n is the length of the modulus 
in blocks, U is B^(n+1) mod the modulus, X is the number to be reduced, 
k+1 is the length of X in blocks, and Y is the result.  Y[i] denotes 
block i of Y, with block 0 the least significant.

1. Y = X
2. For i from k down to n+1, repeat steps 3 and 4
3.	Y = Y - Y[i] * B^i + Y[i] * U * B^(i-n-1)
4.	If Y >= B^i, then Y = Y - B^i + U * B^(i-n-1)
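
The steps above can be sketched in Python, relying on its built-in 
arbitrary-precision integers rather than explicit arrays of blocks.  This 
is only an illustrative sketch, not the implementation presented at the 
rump session: the function name reduce_to_n_plus_1_blocks and the 
block_bits parameter are hypothetical, and no attempt is made here to 
eliminate step 4 or to do the final reduction to n blocks.

```python
def reduce_to_n_plus_1_blocks(x, modulus, block_bits=32):
    """Return y congruent to x (mod modulus) with y < B**(n+1).

    Assumes x >= 0 and modulus > 0.  B = 2**block_bits is the radix.
    """
    # n = length of the modulus in blocks (ceiling division on the bit length)
    n = max(1, -(-modulus.bit_length() // block_bits))
    # U = B^(n+1) mod modulus; would be precomputed once per modulus
    u = pow(1 << block_bits, n + 1, modulus)
    # k + 1 = length of x in blocks
    k = max(0, -(-x.bit_length() // block_bits) - 1)

    y = x                                            # step 1
    for i in range(k, n, -1):                        # step 2: i = k down to n+1
        b_i = 1 << (i * block_bits)                  # B^i
        u_shifted = u << ((i - n - 1) * block_bits)  # U * B^(i-n-1)
        top = y >> (i * block_bits)                  # Y[i]; higher blocks are zero
        y = y - top * b_i + top * u_shifted          # step 3
        if y >= b_i:                                 # step 4
            y = y - b_i + u_shifted
    return y


if __name__ == "__main__":
    m = (1 << 127) - 1        # arbitrary example modulus, 4 blocks of 32 bits
    x = 3 ** 400              # arbitrary example input, much longer than m
    y = reduce_to_n_plus_1_blocks(x, m)
    assert y % m == x % m     # same residue class
    assert y < 1 << (32 * 5)  # at most n+1 = 5 blocks long
```

Each pass through the loop leaves Y below B^i, so the block picked up 
on the next pass is always a single block, and the final Y fits in n+1 
blocks.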

Tricks can be used to eliminate step 4, and to reduce Y to n blocks using 
a single single-precision division and n more single-precision 
multiplications.  The algorithm will hopefully be written up more 
completely soon.

Wei Dai