Friday, March 20, 2009

Modular Exponentiation

I have a post cooking on wheel factorization, yet another sieving algorithm for the computation of all primes below a given number n. It turns out that while implementing it, one comes across a number of not so obvious preliminary questions. It is a rule of this blog not to resort to any non standard library, and I feel it would be breaking the spirit of that rule to post code which hasn't been thoroughly explained in advance. So in order not to make that soon-to-be post on prime numbers longer than anyone could bear, I find myself in need of covering a couple of preliminaries, and so here comes this thrilling post on modular exponentiation...

What we are after is determining ab (mod m), or to put it in plain English, the remainder of a to the power of b, divided by m. Doesn't seem much, does it? We could just go ahead and write:

def modExp1(a, b, m) :
"""Computes a to the power b, modulo m, directly
return a**b % m

The problems begin if a and b are both big. Python's integers don't overflow, and we all love it for it, but the memory juggling it has to do to accommodate a huge number in memory does come at a speed cost. Specially if m is small, it definitely seems a little overkill to keep a zillion digit number in memory, given that the answer will be smaller than m...

We can, and will, use modular arithmetic to our advantage. See, wikipedia says that "the congruence relation (introduced by modular arithmetic) on the integers is compatible with the multiplication operation of the ring of integers," and if it's on the internet it has to be true, right? Translated to plain English, the idea is that if we want to take the product of two numbers, a and b, modulo m, we can first multiply them, and then take the m-modulo of the result, or we can take the m-modulo of a and/or b, prior to multiplication, and the m-modulo of the result will still be the right answer. So the least a conscious programmer should do is to rewrite the previous function as:

def modExp2(a, b, m) :
"""Computes a to the power b, modulo m, a little less directly
return (a%m)**b % m

Of course, if b is large enough, you may still find yourself in the realm of zillion digit numbers, which is a place we don't really want to go. It turns out then that it may be a good option to remember that exponentiation is just repeated multiplication. If we multiply a by itself b times, and take the m-modulo of the result, the larger number we will ever have in memory will be m2. It doesn't really seem right to put a loop of b iterations into the code, since what we are worried about are large b's, but lets do it anyway:

def modExp3(a, b, m) :
"""Computes a to the power b, modulo m, by iterative multiplication
ret = 1
a %= m
while b :
ret *= a
ret %= m
b -= 1
return ret

The only reason why this last function deserves a place here, is because it helps introduce the really cool algorithm that this whole post really is about, binary exponentiation. In John D. Cook's blog there is a great explanation on how it could be implemented iteratively, based on the binary representation of the exponent, but I'll go for a recursive version, which is essentially the same, but which I find easier for my brain's wiring...

So what we are about to do is, when asked to compute ab, check if b is even, in which case we will compute it as the square of ab/2. If b is odd, we will compute it as a ab-1. If b is 0, we will of course return 1. If we also throw in all the modulo m stuff I have been discussing previously, we finally arrive at THE function to do modular exponentiation:

def modExp(a, b, m) :
"""Computes a to the power b, modulo m, using binary exponentiation
a %= m
ret = None

if b == 0 :
ret = 1
elif b%2 :
ret = a * modExp(a,b-1,m)
else :
ret = modExp(a,b//2,m)
ret *= ret

return ret%m

Now that we have all four versions of the same function written, we can do a little testing, I will leave writing the testing code as an exercise for the reader (I've been wanting to say this since my first year of university...), but when computing 1234 to the power 5678, modulo 90, a 1000 times with each function, these is the performance I get:

>>> testModExp(1234,5678,90,1000)
Got 46 using method 1 in 3.73399996758 sec.
Got 46 using method 2 in 0.546999931335 sec.
Got 46 using method 3 in 1.25 sec.
Got 46 using THE method in 0.0150001049042 sec.

Results which really don't require much commenting...

EDIT: There is some very interesting information in the comments: python comes with a built-in pow function, which can take two or three arguments, the third being, precisely, the modulo. It probably implements an algorithm similar to what has been presented, but is more than an order of magnitude faster.


  1. Interesting article, and interesting blog which I look forward to reading more of.

    However, correct me if I am wrong, but the functions described here are redundant for Python users.
    Python has the function
    pow(x, y[, z]) -> number

    With two arguments, equivalent to x**y. With three arguments, equivalent to (x**y) % z, but may be more efficient (e.g. for longs).

    which efficiently (in C) implements modular exponentiation.

  2. You are absolutely right. Thanks for the free python lesson... Calling pow(1234,5678,90) is about 17 times faster than calling the above's modExp(1234,5678,90).

    I actually wrote this entry thinking that, to compute the multiplicative inverse of a number modulo a prime number, it would be faster to use direct modular exponentiation than to use the extended euclidean algorithm. Well, it wasn't, but for a small factor, but I anyway discarded half the post and code I had written. I have now rechecked, and it seems my intuition was right, since at least for prime modulos smaller than ~50,000, if the modular exponentiation is done without any intermediate redundant function such as mine, direct exponentition is faster than Euclid's...

    I'm going to have to do some recoding and testing, but your comment is definitely going to require redoing (extending) the entry on the modular multiplicative inverse, and may require extensive rework of the one on wheel factorization, which may also speed up considerably...