Project Euler Problem 3: Largest prime factor
The prime factors of 13195 are 5, 7, 13 and 29.
What is the largest prime factor of the number 600851475143 ?
Version 1 - Inefficient Trial Division¶
First a function to generate all factors. We need only test candidate factors up to $n$.
from six.moves import filter, map, range, reduce
factors = lambda n: list(filter(lambda x: not n % x, range(1, n+1)))
factors(24)
This isn't really that helpful, since we are only interested in prime factors. While we can implement a primality predicate and filter out composites numbers, it isn't particularly efficient. Let's rethink our approach.
Instead of iterating over all integers up to $n$, we can just iterate over the prime numbers. But first we'll need a prime number generator. The simple Sieve of Eratosthenes (250s BCE) seems like a good place to start.
Version 2 - Trial Division with Prime Number Sieve¶
def _prime_sieve_eratosthenes(n):
composites = set()
for i in range(2, n):
if i not in composites:
yield i
composites.update(set(range(i*i, n, i)))
list(_prime_sieve_eratosthenes(47))
list(_prime_sieve_eratosthenes(15))
Note that this function returns a list of all prime numbers strictly below $n$, so even if $n$ is prime, it will not be included.
Of course, there are ways to optimize this implementation, but this will do for now.
prime_sieve = _prime_sieve_eratosthenes
Note that we need not inspect all primes up to $n$, but only those up to $\sqrt{n}$. To see why, let's assume $p \times q = n$ and that $p > \sqrt{n}$. It cannot be the case that $q > p$, since that would imply $n = p \times q > p^2 > n$. But if $q \leq p$ and $q$ divides $n$, then it would have been detected earlier anyways. Therefore, we need only consider $p \leq \sqrt{n} \Leftrightarrow p^2 \leq n$, where we have equality if $n$ is a perfect square.
from collections import Counter
def prime_factors(n):
if n < 0:
raise ValueError('Requires positive integer')
elif n == 1:
raise ValueError('1 is neither prime nor composite and has no prime factors')
else:
prime_factors_ = Counter()
for p in prime_sieve(int(n**0.5)+1):
while n % p == 0:
prime_factors_[p] += 1
n /= p
if n > 1: # n has no further prime factors
prime_factors_[n] = 1
return prime_factors_
In accodance with the above, the last element of prime_sieve(int(n**0.5)+1)
will be the largest prime $p$ such that $p < \lfloor \sqrt{n} \rfloor + 1 \Leftrightarrow p \leq \lfloor \sqrt{n} \rfloor \Leftrightarrow p \leq \sqrt{n}$ since $p$ must be an integer.
prime_factors(8)
prime_factors(15)
prime_factors(47)
prime_factors(2)
prime_factors(1)
prime_factors(6)
prime_factors(1883)
prime_factors(13195)
a = prime_factors(600851475143); a
max(a.keys())
Note that we could have just used a list to keep track of the prime factors, or even a set if we don't care about the number of appearances of each prime factor. However, given that we do, the best option was to use a Counter
from Python's collections
library. Using it has benefits beyond cleaner code, as we shall see.
Recall that the least common multiple (LCM) and greatest common divisor (GCD) of integers can both be obtained through prime factorization of the integers. For the GCD, this is simply the "intersection" of the prime factors, while for LCM, this is simply the "union".
Thus, we can directly make use of a Counter
's default mathematical operations it supports. Namely, the following:
c = Counter(a=3, b=1)
d = Counter(a=1, b=2)
c & d # intersection: min(c[x], d[x])
c | d # union: max(c[x], d[x])
Therefore, we can directly use prime_factor
to implement the lcm
and gcd
functions.
# First lets create a function to reconstruct
# the original integer from its prime factors
reconstruct = lambda c: reduce(lambda x, y: x*y, map(lambda x: x**c[x], c))
prime_factors(48)
reconstruct(prime_factors(48))
reconstruct(prime_factors(165)) == 165
Looking good. Now let's implement lcm
and gcd
.
lcm = lambda *ns: reconstruct(reduce(lambda n, m: n | m, map(prime_factors, ns)))
gcd = lambda *ns: reconstruct(reduce(lambda n, m: n & m, map(prime_factors, ns)))
lcm(48, 180)
gcd(48, 180)
To walk through what this is doing:
c = prime_factors(48); c
d = prime_factors(180); d
c | d
c & d
reconstruct(c | d)
reconstruct(c & d)
All done! We must note, however, that since our integer factorization isn't particularly efficient, and even worse, that there is no known general efficient algorithm for integer factorization, this is not the best way to compute the gcd
and lcm
for large numbers. I only implement it like this here to demonstrate further applications of prime_factors
by taking advantage of Python's Counter
. Note that the lcm
can be obtained by having the gcd
and there are many efficient algorithms for computing the gcd
, such as the Euclidean algorithm. We shall implement them later.
Comments
Comments powered by Disqus