# Project Euler Problem 3: Largest prime factor

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

#### Version 1 - Inefficient Trial Division¶

First a function to generate *all* factors. We need only test candidate factors up to $n$.

```
from six.moves import filter, map, range, reduce
```

```
factors = lambda n: list(filter(lambda x: not n % x, range(1, n+1)))
```

```
factors(24)
```

This isn't really that helpful, since we are only interested in *prime* factors. While we can implement a primality predicate and filter out composites numbers, it isn't particularly efficient. Let's rethink our approach.

Instead of iterating over all integers up to $n$, we can just iterate over the prime numbers. But first we'll need a prime number generator. The simple Sieve of Eratosthenes (250s BCE) seems like a good place to start.

#### Version 2 - Trial Division with Prime Number Sieve¶

```
def _prime_sieve_eratosthenes(n):
composites = set()
for i in range(2, n):
if i not in composites:
yield i
composites.update(set(range(i*i, n, i)))
```

```
list(_prime_sieve_eratosthenes(47))
```

```
list(_prime_sieve_eratosthenes(15))
```

Note that this function returns a list of all prime numbers strictly below $n$, so even if $n$ is prime, it will not be included.

Of course, there are ways to optimize this implementation, but this will do for now.

```
prime_sieve = _prime_sieve_eratosthenes
```

Note that we need not inspect all primes up to $n$, but only those up to $\sqrt{n}$. To see why, let's assume $p \times q = n$ and that $p > \sqrt{n}$. It cannot be the case that $q > p$, since that would imply $n = p \times q > p^2 > n$. But if $q \leq p$ and $q$ divides $n$, then it would have been detected earlier anyways. Therefore, we need only consider $p \leq \sqrt{n} \Leftrightarrow p^2 \leq n$, where we have equality if $n$ is a perfect square.

```
from collections import Counter
def prime_factors(n):
if n < 0:
raise ValueError('Requires positive integer')
elif n == 1:
raise ValueError('1 is neither prime nor composite and has no prime factors')
else:
prime_factors_ = Counter()
for p in prime_sieve(int(n**0.5)+1):
while n % p == 0:
prime_factors_[p] += 1
n /= p
if n > 1: # n has no further prime factors
prime_factors_[n] = 1
return prime_factors_
```

In accodance with the above, the last element of `prime_sieve(int(n**0.5)+1)`

will be the largest prime $p$ such that $p < \lfloor \sqrt{n} \rfloor + 1 \Leftrightarrow p \leq \lfloor \sqrt{n} \rfloor \Leftrightarrow p \leq \sqrt{n}$ since $p$ must be an integer.

```
prime_factors(8)
```

```
prime_factors(15)
```

```
prime_factors(47)
```

```
prime_factors(2)
```

```
prime_factors(1)
```

```
prime_factors(6)
```

```
prime_factors(1883)
```

```
prime_factors(13195)
```

```
a = prime_factors(600851475143); a
```

```
max(a.keys())
```

Note that we could have just used a list to keep track of the prime factors, or even a set if we don't care about the number of appearances of each prime factor. However, given that we do, the best option was to use a `Counter`

from Python's `collections`

library. Using it has benefits beyond cleaner code, as we shall see.

Recall that the least common multiple (LCM) and greatest common divisor (GCD) of integers can both be obtained through prime factorization of the integers. For the GCD, this is simply the "intersection" of the prime factors, while for LCM, this is simply the "union".

Thus, we can directly make use of a `Counter`

's default mathematical operations it supports. Namely, the following:

```
c = Counter(a=3, b=1)
d = Counter(a=1, b=2)
```

```
c & d # intersection: min(c[x], d[x])
```

```
c | d # union: max(c[x], d[x])
```

Therefore, we can directly use `prime_factor`

to implement the `lcm`

and `gcd`

functions.

```
# First lets create a function to reconstruct
# the original integer from its prime factors
reconstruct = lambda c: reduce(lambda x, y: x*y, map(lambda x: x**c[x], c))
```

```
prime_factors(48)
```

```
reconstruct(prime_factors(48))
```

```
reconstruct(prime_factors(165)) == 165
```

Looking good. Now let's implement `lcm`

and `gcd`

.

```
lcm = lambda *ns: reconstruct(reduce(lambda n, m: n | m, map(prime_factors, ns)))
```

```
gcd = lambda *ns: reconstruct(reduce(lambda n, m: n & m, map(prime_factors, ns)))
```

```
lcm(48, 180)
```

```
gcd(48, 180)
```

To walk through what this is doing:

```
c = prime_factors(48); c
```

```
d = prime_factors(180); d
```

```
c | d
```

```
c & d
```

```
reconstruct(c | d)
```

```
reconstruct(c & d)
```

All done! We must note, however, that since our integer factorization isn't particularly efficient, and even worse, that there is no known general efficient algorithm for integer factorization, this is not the best way to compute the `gcd`

and `lcm`

for large numbers. I only implement it like this here to demonstrate further applications of `prime_factors`

by taking advantage of Python's `Counter`

. Note that the `lcm`

can be obtained by having the `gcd`

and there are many efficient algorithms for computing the `gcd`

, such as the Euclidean algorithm. We shall implement them later.

## Comments

Comments powered by Disqus