LFSRs and Python random

This is a combination of an explanation of LFSRs and a writeup of the following challenge I made for a recent CTF:

1import functools, operator, random
2random.seed(31337)
3print((functools.reduce(operator.xor, (random.getrandbits(256) for _ in range(73133736**7)))^0x458d5dbe9dd57b3a8780cfe94cbb36f14bf4f33ce4179eccecb4bd674de26d47).to_bytes(32, 'big').decode())

The tl;dr is that a known random seed is used and then we compute random.getrandbits(256) ^ random.getrandbits(256) ^ ... ^ random.getrandbits(256) a large number of times, the result is XORed with some constant and then printed, we can safely assume that this result will be the flag.

Just run the script

Sadly the $73133736^7$ number of times the inner loop needs to run is around $2^{183}$ which is outside the scope of the world’s current computing capabilities, so we need to find a mathematical shortcut.

MT19937

The implementation of the random module in CPython (the implementation everyone uses) is based on the RNG known as MT19937 (Mersenne Twister 19937) and falls in the class of LFSRs (Linear Feedback Shift Registers).

This is what an explicit implementation of MT19937 might look like

 1class MT19937:
 2	w, n = 32, 624
 3	f = 1812433253
 4	m, r = 397, 31
 5	a = 0x9908B0DF
 6	d, b, c = 0xFFFFFFFF, 0x9D2C5680, 0xEFC60000
 7	u, s, t, l = 11, 7, 15, 18
 8
 9	def __init__(self, seed):
10		self.X = [0] * MT19937.n
11		self.cnt = 0
12		self.initialize(seed)
13
14	def initialize(self, seed):
15		...  # not important
16
17	def twist(self):
18		for i in range(MT19937.n):
19			lower_mask = (1 << MT19937.r) - 1
20			upper_mask =  (~lower_mask) & ((1 << MT19937.w) - 1)
21			tmp = (self.X[i] & upper_mask) + (self.X[(i + 1) % MT19937.n] & lower_mask)
22			tmpA = tmp >> 1
23			if tmp & 1:
24				tmpA ^= MT19937.a
25			self.X[i] = self.X[(i + MT19937.m) % MT19937.n] ^ tmpA
26		self.cnt = 0
27
28	def temper(self):
29		if self.cnt == MT19937.n:
30			self.twist()
31		y = self.X[self.cnt]
32		y = y ^ ((y >> MT19937.u) & MT19937.d)
33		y = y ^ ((y << MT19937.s) & MT19937.b)
34		y = y ^ ((y << MT19937.t) & MT19937.c)
35		y = y ^ (y >> MT19937.l)
36		self.cnt += 1
37		return y & ((1 << MT19937.w) - 1)

“Naive” approach

You will notice that it contains a state of $624$ $32$ -bit elements and that the twist operation modifies this whole state. We then sequentially output each of the $624$ elements when a new random element is requested in temper, and when we run out of elements we call twist again to refresh the state.

The important parts to notice here is that everything is linear over $\mathbb{F}_2$ , both in twist and temper. The XORs and shifts are linear operators which we can neatly blackbox into “some linear operator”. This means that if we denote the initial state as $x_0$ then we have that $x_1 = T x_0$ , for some matrix $T$ which represents the twisting operation, and more generally that $x_i = T^i x_0$ .

This state vector encompasses several $32$ -bit outputs, each of which need to be run through temper. So let $E$ be the $32 \times 32$ matrix which represents temper, and then let

E_{\text{full}} = \begin{bmatrix} E & 0 & \cdots & 0 \\ 0 & E & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & E \end{bmatrix}

With $624$ blocks of $E$ on the diagonal. This will apply temper to each $32$ -bit slice of the state.

The actual values which are output will be $256$ -bit slices of these state vectors. We are dealing with $256$ -bit outputs of which there fits $(624 \cdot 32) / 256 = 78$ in each state. Therefore we finally let $I_n$ be the $n \times n$ identity matrix, and let

S = \begin{bmatrix} I_{256} & I_{256} & \cdots & I_{256} \end{bmatrix}

With $78$ identity matrices spread out. This will essentially sum all the $256$ -bit slices of the state into a single $256$ -bit number. Putting this all together means that the XOR-sum of outputs [78*i:78*(i+1)] is given by

S E_{\text{full}} x_i = S E_{\text{full}} T^i x_0

Consequently the total sum that we are interested in is given by.

\begin{align*} &S E_{\text{full}} x_0 + S E_{\text{full}} T x_0 + S E_{\text{full}} T^2 x_0 + ... + S E_{\text{full}} T^{\frac{73133736^7}{78}} x_0 \\ = &S E_{\text{full}} (I + T + T^2 + ... + T^{\frac{73133736^7}{78}}) x_0 \end{align*}

I was nice and made it so that $78 | 73133736^7$ . This is still a very long sum, but since it is a geometric series we can use the formula for the sum of a geometric series to get

S E_{\text{full}} (I - T)^{-1} (I - T^{\frac{73133736^7}{78}}) x_0

This only involves calculating $T$ to a large exponent, and this is doable in only $\log n$ matrix multiplications using the binary exponentiation algorithm.

This is still a pretty huge matrix so it will take a couple of minutes to compute, but it is completely doable.

The cooler approach

There is a way to compute this in less than a second and it trades linear algebra for abstract algebra. This isn’t a full explanation if you don’t know about LFSRs and minimal polynomials already, it’s more of a brief overview. The solve script is as follows

 1from sage.all import *
 2
 3import random
 4from sage.matrix.berlekamp_massey import berlekamp_massey
 5from functools import reduce
 6from operator import xor
 7
 8n = 73133736**7
 9minpoly = berlekamp_massey([GF(2)(random.getrandbits(256)) for _ in range(19937*2)])
10
11x = GF((2, 19937), 'x', modulus=minpoly, check_irreducible=False).gen()
12
13random.seed(31337)
14totsum = reduce(xor, (random.getrandbits(256) * bool(c) for c in (1 - x**n) / (1 - x)))
15totsum ^= 0x458d5dbe9dd57b3a8780cfe94cbb36f14bf4f33ce4179eccecb4bd674de26d47
16
17flag = totsum.to_bytes(32, 'big').decode()
18print(flag)

The berlekamp_massey function is used to find the minimal polynomial of the LFSR, which we happen to know beforehand to be of degree $19937$ , which is why we need at least $19937 \cdot 2$ samples because of how Berlekamp-Massey works. Let $p(x)$ be this minimal polynomial, the defining property is that if we let

p(x) = a_0 + a_1 x + a_2 x^2 + ... + a_k x^k

And let $x_0, x_1, x_2, ...$ be the output of the LFSR, then we have that

\sum_{i=0}^k a_i x_i = 0

We essentially treat each output as sequentially higher powers of the polynomial indeterminate $x$ .

The important thing to notice is that any expression which is a sum of LFSR outputs can be written as a polynomial, and if we reduce this polynomial modulo $p(x)$ (since we know it to be zero) then we can always obtain a polynomial of degree $\leq k = 19937$ which evaluates to the same value.

This means that such expressions can actually be seen as part of the quotient ring $\mathbb{F}_2[x]/\langle p(x) \rangle$ , and since $p(x)$ happens to be irreducible this is $\cong \mathbb{F}_{2^{deg(p)}} = \mathbb{F}_{2^{19937}}$ . This is a commonly used field so the implementation which Sage uses is very efficient, this is what’s constructed and which x is then set to be the generator of (i.e the polynomial indeterminate).

We then once again use the sum of a geometric series to compute

1 + x + x^2 + ... + x^{n-1} = \frac{1 - x^n}{1-x}

where $n=73133736^7$ . The result will once again be a polynomial of degree $\leq 19937$ which tells us that the expression we just computed is equal to a certain linear combination of the first $19937$ outputs of the LFSR.

We thus seed our RNG with the known seed and compute these first $19937$ outputs, XOR them together in the way described by the polynomial and this will be our final result.