LFSRs and Python random

Blog post or writeup? I don’t know

This is a combination of an explanation of LFSRs and a writeup of the following challenge I made for a recent CTF:

1import functools, operator, random
2random.seed(31337)
3print((functools.reduce(operator.xor, (random.getrandbits(256) for _ in range(73133736**7)))^0x458d5dbe9dd57b3a8780cfe94cbb36f14bf4f33ce4179eccecb4bd674de26d47).to_bytes(32, 'big').decode())

The tl;dr is that a known random seed is used and then we compute random.getrandbits(256) ^ random.getrandbits(256) ^ ... ^ random.getrandbits(256) a large number of times, the result is XORed with some constant and then printed, we can safely assume that this result will be the flag.

Just run the script

Sadly the 73133736773133736^7 number of times the inner loop needs to run is around 21832^{183} which is outside the scope of the world’s current computing capabilities, so we need to find a mathematical shortcut.

MT19937

The implementation of the random module in CPython (the implementation everyone uses) is based on the RNG known as MT19937 (Mersenne Twister 19937) and falls in the class of LFSRs (Linear Feedback Shift Registers).

This is what an explicit implementation of MT19937 might look like

 1class MT19937:
 2	w, n = 32, 624
 3	f = 1812433253
 4	m, r = 397, 31
 5	a = 0x9908B0DF
 6	d, b, c = 0xFFFFFFFF, 0x9D2C5680, 0xEFC60000
 7	u, s, t, l = 11, 7, 15, 18
 8
 9	def __init__(self, seed):
10		self.X = [0] * MT19937.n
11		self.cnt = 0
12		self.initialize(seed)
13
14	def initialize(self, seed):
15		...  # not important
16
17	def twist(self):
18		for i in range(MT19937.n):
19			lower_mask = (1 << MT19937.r) - 1
20			upper_mask =  (~lower_mask) & ((1 << MT19937.w) - 1)
21			tmp = (self.X[i] & upper_mask) + (self.X[(i + 1) % MT19937.n] & lower_mask)
22			tmpA = tmp >> 1
23			if tmp & 1:
24				tmpA ^= MT19937.a
25			self.X[i] = self.X[(i + MT19937.m) % MT19937.n] ^ tmpA
26		self.cnt = 0
27
28	def temper(self):
29		if self.cnt == MT19937.n:
30			self.twist()
31		y = self.X[self.cnt]
32		y = y ^ ((y >> MT19937.u) & MT19937.d)
33		y = y ^ ((y << MT19937.s) & MT19937.b)
34		y = y ^ ((y << MT19937.t) & MT19937.c)
35		y = y ^ (y >> MT19937.l)
36		self.cnt += 1
37		return y & ((1 << MT19937.w) - 1)

“Naive” approach

You will notice that it contains a state of 624624 3232-bit elements and that the twist operation modifies this whole state. We then sequentially output each of the 624624 elements when a new random element is requested in temper, and when we run out of elements we call twist again to refresh the state.

The important parts to notice here is that everything is linear over F2\mathbb{F}_2, both in twist and temper. The XORs and shifts are linear operators which we can neatly blackbox into “some linear operator”. This means that if we denote the initial state as x0x_0 then we have that x1=Tx0x_1 = T x_0, for some matrix TT which represents the twisting operation, and more generally that xi=Tix0x_i = T^i x_0.

This state vector encompasses several 3232-bit outputs, each of which need to be run through temper. So let EE be the 32×3232 \times 32 matrix which represents temper, and then let

Efull=[E000E000E] E_{\text{full}} = \begin{bmatrix} E & 0 & \cdots & 0 \\ 0 & E & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & E \end{bmatrix}

With 624624 blocks of EE on the diagonal. This will apply temper to each 3232-bit slice of the state.

The actual values which are output will be 256256-bit slices of these state vectors. We are dealing with 256256-bit outputs of which there fits (62432)/256=78(624 \cdot 32) / 256 = 78 in each state. Therefore we finally let InI_n be the n×nn \times n identity matrix, and let

S=[I256I256I256] S = \begin{bmatrix} I_{256} & I_{256} & \cdots & I_{256} \end{bmatrix}

With 7878 identity matrices spread out. This will essentially sum all the 256256-bit slices of the state into a single 256256-bit number. Putting this all together means that the XOR-sum of outputs [78*i:78*(i+1)] is given by

SEfullxi=SEfullTix0 S E_{\text{full}} x_i = S E_{\text{full}} T^i x_0

Consequently the total sum that we are interested in is given by.

SEfullx0+SEfullTx0+SEfullT2x0+...+SEfullT73133736778x0=SEfull(I+T+T2+...+T73133736778)x0 \begin{align*} &S E_{\text{full}} x_0 + S E_{\text{full}} T x_0 + S E_{\text{full}} T^2 x_0 + ... + S E_{\text{full}} T^{\frac{73133736^7}{78}} x_0 \\ = &S E_{\text{full}} (I + T + T^2 + ... + T^{\frac{73133736^7}{78}}) x_0 \end{align*}

I was nice and made it so that 7873133736778 | 73133736^7. This is still a very long sum, but since it is a geometric series we can use the formula for the sum of a geometric series to get

SEfull(IT)1(IT73133736778)x0 S E_{\text{full}} (I - T)^{-1} (I - T^{\frac{73133736^7}{78}}) x_0

This only involves calculating TT to a large exponent, and this is doable in only logn\log n matrix multiplications using the binary exponentiation algorithm.

This is still a pretty huge matrix so it will take a couple of minutes to compute, but it is completely doable.

The cooler approach

There is a way to compute this in less than a second and it trades linear algebra for abstract algebra. This isn’t a full explanation if you don’t know about LFSRs and minimal polynomials already, it’s more of a brief overview. The solve script is as follows

 1from sage.all import *
 2
 3import random
 4from sage.matrix.berlekamp_massey import berlekamp_massey
 5from functools import reduce
 6from operator import xor
 7
 8n = 73133736**7
 9minpoly = berlekamp_massey([GF(2)(random.getrandbits(256)) for _ in range(19937*2)])
10
11x = GF((2, 19937), 'x', modulus=minpoly, check_irreducible=False).gen()
12
13random.seed(31337)
14totsum = reduce(xor, (random.getrandbits(256) * bool(c) for c in (1 - x**n) / (1 - x)))
15totsum ^= 0x458d5dbe9dd57b3a8780cfe94cbb36f14bf4f33ce4179eccecb4bd674de26d47
16
17flag = totsum.to_bytes(32, 'big').decode()
18print(flag)

The berlekamp_massey function is used to find the minimal polynomial of the LFSR, which we happen to know beforehand to be of degree 1993719937, which is why we need at least 19937219937 \cdot 2 samples because of how Berlekamp-Massey works. Let p(x)p(x) be this minimal polynomial, the defining property is that if we let

p(x)=a0+a1x+a2x2+...+akxk p(x) = a_0 + a_1 x + a_2 x^2 + ... + a_k x^k

And let x0,x1,x2,...x_0, x_1, x_2, ... be the output of the LFSR, then we have that

i=0kaixi=0 \sum_{i=0}^k a_i x_i = 0

We essentially treat each output as sequentially higher powers of the polynomial indeterminate xx.

The important thing to notice is that any expression which is a sum of LFSR outputs can be written as a polynomial, and if we reduce this polynomial modulo p(x)p(x) (since we know it to be zero) then we can always obtain a polynomial of degree k=19937\leq k = 19937 which evaluates to the same value.

This means that such expressions can actually be seen as part of the quotient ring F2[x]/p(x)\mathbb{F}_2[x]/\langle p(x) \rangle, and since p(x)p(x) happens to be irreducible this is F2deg(p)=F219937\cong \mathbb{F}_{2^{deg(p)}} = \mathbb{F}_{2^{19937}}. This is a commonly used field so the implementation which Sage uses is very efficient, this is what’s constructed and which x is then set to be the generator of (i.e the polynomial indeterminate).

We then once again use the sum of a geometric series to compute

1+x+x2+...+xn1=1xn1x1 + x + x^2 + ... + x^{n-1} = \frac{1 - x^n}{1-x}

where n=731337367n=73133736^7. The result will once again be a polynomial of degree 19937\leq 19937 which tells us that the expression we just computed is equal to a certain linear combination of the first 1993719937 outputs of the LFSR.

We thus seed our RNG with the known seed and compute these first 1993719937 outputs, XOR them together in the way described by the polynomial and this will be our final result.