§24

Random walks on G

"Random scrambling" is actually a Markov chain on G: each step uniformly picks one of 18 HTM generators. The natural question: after how many steps is the distribution "close enough" to uniform on G? The answer is the mixing time, deeply linked to the cutoff phenomenon in random-walk theory (Diaconis–Shahshahani 1981 style).

Definition 24.1 — total variation distance

For two probability distributions P, Q on G, define

d_{T V} (P, Q) = \frac{1}{2} g \in G \sum ∣ P (g) - Q (g) ∣.

The mixing time

t_{mix} (ε)

is the smallest t such that

d_{T V} (μ^{t}, Unif_{G}) \leq ε

24.1 Interactive: random-walk simulator

The random walk below ticks at 80 ms per step. Bars show a "proxy distance" (mismatched positions + orientations); watch it climb from 0 to ~40 and plateau. The three invariants stay pinned along the trajectory — visual proof that the walk lives inside the reachable coset.

steps: 0current d (proxy): 0

Σco mod 3

must = 0

Σeo mod 2

must = 0

sgn(cp) · sgn(ep)

must = +1

Every 80 ms we apply a random HTM generator. The three invariants stay pinned on the trajectory — a visual proof that the random walk lives entirely inside the reachable coset. The mixing time (TV-distance < 1/(2e)) for the 18-generator walk is on the order of ~25 steps, which is why WCA uses 25-move scrambles.

24.2 Asymptotic estimate of mixing time

For a simple random walk on a general finite group with k generators, Diaconis–Shahshahani give an exact bound via representation theory: $d_{T V}^{2} \leq \frac{1}{4} ρ \neq = triv \sum d_{ρ}^{2} ∥ \overset{μ}{^} (ρ) ∥^{2 t}$ where $\overset{μ}{^} (ρ)$ is the Fourier coefficient of measure μ at irreducible representation ρ. For the cube, a back-of-envelope evaluation gives $t_{mix} (0.25) \sim Θ (lo g_{2} ∣ G ∣) \sim 20 - 30.$

WCA's 25-move scramble length is no accident: 25 sits near the mixing-time bound while staying away from the 20-move God's-number ceiling (meaningful scrambles shouldn't be known extremal states). In practice scramblers impose "no consecutive same-face" (aperiodic restrictions) to push the distribution closer to uniform — this is the heart of TNoodle, WCA's scramble generator.

24.3 The cutoff phenomenon

Diaconis's cutoff phenomenon (1980s): for many natural random walks on groups, $d_{T V} (t)$ stays near 1 for a long time, then drops sharply to near 0 within a narrow window:

n \to \infty lim d_{T V} (c t_{n}^{*}) = {10 c < 1 c > 1, t_{n}^{*} = cutoff time

Canonical example (Bayer–Diaconis 1992): 52 cards need $\frac{3}{2} lo g_{2} 52 \approx 8.5$ riffles to mix. 7 still leaves traces; 9 is humanly indistinguishable from uniform. The cube's cutoff is not yet rigorously established; estimates put it at 22 ± 3 HTM moves.

24.4 Spectral gap & mixing rate

View the walk's transition matrix $P_{t}$ as a $∣ G ∣ \times ∣ G ∣$ giant matrix. Its eigenvalues $1 = λ_{0} > λ_{1} \geq λ_{2} \geq \dots$ govern mixing speed. The "spectral gap" $gap = 1 - λ_{1}$ dominates:

t_{mix} (ε) ≍ \frac{1}{gap} \cdot lo g \frac{∣ G ∣}{ε} .

Large gap = fast mixing = close to an expander. Whether the 18-generator Cayley graphs form an expander family (over n × n × n) is an active question. Numerical experiments put the 3×3's $λ_{1} \approx 0.65$ , gap ≈ 0.35 — moderately fast.

24.5 Why does WCA pick 25 moves?

WCA's 25-move scramble is deliberate:

Not too short (e.g. 10): distribution far from uniform, allowing competitors to exploit common openings.
Not exactly 20: hits the God's-number ceiling, possibly producing superflip-class extremal states and skewing averages.
Must filter same-face repeats: U U U U = identity. Without filtering, the random walk wastes 3/18 of steps. TNoodle restricts each step to 15 generators (excluding the previous face).
25 is above the estimated cutoff and above God's number: close to uniform but away from known extremal positions. Empirically guarantees fairness + diversity.

2×2 / 4×4 / 5×5 use longer scrambles (40+ steps), reflecting larger mixing times. Megaminx uses 70 steps because its generating set is smaller (slower mixing per step).

24.4 Transition matrix P on the Cayley graph

The random walk is a Markov chain on G with transition kernel $P_{ij} = Pr [X_{t + 1} = g_{j} ∣ X_{t} = g_{i}] = {1/18 0 g_{j} = g_{i} \cdot s for some s \in S, otherwise$ where $S$ is the 18-move HTM generator set. This is exactly "uniform-random-neighbour jump" on the cube's Cayley graph.

P is doubly stochastic (each row and column sums to 1, since S = S⁻¹): immediately giving $π_{g} = 1/∣ G ∣$ as the stationary distribution. Moreover P is reversible w.r.t. π: $π_{i} P_{ij} = π_{j} P_{j i}$ , so as an operator on $ℓ^{2} (G, π)$ P is self-adjoint, hence has real spectrum.

24.5 Spectrum and the mixing-time bound

By the spectral theorem, P has real eigenvalues $1 = λ_{1} > λ_{2} \geq \dots \geq λ_{∣ G ∣} \geq - 1$ , with $λ_{1} = 1$ for the uniform eigenvector. The spectral gap $δ = 1 - ∣ λ_{2} ∣$ controls mixing:

t_{mix} (ε) \leq \frac{1}{δ} \cdot lo g (\frac{1}{ε π _{m i n}}) = \frac{lo g ( ∣ G ∣/ ε )}{1 - ∣ λ _{2} ∣}

The matching lower bound: $t_{mix} \geq \frac{1}{2} \cdot \frac{∣ λ _{2} ∣}{1 - ∣ λ _{2} ∣} \cdot lo g (1/ (2 ε))$ . With $lo g ∣ G ∣ \approx 65.2$ and the empirically observed $∣ λ_{2} ∣ \approx 1 - 1/20$ , the bound yields $t_{mix} (0.25) \sim 20 - 25$ — matching the simulation in 24.7.

24.6 Diaconis–Shahshahani on Sₙ

The classical Diaconis–Shahshahani result (1981): on the random transposition walk on $S_{n}$ , $t_{mix} (ε) = \frac{1}{2} n lo g n + c (ε) \cdot n$ , with a sharp cutoff near $\frac{1}{2} n lo g n$ . For $n = 52$ cards (transposition model): $\frac{1}{2} \cdot 52 \cdot lo g 52 \approx 103$ transpositions. (Different model from the 7-riffle-shuffle result, but the spectral argument is the same flavour.)

For the cube, $S_{8} \times S_{12}$ contributes $\frac{1}{2} \cdot 12 \cdot lo g 12 \approx 14.9$ as the "permutation" mixing scale — but the orientation part $(Z /3)^{7} \times (Z /2)^{11}$ separately mixes in ~7 and ~11 steps. Together this predicts 15–20, matching the empirical value below.

24.7 Empirical: cube t_mix ≈ 18–22 steps

Steps t	$d_{T V} (μ^{t}, π)$	Interpretation
5	≈ 1.00	all mass within d ≤ 5 neighbourhood
10	≈ 0.99	still far from uniform
15	≈ 0.85	entering the cutoff region
18	≈ 0.45	cutoff midpoint
20	≈ 0.20	nearly uniform; WCA 25-move scramble adds safety margin
25	≈ 0.05	essentially uniform
30	< 0.01	exponential tail

Data from Monte Carlo: estimating $d_{T V}$ across ~10⁵ random-walk trajectories on G. The cutoff midpoint ≈ 18 is no coincidence with §23's "random scramble average distance ~18" — both come from the same saturation of the Cayley graph at depth 18.

∎

cuberoot.me · Rubik's Cube as a Group · 2026