A group is an abstract algebraic object, but we can give it a face — draw each element as a node, each generator s as an edge. The result is the Cayley graph, turning an abstract group into a concrete geometric object. Words like "diameter," "geodesic," "ball of radius r," and "neighbourhood" all gain literal meaning. Arthur Cayley introduced it in 1878, a century before the Rubik's cube — but the cube is its most tactile visualisation.
Definition 14.1
Let G be a group with generating set S, closed under inversion (so that S=S−1). The Cayley graphCay(G,S) has one node per element of G; for every pair (g,s) there is an edge g→g⋅s coloured by s. The graph is vertex-transitive.
14.1 Warm-up — first two layers of ⟨R, U⟩
The full cube Cayley graph has 4.3 × 10¹⁹ nodes and ≈ 3.9 × 10²⁰ edges — impossible to render. But we can plot a small subgroup. Below: the first two BFS layers of ⟨R, U⟩ from e. Red edges = R, blue = U. The "asymmetric fan" of a non-Abelian group is already visible — R·U ≠ U·R, so the two-step neighbours bifurcate:
RUnodes = states
The full Cay(⟨R, U⟩, {R, U}) has 73,483,200 nodes and diameter ≈ 26 (HTM). Only 15 nodes shown here.
14.2 Interactive — walk it yourself
You are standing at node e. Click any of the 18 generators to traverse that coloured edge to a neighbour. The path is the sequence of edges walked. Wandered back to e? You just closed a loop — the product of the path is the identity, and its length is the order of that element.
Interactive § Walk one edge of the Cayley graph
click a generator
path = e (identity, start node)
path length
0
d(e, g) upper bound
0
in G?
✓
in subgroup
G4
corner cyc.
identity (no cycles)
edge cyc.
identity (no cycles)
Each button is an edge. Path length = walk length in Cayley graph (≥ the true distance d(e, g)).
14.3 Spheres |S_d| — counting states at radius d
For e∈G, the set of states at distance exactly d is Sd={g:d(e,g)=d} — the "sphere of radius d" in Cay(G). The sizes ∣Sd∣ come from BFS. For the cube, all 21 values are known exactly (a byproduct of Rokicki et al. 2010):
radius d
|S_d| (log-scale bar)
count
0
1
1
18
2
243
3
3,240
4
43,239
5
574,908
6
7,618,438
7
100,803,036
8
1,332,343,288
9
17,596,479,795
10
232,248,063,316
11
3,063,288,809,012
12
40,374,425,656,248
13
531,653,418,284,628
14
6,989,320,578,825,358
15
91,365,146,187,124,313
16
1,100,531,606,815,050,000≈
17
12,217,338,577,780,000,000≈
18
29,290,000,000,000,000,000≈
19
1,357,000,000,000,000,000≈
20
490,000,000
Sphere sizes grow at a steady factor of about 17.97× (the branching factor — close to 18 but slightly less, due to reductions like "don't immediately undo R"). They reach ~5 × 10¹⁴ around d = 13, then sharply saturate at the peak d = 18, and collapse. By d = 20 only 490 million states remain (including superflip). This is the classic "sphere packing in a finite graph" phenomenon — the outer boundary must shrink.
14.4 Geodesics, shortcuts and algorithms
The shortest path from e to g is a geodesic, of length d(e, g) = |g|_S (the optimal solution length for g). Any alg is a walk on Cay(G); it is a geodesic iff it is optimal. Solvers, in graph-theoretic language, search for geodesics.
Diameter (Theorem 14.2)
The diameter diam(G, S) = maxg ∈ G d(e, g) = the optimal length for the hardest state. Under HTM this is 20 (God's number); under QTM it is 26.
A nice contrast: Korf's 1997 algorithm does IDA* directly on Cay(G), using max(corner-PDB, edge-PDB) as a heuristic; Kociemba's two-phase (§10) walks G → G_1, then G_1 → e, sacrificing optimality for speed and bounded length (≤ 24); Rokicki's 2010 proof did a "block-BFS" of Cay(G) using symmetry-quotient classes, costing 35 CPU-years. Three different ways to traverse the same graph.
14.5 Expander properties
The cube's Cayley graph is a "fat" graph — every node has 18 neighbours, diameter only 20, with 4.3 × 10¹⁹ nodes. The "expansion" is near-maximal: for any subset A ⊆ G with |A| ≤ |G|/2, the boundary |∂A| / |A| does not shrink. Mathematically this connects to the spectral gap of the graph Laplacian; practically, it makes random walks "rapidly mix" through G.
Mixing time
Random-walk mixing time on the cube Cayley graph is O(log |G| / λ), where λ is the spectral gap. Numerical experiments give λ ≈ 0.6 and a mixing time of roughly 70 steps. Any scramble longer than ~70 random moves is statistically indistinguishable from the uniform distribution on G. WCA scrambles (25 moves on 3x3) sit well below this, so they retain a slight bias — states near e occur slightly more often than expected.
14.6 Cayley graph depends on the generating set
The same G, with a different generating set S, gives a different Cayley graph. Distance, diameter, sphere sizes, mixing time — all change. The cube under various metrics:
Generators S
|S|
Diameter
Notes
HTM = U U' U2 D D' D2 ...
18
20
WCA standard, half-turn metric
QTM = U U' D D' ...
12
26
quarter-turn only; U2 = 2 moves
STM = HTM + M E S (切片)
27
18
+ 9 slice moves; diameter drops by 2
BTM (block turn)
36
≤ 16
wide + slice; shortens further
only ⟨R, U⟩
2 (or 6 with inverses/dbl)
~26
reaches just 73,483,200 states
Note: "⟨R, U⟩ has diameter 26" refers to the subgroup it generates (the 73 million-element subgroup). Within G itself, R and U alone do not generate G, so most states are unreachable — at "distance infinity." This is why "more generators = better": more edges → smaller diameter → easier to solve.
14.7 Visualising the Cayley graph
Forty-three quintillion nodes is unrenderable. But the local structure of a Cayley graph can always be drawn — every node looks like the neighbourhood of e (vertex-transitive). Common visualisation tricks:
BFS tree — root at e, layers by depth. The "sphere packing in a finite graph" effect shows up directly: a wide middle and thin tips at both ends.
Quotient graphs — fold G by a subgroup H: nodes = cosets gH, same generators on edges. For example G/G_1 has 2048 nodes — small enough to draw, large enough to be revealing. Thistlethwaite's algorithm essentially walks the chain of quotient graphs G/G_i.
Symmetry reduction — quotient out the 48 outer cube symmetries; node count drops to ~9 × 10¹⁷. Rokicki's solver uses this to make the BFS tractable.
Radial layout — on a 2D plane, put e at the centre and every node at distance d on the d-th concentric circle. Counts come from the CAYLEY_SPHERE table above. The shape is unmistakably "olive-like" — fat middle, sharp tips.
14.8 Cayley graphs of other groups
Z with {+1}: an infinite line.
Z×Z with {(1,0),(0,1)}: the integer lattice.
Z/n with {+1}: a regular n-gon cycle.
F2=⟨a,b⟩ (free group): an infinite 4-regular tree — no cycles, you never return to e (because it's free).
The cube G: sphere sizes from §14.3. Diameter 20, |G| = 4.3 × 10¹⁹. Vertex-transitive, near-regular, highly expanding — a "beautiful finite non-Abelian graph."
"The cube's Cayley graph is the most thoroughly studied finite, non-Abelian, highly symmetric graph in mathematics. Diameter 20, 4.3 × 10¹⁹ vertices — about as extreme as such an object can get."
14.9 Interactive § sphere sizes on a log scale
The same 21 numbers from §14.3, laid on log paper — the "17.97× branching factor" appears as a constant slope. Hover any bar to see d, |S_d|, percentage of |G|, and the instantaneous branching |S_d| / |S_d-1|. The middle (d = 3 → 13) is almost linear (slope ≈ log10 17.97 ≈ 1.255); growth visibly slows from d = 16 → 18 (saturation), then collapses by d = 19 → 20 (the sphere-packing tail).
hover a bar for that radius
Steady growth at ~17.97× for d ≤ 13 (just below 18 because R cannot be immediately undone). Peak at d = 18 ≈ 2.93 × 10¹⁹. By d = 20 only ~490 million states remain (superflip among them). This is the geometric consequence of "sphere packing in a finite graph" — the outer tip must shrink.
Observation 14.9 — sphere saturation
For a finite vertex-transitive graph Γ = Cay(G, S), the sphere sizes |S_d| satisfy: (a) a unimodal envelope — there is a unique d* with |S_d-1| ≤ |S_d| ≥ |S_d+1|; (b) ∑|S_d| = |G|; (c) at the diameter D, |S_D| ≥ 1. For the cube: d* = 18, D = 20, |S_D| = 4.9 × 10⁸ (superflip being its most-studied member).
14.10 Interactive § small-group Cayley laboratory
The full 4.3 × 10¹⁹-node graph cannot be drawn — but small Cayley graphs can, and they encode the same geometric intuition. The eight below span Abelian/non-Abelian, one/many generators, cyclic/grid/permutation. Switch between them to see how the same G changes diameter / girth / |E| under different generating sets. Numbers on nodes = d(e, g); click a node → lock + display shortest path coloured by generator.
rs
hover a node → g, d(e,g), ord(g); click → lock + show shortest path
|G|
8
|S|
2
diameter
3
girth
4
|E|
12
spheres |S_d|
1, 2, 3, 2
Generator = edge colour; node number = d(e, g); central green = identity e. Switch generators to see how |E|, diameter, girth all change for the same G (e.g. ℤ/8 with {+1} has diameter 4; with {+1, +3} it drops to 2).
Definition 14.10 — girth
The girth g(Γ) of a graph is the length of its shortest cycle. For Cay(G, S) this equals the length of the shortest non-trivial relator in G. The free group has no relators ⇒ infinite girth (its Cayley graph is a tree). ℤ/n with {+1} has girth n. D₄ with {r, s} has girth 4 (from r⁴ = e). Large girth ⇒ Cayley graph is locally tree-like ⇒ high expansion (used in §14.13).
14.11 Spectral gap and the Cheeger inequality
The Cayley graph Γ is k-regular (k = |S|), so its adjacency matrix A is a symmetric N × N matrix (N = |G|). The spectral theorem gives real eigenvalues k=λ1≥λ2≥⋯≥λN≥−k. The spectral gap is Δ=k−λ2 (or normalized as 1−λ2/k). Large spectral gap ⇔ graph is "well-connected" ⇔ random walks mix quickly.
Cheeger inequality (Theorem 14.11)
Define the Cheeger constant (edge isoperimetric ratio) h(Γ)=S⊂V,∣S∣≤∣V∣/2min∣S∣∣∂S∣ where ∂S is the set of edges with one endpoint in S and one outside. Then:2Δ≤h(Γ)≤2kΔSo the spectral gap and "can-you-cut-the-graph-in-half" are mutually controlled: Δ ≈ 0 ⇒ a bottleneck exists; large Δ ⇒ the graph expands uniformly. (Classical proof uses Rayleigh quotients + extreme cut + square-root trick — Cheeger 1970 in geometry, Dodziuk & Alon-Milman 1985 in graphs.)
For the cube G with HTM, numerical experiments give λ2/k≈0.95, i.e. Δ/k≈0.05 ("small" but not zero). This is why a 25-move WCA scramble is nearly uniform but not exactly so — by Cheeger, τmix = Θ(log |G| / Δ) ≈ log(4.3 × 10¹⁹) / 0.05 ≈ 905. Tighter random-walk analysis (using spectral gap of the lazy walk + dominant eigenfunctions) gives much smaller bounds; experimentally ~70-100 random moves are visually indistinguishable from uniform.
Note: λ₂ does not pop out of "pure group theory" — it comes from the spectrum of A. For Abelian groups, λ_i are Fourier-style character sums; for non-Abelian, they involve irreducible characters (§26). This is the source of the slogan "representation theory controls graph theory".
14.12 Interactive § random walks and mixing time
The lazy random walk on G with generating set S: from e, at each step stay put with probability 1/2 or else left-multiply by a uniformly chosen element of S ∪ S⁻¹. The distribution after t steps is p_t. "Lazy" matters: when G is bipartite under S (e.g. S_n with transpositions — every step flips parity), the non-lazy walk never converges. The lazy version always converges: ptt→∞U=1/∣G∣. The mixing time τmix(ε) = the smallest t with TV(p_t, U) ≤ ε (standard choice ε = 1/(2e) or 1/4).
TV(p,U)=21g∈G∑p(g)−1/∣G∣
t = 0TV = 0.9583
The distribution starts as a spike at e and flattens to uniform; TV decays monotonically to 0. The mixing time τmix = the smallest t with TV(p_t, U) ≤ 1/(2e). For S₄ with adjacent transpositions τmix ≈ 7-9 steps, comparable in order to Diaconis-Shahshahani's n log(n)/2 ≈ 2.77 for n = 4 (constants depend on the generating set). The driving theorem: TV(p_t, U) ≤ (1 − λ)t, where λ is the spectral gap = 1 − |second eigenvalue| of the transition matrix.
Theorem 14.12 — Diaconis-Shahshahani (1981)
For G = S_n with the symmetric generating set of all transpositions {(i j) : i < j}, the mixing time exhibits the cutoff phenomenon:τmix(2e1)=2nlogn+O(n)This is "probability theory unlocked by representation theory" — the proof computes character ratios over irreps of S_n (Frobenius formula + Murnaghan-Nakayama). The upper bound comes from Plancherel: ∥pt−U∥22=∣G∣1∑ρ=1dρ2(p^S(ρ)/dρ)2t. The same cutoff phenomenon appears for the cube and for card shuffles (Bayer-Diaconis 1992: 7 riffles suffice).
The same year, Aldous-Diaconis coined the term "cutoff" — TV stays nearly flat until τ_mix, then drops to 0 within a window of width O(√t). It mirrors the physical intuition of "metastable diffusion → sudden equilibration". Modern work (Berestycki-Şengül 2019, Bordenave-Lacoin-Salez 2019) extends the cutoff phenomenon to non-Abelian conjugacy-invariant walks; it holds for Lie-type groups like PGL.
14.13 Expanders and Ramanujan graphs
A sequence of k-regular graphs {Γn} (number of nodes → ∞) is an expander family if there is ε > 0 with h(Γ_n) ≥ ε for all n. Equivalently (by Cheeger): λ₂(A_n)/k ≤ 1 − δ for some δ > 0. Expanders are "the sparsest connected graphs" — O(N) edges keep N nodes globally well-connected. They are a holy grail of CS (error-correcting codes, pseudorandomness, hashing, hardness amplification).
Definition 14.13 — Ramanujan graph
A k-regular graph Γ is Ramanujan if every non-trivial eigenvalue satisfies ∣λi∣≤2k−1. This is optimal — Alon-Boppana proved no graph can do better. Hence "Ramanujan" = "spectrally optimal expander".
The miracle (Lubotzky-Phillips-Sarnak 1988): they constructed infinite families of Ramanujan graphs via G = PSL₂(𝔽_p) and a generating set S of p + 1 "Hecke operators" coming from quaternion algebras and Ramanujan-Petersson (Deligne 1974). The graphs are (p+1)-regular, diameter O(log |G|), girth (4/3) logp|G|, with |λ| ≤ 2√p. They are all Cayley graphs.
family
degree k
diameter
spectral
Cayley?
random k-regular (Friedman 2003)
k
log_k N
|λ| ≤ 2√(k-1) + ε (whp)
—
LPS 1988 (PSL₂(𝔽_p))
p+1
~log N
|λ| ≤ 2√p (Ramanujan)
✓
Margulis 1973 (SL₂(ℤ))
8
O(log N)
explicit expander
quotient
Rubik cube (HTM)
18
20
λ₂/k ≈ 0.95 (expander, not Ramanujan)
✓
Alon-Roichman random Cayley
~log|G|
polylog
expander (whp)
✓
Three independent expander constructions (Margulis 1973 → LPS 1988 → Alon-Roichman 1994) embody three strategies: "explicit number theory", "quaternion algebras + Ramanujan conjecture", "randomization". Later Reingold-Vadhan-Wigderson (2002) gave a purely combinatorial "zig-zag product" construction. One of the deepest threads in extremal combinatorics of the last 50 years.
14.14 The diameter problem — Babai's conjecture
How small can the diameter of Cay(G, S) be, taken over arbitrary generating sets S? The cube's 20 is one wonder — tiny diameter on a vast state space. What about other groups? Babai (1992) made a bold conjecture:
Babai's conjecture (1992)
For every non-Abelian finite simple group G and every symmetric generating set S, diam(Cay(G,S))≤(log∣G∣)c for some absolute constant c independent of G. In short: "every non-Abelian finite simple group has polylog diameter" — no matter what generating set you pick.
The conjecture remains open as of 2026. Known progress is "layered":
Helfgott 2008: G = PSL₂(𝔽_p) gives diam = O((log p)^c), so c = O(1). The proof uses additive-combinatorics tools (Bourgain-Gamburd-Sarnak) to establish a "product theorem": ∣A⋅A⋅A∣≥∣A∣1+ϵ unless A is already close to a subgroup. Turning-point work.
Pyber-Szabó & Breuillard-Green-Tao (2016): extended Helfgott's bound to all bounded-rank finite simple groups of Lie type (PSL_n, PSp, ...). This portion of Babai's conjecture is fully solved.
Unbounded rank still open: A_n and PSL_n(𝔽_p) (n → ∞) remain mysteries. Babai-Seress (1992) gave an early subexponential bound e(1+o(1))nlogn; Helfgott-Seress (2014) tightened to exp((logn)4loglogn) — still far above polylog.
Note: the cube G is not a finite simple group — it is non-Abelian but solvable, with a composition series. So "diam = 20" is outside Babai's conjecture proper. But it is a flagship example of "polylog diameter on huge |G|" and provides intuition that the conjecture is reasonable.
14.15 Growth functions and geometric group theory
For an infinite group G with generating set S, the growth function is γGS(n)=∣Bn(e)∣, the size of the ball of radius n around e. Different S gives different γ, but the growth class is invariant:
Polynomial growth: γ(n)≤Cnd. Example: Zd grows as n^d.
Exponential growth: γ(n)≥Can with a > 1. Example: free group F_2 grows as 3·4^{n-1}.
Intermediate growth: faster than polynomial, slower than exponential. Asked by Milnor (1968); Grigorchuk (1984) gave the first example (the Grigorchuk group).
A finitely generated group G has polynomial growth ⇔ G contains a nilpotent subgroup of finite index. Moreover, the Bass-Guivarc'h formula gives the growth degree d=∑ii⋅rank(Γi/Γi+1) as always an integer (no fractional growth rates).
Gromov's proof introduced Gromov-Hausdorff convergence — rescale a Cayley graph and take a limit, the asymptotic cone is a metric space. This promoted the Cayley graph from "graph-theoretic object" to "geometric object", launching the field of geometric group theory. Later Cannon, Gersten, Bestvina extended this to hyperbolic groups, CAT(0) groups, mapping class groups.
"A group should be studied as a metric space. Its algebraic properties are expressed by the geometric properties of its Cayley graph as an infinite metric object."
— Mikhail Gromov, paraphrased from Asymptotic Invariants of Infinite Groups (1993)
14.16 Cayley's theorem (1854) — every group is a permutation group
A historical aside: a quarter century before drawing his graphs (1878), Cayley had already proved a deeper theorem (1854) that locks abstract groups to concrete permutation groups:
Theorem 14.16 — Cayley 1854
Every group G embeds isomorphically into Sym(G), i.e. G↪S∣G∣. The embedding is by "left-multiplication": g↦Lg with L_g(x) = g·x.
Proof sketch: L_g is a bijection (with inverse Lg−1), and Lgh(x)=(gh)x=g(hx)=Lg(Lh(x)), so g ↦ L_g is a group homomorphism. Its kernel is {g : L_g = \mathrm{id}\} = {g : gx = x \;\forall x\} = {e}. Hence the map is injective. ∎
This is the ancestor of "the Cayley graph" — the edge g → g·s is "L_s acting on node g". Visualising Cayley's theorem = drawing each generator s as a "permutation of vertices = arrow set" = the Cayley graph. The title of Cayley's 1878 paper, Graphical representation, is precisely this: drawing his 1854 theorem on paper.
Corollary 14.16.1 — the cube G embeds in S_{4.3×10¹⁹}
Formally, G is a subgroup of S4.3×1019. This is useless in practice — ∣Sn∣=(4.3×1019)! is astronomically larger than the number of atoms in the universe. We use the much tighter embedding G ↪ S₈ × S₁₂ (corner and edge permutations) plus an orientation vector — that is the (cp, co, ep, eo) representation from §5.
14.17 Open problems and further reading
The cube Cayley graph is one of the most thoroughly studied "extreme" finite objects in mathematics, yet many basic questions remain open:
4×4×4 diameter: bounded 22 ≤ diam ≤ 36 (HTM), exact unknown. State space ≈ 7.4 × 10⁴⁵, a factor of 2 × 10²⁶ larger than 3×3×3 — no Rokicki-style global BFS is feasible.
Analytic spectral gap for the cube: λ₂/k ≈ 0.95 is numerical only; no closed form. For Abelian Cayley graphs the spectrum is Fourier; for the cube it requires character sums over G's ~80 irreps.
Exact cube mixing time: ≥ 26 (Bordoni-Reiter 2024); ≤ 70 (numerical); exact value and cutoff window unknown.
QTM vs HTM diameter: 26 vs 20. Both proven exact. But why the gap is exactly 6 has no analytic explanation — it is a global rearrangement effect of rewiring the graph.
Babai's conjecture for A_n and unbounded rank: open as of 2026 (§14.14).
Lovász Hamilton path conjecture (1969): every connected Cayley graph has a Hamiltonian cycle (up to four exceptions). For the cube: yes (Curtis 1970). The general conjecture remains open.
First example of a group of intermediate growth (the Grigorchuk group).
Cheeger, J.(1970)
A lower bound for the smallest eigenvalue of the Laplacian · Problems in Analysis (Princeton): 195-199
The original Cheeger inequality on manifolds; graph version came later (Dodziuk, Alon-Milman 1985).
"The Cayley graph promotes an abstract multiplication table to a metric space. Once one can see the group, words like 'diameter,' 'sphere,' 'mixing,' 'expansion,' 'growth' acquire geometric meaning — and the Rubik's cube is the most tactile example of this geometry."