Back to God's Number
Mathematics · Combinatorics · Probabilistic Argument

Demigod's Number — a high-probability 36-move bound for 3×3

Arturo Merino · Bernardo Subercaseaux arXiv:2501.00144 PDFSubmitted 2024-12-30

In 2010 Rokicki et al. used 35 CPU-years on a Google cluster to prove 3×3 God's number is exactly 20 HTM — a result almost no one has ever independently reproduced. In late 2024, Merino & Subercaseaux proposed a fundamentally different approach: settle for the weaker ≤36 bound, in exchange for needing only 500,000 random samples and ~100 hours of single-machine compute, each one certified by a short Kociemba solution. Anyone with a laptop and a Kociemba solver can replicate it overnight. This page walks through the paper end-to-end — proof, concentration bound, sampling algorithm, and its place beside God's number.

God's number
20 HTM
Proven exact: Rokicki et al. 2010, 35 CPU-years
SIAM Rev. 2014
Demigod's number
36 HTM
High-probability bound: Merino & Subercaseaux 2024, ~100 hr laptop
arXiv:2501.00144
Human's number
205 HTM
Beginner's method ceiling: Lemma 8, 7-step LBL
Appendix A
In one lineFor any vertex-transitive graph G, D<2μ. The 3×3 Cayley graph is vertex-transitive; 500k samples + Hoeffding give μ^​≈18.32±0.1, so D≤2⋅18.4804<37, and since D∈Z ⇒ D≤36.

TL;DR

  • What: a re-derivation of the 3×3 diameter, relaxed from the exact 20 down to a probabilistic ≤36, but reproducible on a single laptop. The authors dub this "trade compute for a weaker conclusion" pattern a demigod number.
  • How: (i) on any vertex-transitive graph, D<2μ (basic triangle inequality); (ii) the 3×3 Cayley graph is vertex-transitive; (iii) Hoeffding pins μ close to the empirical μ^​.
  • Key trick: Plain Hoeffding with C=205 (Human's number as worst case) demands ~3×108 samples; proving "≥99.97% of states need ≤20 moves" lets C=20, cutting samples to 5×105.
  • Experiment: MacBook Pro M3 ran 500k samples in ≈ 5 hours. Every sample solves in ≤20, mean 18.3189, total error probability <10−10.
  • Generalises: applies to any vertex-transitive graph that admits uniform sampling — Pyraminx, Megaminx, 5×5, the 15-puzzle on a torus — even where God's number is unknown.

3. Preliminaries

Model the cube as a group: every position is a permutation of S54​ (54 stickers); face turns {R,L,U,D,F,B} are generators; the six fixed centres let us embed it in S48​. Classical setup:

Definition 1 — Cayley graph

Given group G=(A,⋆) and generators S⊆A, the Cayley graph GS​ has vertex set A with edge (u,v) iff some s∈S gives u⋆s=v.

For the cube, this gives

SR​:={R,R′,R2,L,L′,L2,U,U′,U2,D,D′,D2,F,F′,F2,B,B′,B2}
18 generators — counting R2 as a single move is HTM; counting it as two gives QTM, where God's number is 26.
Definitions 2 / 3 — Diameter and meanD=u,v∈(2V​)max​d(u,v),μ=(2∣V∣​)1​u,v∈(2V​)∑​d(u,v).

"God's number" = D; the demigod's number is a high-probability upper bound on 2μ.

4. Core theorem: vertex-transitive ⇒ D < 2μ

For general graphs D/μ can blow up — the paper gives the clique-plus-tail example reaching Ω(n1/2), which Wu et al. 2011 prove tight. Vertex transitivity, however, forces the ratio below 2.

Theorem 2 (paper)

For any vertex-transitive graph G with diameter D and mean distance μ, we have D<2μ.

Proof

First a lemma: in a vertex-transitive graph, the mean distance equals the average distance from any fixed vertex:

μ=∣V∣−11​∑v∈V​d(x,v)for any fixed x∈V.

(For each u there is an automorphism ϕu​ sending u to x, and automorphisms preserve distances; double-summing and using this collapses the ∑u​.)

Now take u,v with d(u,v)=D. Apply Lemma 4 twice:

2μ=∣V∣−1∑x∈V​d(u,x)​+∣V∣−1∑x∈V​d(v,x)​≥△​∣V∣−1∑x∈V​d(u,v)​=∣V∣−1∣V∣⋅D​>D.

Triangle inequality: d(u,x)+d(v,x)≥d(u,v)=D. ∎

Tight?Yes. On C2n​ we have D=n and μ=n/2+o(1), so D/2μ→1. Same for the hypercube Qn​: D=n, μ=n/2.

5. Why Cayley graphs are vertex-transitive — interactive

This is the step that hooks the cube onto Theorem 2. The proof is a one-liner and makes "group structure = symmetry" concrete.

Lemma 3

Every Cayley graph is vertex-transitive.

Proof

Given u,v, set ϕ(x)=v⋆u−1⋆x. It sends u to v, has inverse x↦u⋆v−1⋆x, and preserves edges:

ϕ(x)−1⋆ϕ(y)=(v⋆u−1⋆x)−1⋆(v⋆u−1⋆y)=x−1⋆y,

so (x,y)∈E⟺x−1⋆y∈S⟺ϕ(x)−1⋆ϕ(y)∈S⟺(ϕ(x),ϕ(y))∈E. ∎

Watch Theorem 2 act on the simplest non-trivial Cayley graph, C2n​ — drag x to see the triangle inequality flexing:

…

6. Theorem 1: the master probability bound + Hoeffding interactive

The headline. Combine D<2μ with Hoeffding's inequality.

Lemma 5 — Hoeffding's inequality

For i.i.d. X1​,…,Xs​∈[0,C] with mean μ and empirical mean μ^​, for any t>0:

Pr[∣μ^​−μ∣≥t]≤2exp(−C22st2​).
Theorem 1

For a state s, let d(s) be its distance to the identity; let S be a uniformly sampled set with empirical mean μ^​S​. Then:

PrS​[D≥2μ^​S​+0.36]<2exp(1541939−∣S∣​).
Proof sketch

Lemma 8 (Human's number) yields d(s)∈[0,205]. Plug C=205, t=0.1 into Hoeffding:

Pr[μ≥μ^​+0.18]≤2exp(2052−2∣S∣⋅0.12​)<2exp(1541939−∣S∣​).

Then by Theorem 2 (D<2μ):

Pr[D≥2μ^​+0.36]≤Pr[2μ≥2μ^​+0.36]<2exp(1541939−∣S∣​).

∎

The panel below exposes the three knobs: sample size ∣S∣, tolerance t, and the Hoeffding constant C∈{205,20}. Two curves show the naive vs refined decay; the dot marks your current selection.

…
Where does 1,541,939 come from?It packs the bound's parameters: starting from Pr[2μ≥2μ^​+0.36]≤Pr[μ≥μ^​+0.18] and applying the one-sided Hoeffding Pr[μ^​−μ≤−t]≤exp(−2t2∣S∣/2052) with t=0.1 gives the coefficient 2052/(2⋅0.01)=2,101,250; the slightly looser 1,541,939 matches the symmetric two-sided form in the paper. Intuition: each extra order of magnitude in ∣S∣ hammers the fail probability by exp(−10x/1.5×106).

7. The Human's-Number trick — squeezing 205 down to 20

Theorem 1 with C=205 demands ∼3×108 samples for a 1% error — about a year of single-core compute. The fix: most states actually solve in ≤20 moves; only a vanishing fraction are "far apart".

Definition — "far apart"

For a pair (u,v), call it far apart if d(u,v)≥21, otherwise close.

Let φ:= "at least 0.03% of pairs are far apart". Lemma 6: if φ holds, the probability of 500,000 samples turning up zero far-apart pair is:

(1−0.0003)500,000<7.02×10−66.

Yet they observed exactly zero. So φ is overwhelmingly false: ≤0.03% of pairs are far apart, and on the close pairs d≤20, hence C=20. Hoeffding's coefficient drops from 2052 to 202, cutting sample size by ≈105 times.

Where the Human's number comes from: 7-step beginner's method

Appendix A presents a generous-but-rigorous beginner's-method bound. Step by step:

White cross
20
White corners
60
Edges of the second layer
80
Yellow cross
18
Permuting yellow edges
21
Permuting yellow corners
24
Orienting yellow corners
42
Total (Lemma 8 = Human's Number)
265

Comically loose — any beginner averages <100 moves — but as a worst case that you can prove in a single page it's plenty.

After the trickμ^​close​≤18.4189 (Lemma 7, 500k samples) + μ≤μclose​+0.03%⋅205, so μ≤18.4804, hence D<2⋅18.4804=36.9608; integer ⇒ D≤36. Union-bound total error ≤10−10.

8. How to sample cube states uniformly

The theorem is proven, but you still need a true uniform sampler. The paper invokes the classical Fundamental Theorem of Cubology:

Theorem 3 — Cubology

If you take all corners + edges off and reattach them, the reassembly is reachable from solved iff (a) corner sign = edge sign (parity); (b) corner twists sum to 0(mod3) (twist sum); (c) edge flips sum to 0(mod2) (flip sum).

The three constraints partition Z2​×Z3​×Z2​=12 equivalence classes; exactly one is legal. Naive "uniform permute + reject" averages 12 retries. The paper picks the smarter route: after one random reassembly, apply at most three deterministic correctives:

  • Flip the F-U edge (toggles edge-flip parity)
  • Swap F-U-L and F-U-R corners (flips corner parity)
  • Twist the D-U-R corner clockwise (shifts twist sum mod 3)

The 12 invalid operations fibre every state above the legal subgroup; reassemble once + apply the three correctives gives 1 effective sample per pass.

Compare: WCA scramblersThis is identical to what cubing.js / TNoodle do at every WCA competition — pick a random state, solve via Kociemba, reverse the solution as the scramble. The live-sampler section below taps the same code path.

9. Experiment: 500k samples, 5 hours, Kociemba

The paper's experiment ran on a MacBook Pro M3 (36 GB RAM, 16 cores) using the open-source Kociemba implementation efrantar/rob-twophase. Results:

  • Total samples |S|: 500,679
  • Empirical mean μ^​: 18.3190
  • Longest solution: 20 HTM (matches Rokicki's 2010 exact diameter)
  • Wall time: ≈ 5 hours
  • Total error probability: <10−10

The exact counts of Figure 5, redrawn with a log-toggle and per-bin hover:

…

10. Run it yourself — in-browser Kociemba

The button below spins up a Web Worker that, in a loop, (i) draws a uniformly random 3×3 state, (ii) solves via Kociemba, (iii) histograms the solution length. Watch μ^​ creep toward the paper's 18.3189. Your samples are overlaid in green on the histogram above.

…

11. History of bounds (paper Table 1)

From Thistlethwaite's 52-move bound in 1981 to the 20/20 closure in 2010 — three decades of gap-shrinking. Each cell is one historical (lower, upper) snapshot:

1981
18–52
1990
18–42
1992
18–39
1992
18–37
1995
18–29
1995
20–29
2005
20–28
2006
20–27
2007
20–26
2008
20–25
2008
20–23
2008
20–22
2010
20–20

The three "numbers" side by side

NameValueTightnessCompute to verifyWho can auditSource
God's20exact35 CPU·yrmust re-run the Google-scale solveRokicki 2010
Demigod36upper, >10⁻¹⁰ confidence~5 h MacBookone laptop overnightMerino & Subercaseaux 2024
Human's205upper, deterministic but loosepen and paperanyone who can solve a Rubik's cubePaper App. A

12. Generalisation — what else gets a demigod number

Two ingredients: (i) vertex transitivity (so D<2μ holds); (ii) efficient uniform sampling. Every Cayley graph gives (i) for free; (ii) depends on the group's computable structure. The paper names several candidates:

Puzzle|G|Known diameterdemigod doable?
2×23.67×10⁶11BFS finishes the full graph — no need
Pyraminx933,12011BFS suffices
Skewb3.15×10⁶11BFS suffices
Square-1 (face)2.4×10¹⁴31Chen 2017 used 722GB BFS; demigod is the citizen-grade alternative
Megaminx1.0×10⁶⁸48–~110 (HTM)✓ — needs a megaminx solver; D unknown, demigod gives an upper bound
5×5×52.83×10⁷⁴?–~120 (OBTM)✓ — reduction solvers exist; gap is wide, demigod immediately useful
15-puzzle on a torus~10¹³?✓ — paper explicitly cites this as a target

The 5×5 lower/upper-bound gap is wider than a factor of 2 — exactly where demigod buys the most. So Hirata's 2024 papers (refs [12,13] in the paper) attack the same problem via girth and branching factor — complementary to this approach.

13. A side note on probabilistic vs traditional proof

The paper's closing argument is worth recording: it places demigod-class results on the spectrum of plausibility alongside probabilistic primality tests and machine-assisted proofs (Kepler's conjecture). Whether N is prime is a deterministic fact, but in practice we accept "Miller-Rabin gives error ≤2−128" as if it were one.

The paper's closing summary: "if the cube's diameter were actually greater than 36, the probability of observing an empirical mean of ~18.3 and zero >20 samples in 500,000 draws is <10−10". Compare this to the chance of someone finding a SHA-256 collision overnight on a 2024 laptop — which is more believable?

Open question (paper poses)"Are there properties of finite mathematical objects that can only be certified efficiently to a high degree of confidence by probabilistic algorithms, but that we never can be certain of through a short proof?"

References

  1. Merino & Subercaseaux — "A Demigod's Number for the Rubik's Cube" (arXiv:2501.00144, 2024) — the subject of this page
  2. Rokicki, Kociemba, Davidson & Dethridge — "The Diameter of the Rubik's Cube Group Is Twenty" — SIAM Review 2014, peer-reviewed God's number = 20
  3. cube20.org — Rokicki team's hub
  4. Kociemba — Two-Phase Algorithm — 1992 original description
  5. efrantar/rob-twophase — the actual solver the paper used
  6. Herman & Pakianathan — "On the distribution of distances in homogeneous compact metric spaces" (2015) — the original of Theorem 2 in the more general metric-space setting
  7. Demaine et al. — "Algorithms for Solving Rubik's Cubes" (ESA 2011, arXiv:1106.5736) — rigorous Θ(N²/log N) for NxN
  8. Mulholland — "Permutation Puzzles: Rubik's Cube" — standard reference for the Fundamental Theorem of Cubology
  9. Paper supplementary materials (anonymous git) — 500k sample data + solution certificates
  10. Wikipedia — Vertex-transitive graph
  11. Wikipedia — Hoeffding's inequality
  12. /math/god — back to the all-WCA-events God's number overview
Cycle size ∣V∣=2n16
Third vertex xx = 3
uxv
D=d(u,v)=8
d(u,x)=3
d(v,x)=5
d(u,x)+d(v,x)=8≥8=D

Exact mean μ=2n−11​∑k=12n−1​min(k,2n−k)=4.2667
Compare 2μ=8.5333>8=D   ✓

Drag x around the cycle: every position satisfies d(u,x)+d(v,x)≥D (triangle inequality). Summing over all x and dividing by ∣V∣−1 gives 2μ≥∣V∣D/(∣V∣−1)>D — that's Theorem 2 on C2n​, on a single page.

Sample size ∣S∣500,000
10³10⁹
Tolerance t0.100
0.020.50
Bound constant C
Refined: on "close pairs" d(s) ∈ [0,20]
1010⁻310⁻710⁻1010⁻2010⁻3010³10⁴10⁵10⁶10⁷10⁸10⁹|S|Pr[fail]
Naive (C=205, plain Hoeffding)Refined (C=20, after close-pair argument)
Pr[fail] now
2.78×10^-11
Need |S| for 1% err
105,967
Need |S| for 10⁻⁷ err
336,225
Single-core (0.2 s/sample)
1.2 d
8-core wall
3.5 h
Implied D ≤
37 HTM

How to read: at ∣S∣=500,000, t=0.10, Hoeffding gives Pr[∣μ^​−μ∣ ≥ 0.10] ≤ 2.78×10^-11. Dropping C from 205 to 20 cuts the required samples by (205/20)2≈105 × for the same accuracy.

Y axis
Paper
∣S∣=500,679, μ^​≈18.3190
50%40%30%20%10%0%11121314151617181920Kociemba solution length d (HTM)54k216k212kμ̂ ≈ 18.32
Merino & Subercaseaux 2024 (500k samples)

Of 500,000 random states, the longest Kociemba solution is exactly 20 HTM — matching Rokicki's proven diameter. Note the knife-edge: depths 18 and 19 alone account for ≈ 86%, while depth 11 appears just once. The paper combines this with Hoeffding to lock in μ≤18.4804.

Samples |S|
0
Running μ̂
—
Implied D ≤ (D < 2μ̂+0.36)
—
Hoeffding fail (t=0.1, C=20)
—

How it works: each sample calls cubing.js's randomScrambleForEvent('333') — picks a uniformly random state in a worker, solves it with Kociemba, returns the solution length as a bound on d(s). cubing.js clamps WCA scrambles to ≥18 moves so the running mean here biases slightly upward; the true μ is ≈ 18.32. ~50–300 ms per sample.